Open DayTimeMouse opened 1 month ago
Hi, I think the second approach, "Coordinate Conversion After Variant Calling", is more reasonable. For the SVs that failed to convert, did you try the "CrossMap region" command? It allows fuzzy conversion (i.e., regions from the input assembly do NOT have to be 100% mapped to the target).
Please keep in mind that the conversion ratio is largely determined by the chain file.
Liguo
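As far as I know, "CrossMap region" takes BED intervals as input, so SV records from a VCF would first have to be written out as regions. The following is a minimal sketch of that step, not part of CrossMap itself. The function name sv_vcf_to_bed is hypothetical, and it assumes the SV end coordinate is stored as END=<int> in the INFO column, which is common for SV callers but not guaranteed.

```python
def sv_vcf_to_bed(vcf_lines):
    """Turn SV records into BED intervals (0-based start, exclusive end).

    Assumption: the SV end coordinate is stored as END=<int> in INFO.
    Records without END fall back to the span of the REF allele.
    """
    bed = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, var_id, ref = fields[0], int(fields[1]), fields[2], fields[3]
        info = fields[7] if len(fields) > 7 else "."
        end = None
        for item in info.split(";"):
            if item.startswith("END="):
                end = int(item[4:])
        if end is None:
            end = pos + len(ref) - 1  # 1-based inclusive end of REF
        # VCF POS is 1-based; BED start is 0-based and BED end is exclusive.
        bed.append((chrom, pos - 1, end, var_id))
    return bed


# Made-up records for illustration only (tab-separated, as in a real VCF).
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=1500",
    "chr1\t2000\tsv2\tACGT\tA\t.\tPASS\t.",
]
for chrom, start, end, name in sv_vcf_to_bed(example):
    print(f"{chrom}\t{start}\t{end}\t{name}")
```

The printed lines could be saved to a BED file and passed to "CrossMap region" together with the chain file. Check "CrossMap region -h" in your installed version for the exact arguments, including the option that controls the minimum mapped ratio.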
On Thu, Oct 17, 2024 at 9:31 PM DayTimeMouse wrote:
Hi,
Thanks for developing this nice tool.
I have two genomes, assembly1 and assembly2. My goal is to use these two genomes as reference genomes, align reads to each of them, and call variants. To effectively compare the variants obtained from the two genomes and identify their similarities and differences, I am considering two approaches:
1. Coordinate Conversion Before Variant Calling: Should I first convert the coordinates between the two genome assemblies before calling variants, and then perform the comparison?
2. Coordinate Conversion After Variant Calling: Alternatively, should I proceed without converting the genome assembly coordinates initially, call variants separately for each genome, and then convert the coordinates of the resulting VCF files before performing the comparison?
Which of these methods would be more reasonable?
When using the second method (coordinate conversion after variant calling), I have encountered an issue where many structural variations (SVs) are reported as unmapped. Could you please provide some advice on how to address this problem?
Best regards.
View it on GitHub: https://github.com/liguowang/CrossMap/issues/78
Hi liguowang,
I used "CrossMap vcf" to convert the genome coordinates, but the REF base changed. For example, the REF base of ID.2 was G originally and is C after conversion. There are many cases like this. Is this expected?
original:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr1 414779 ID.2 G A 15.5 PASS . GT:GQ:DP:AD:VAF:PL 1/1:12:43:28,15:0.348837:15,15,0
chr1 416197 ID.3 A C 25.3 PASS . GT:GQ:DP:AD:VAF:PL 1/1:24:45:28,17:0.377778:25,32,0
chr1 895954 ID.4 G T 17.1 PASS . GT:GQ:DP:AD:VAF:PL 1/1:6:41:24,17:0.414634:15,5,0
chr1 946700 ID.5 G T 28 PASS . GT:GQ:DP:AD:VAF:PL 1/1:27:51:27,24:0.470588:28,34,0
chr1 1069530 ID.6 C T 36.1 PASS . GT:GQ:DP:AD:VAF:PL 1/1:34:55:32,23:0.418182:36,38,0
chr1 1343590 ID.7 G A 11.5 PASS . GT:GQ:DP:AD:VAF:PL 1/1:4:40:19,21:0.525:9,3,0
chr1 1484250 ID.8 C A 16.1 PASS . GT:GQ:DP:AD:VAF:PL 1/1:10:49:28,21:0.428571:15,11,0
after converting:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr1 2216210 ID.2 C A 15.5 PASS . GT:GQ:DP:AD:VAF:PL 1/1:12:43:28,15:0.348837:15,15,0
chr1 2217628 ID.3 C C 25.3 PASS . GT:GQ:DP:AD:VAF:PL 1/1:24:45:28,17:0.377778:25,32,0
chr1 2674841 ID.4 A T 17.1 PASS . GT:GQ:DP:AD:VAF:PL 1/1:6:41:24,17:0.414634:15,5,0
chr1 2725591 ID.5 C T 28 PASS . GT:GQ:DP:AD:VAF:PL 1/1:27:51:27,24:0.470588:28,34,0
chr1 2848747 ID.6 T T 36.1 PASS . GT:GQ:DP:AD:VAF:PL 1/1:34:55:32,23:0.418182:36,38,0
chr1 3122698 ID.7 A A 11.5 PASS . GT:GQ:DP:AD:VAF:PL 1/1:4:40:19,21:0.525:9,3,0
chr1 3263515 ID.8 T A 16.1 PASS . GT:GQ:DP:AD:VAF:PL
Best regards.
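One way to quantify the REF changes in the two tables above is to join them on the ID column. The sketch below uses only the first five columns of the rows posted in this comment. parse_records is a hypothetical helper, and the code assumes that IDs are unique and are preserved by the conversion.

```python
def parse_records(text):
    """Map variant ID -> (CHROM, POS, REF, ALT) from whitespace-separated VCF rows."""
    records = {}
    for line in text.strip().splitlines():
        if line.startswith("#"):
            continue  # skip header lines
        chrom, pos, var_id, ref, alt = line.split()[:5]
        records[var_id] = (chrom, int(pos), ref, alt)
    return records


# First five columns of the rows posted above.
original = """\
chr1 414779 ID.2 G A
chr1 416197 ID.3 A C
chr1 895954 ID.4 G T
chr1 946700 ID.5 G T
chr1 1069530 ID.6 C T
chr1 1343590 ID.7 G A
chr1 1484250 ID.8 C A
"""
converted = """\
chr1 2216210 ID.2 C A
chr1 2217628 ID.3 C C
chr1 2674841 ID.4 A T
chr1 2725591 ID.5 C T
chr1 2848747 ID.6 T T
chr1 3122698 ID.7 A A
chr1 3263515 ID.8 T A
"""

before = parse_records(original)
after = parse_records(converted)

# IDs whose REF base differs between the two files.
ref_changed = [i for i in before if i in after and before[i][2] != after[i][2]]
# IDs whose converted record has identical REF and ALT.
ref_equals_alt = [i for i, rec in after.items() if rec[2] == rec[3]]

print("REF changed:", ref_changed)
print("REF == ALT after conversion:", ref_equals_alt)
```

For the rows shown here, every REF base differs after conversion, and ID.3, ID.6, and ID.7 end up with REF equal to ALT. If the converted REF is taken from the target assembly, that would suggest the target assembly already carries the allele that was called as a variant against the source assembly.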