Open gotero opened 9 years ago
Glen,
> Is there a way to create an alignment with parsnp/harvesttools that includes the unaligned sequences in addition to the core sequences?
Good question. Creating a single output file of core sequences plus unaligned regions per genome is not supported, but '-u' will output all unaligned regions per genome. Together with the aligned regions contained in the XMFA file, this gives a full representation of the sequence contained in each genome.
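To illustrate the "aligned plus unaligned equals the full genome" idea, here is a minimal Python sketch. It is not parsnp or harvesttools code: the function names, the toy genome length, and the aligned block coordinates are all hypothetical, and it assumes the aligned blocks are given as 1-based, inclusive, non-overlapping intervals.

```python
def unaligned_intervals(aligned, genome_length):
    """Return the 1-based, inclusive intervals NOT covered by `aligned`."""
    gaps = []
    next_uncovered = 1  # first position not yet covered by an aligned block
    for start, end in sorted(aligned):
        if start > next_uncovered:
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= genome_length:
        gaps.append((next_uncovered, genome_length))
    return gaps


def total_length(intervals):
    """Sum of interval lengths (assumes the intervals do not overlap)."""
    return sum(end - start + 1 for start, end in intervals)


# Hypothetical aligned blocks on a 1,000 bp toy genome.
aligned_blocks = [(1, 400), (451, 900)]
gaps = unaligned_intervals(aligned_blocks, 1000)
print(gaps)
print(total_length(aligned_blocks) + total_length(gaps))
```

In this toy case the aligned and unaligned lengths add back up to the genome length, which is the same bookkeeping that accounts for the difference between 4043846 and 4023750 in the question.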
> For example, my reference genome shrinks from 4043846 to 4023750, as do the rest of the aligned genomes. Those missing bases throw off the annotation results from the GWAS on the SNPs since I'm using the reference genome's GenBank file.
I'm not sure I fully understand. Assuming you are using an existing annotation for the reference, you will still be able to look up the gene annotation for each SNP despite the roughly 21 kbp of unaligned sequence. The XMFA file provides genome-specific coordinates for all aligned regions, and the VCF file lists the reference-based coordinates of each SNP. Neither of these files/positions will be affected by the unaligned sequence, unless you are trying to annotate the alignment and/or SNPs contained within the unaligned region(s).
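As a rough sketch of that lookup (again, not part of parsnp), the snippet below maps reference-based SNP positions to genes. The gene table and SNP positions are made up for illustration, and it assumes the genes are sorted by start and do not overlap. In practice the intervals would come from the reference GenBank file (for example via a parser such as Biopython) and the positions from the POS column of the VCF.

```python
import bisect

# Hypothetical gene table in reference coordinates:
# (start, end, name), 1-based inclusive, sorted by start, non-overlapping.
GENES = [(100, 500, "geneA"), (800, 1200, "geneB"), (1500, 2100, "geneC")]
STARTS = [gene[0] for gene in GENES]


def gene_at(pos):
    """Return the name of the gene containing `pos`, or None if intergenic."""
    # Index of the last gene whose start is <= pos.
    i = bisect.bisect_right(STARTS, pos) - 1
    if i >= 0:
        start, end, name = GENES[i]
        if start <= pos <= end:
            return name
    return None


# Hypothetical reference-based SNP positions (as in the VCF POS column).
snp_positions = [250, 650, 1200, 2101]
for pos in snp_positions:
    print(pos, gene_at(pos) or "intergenic")
```

Because the positions are already in reference coordinates, no offset for the unaligned sequence is needed. That correction would only come into play if the SNP positions were taken from columns of the concatenated core alignment instead.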
Hi-
Is there a way to create an alignment with parsnp/harvesttools that includes the unaligned sequences in addition to the core sequences? I have 99% coverage in my genome alignments, but about 21 kbp is still omitted from the alignment when creating the XMFA or multi-FASTA alignment file. For example, my reference genome shrinks from 4043846 to 4023750, as do the rest of the aligned genomes. Those missing bases throw off the annotation results from the GWAS on the SNPs since I'm using the reference genome's GenBank file.
Suggestions?
Thanks!
Glen