justin-lack / Drosophila-Genome-Nexus

The Drosophila Genome Nexus is a repository of more than 600 Drosophila melanogaster genomes from Africa, Europe, and North America. All scripts and pipeline commands used to generate these genome sequences are available here.

VCF_to_Seq_diploid.pl trimming error #1

Closed peterdfields closed 9 years ago

peterdfields commented 9 years ago

Everything in the pipeline looks fine until I reach this step. The VCF files look okay and have the correct reference length, but when I run the VCF_to_Seq_diploid.pl script the resulting sequence files are severely shortened, from ~160 kb down to 20 kb or less in some cases. Running vcf-consensus on the updated reference produces output of the correct length, though I assume the script does considerably more, such as masking. I can provide sample VCF files if that would help determine what is going wrong.
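For anyone trying to reproduce the length mismatch described above, a minimal Python sketch of the comparison might look like the following. It is not part of the Drosophila Genome Nexus pipeline. It assumes the VCF header carries `##contig=<ID=...,length=...>` lines and that the script's output is plain sequence text (optionally with FASTA-style `>` header lines); the function names `contig_lengths` and `sequence_length` are hypothetical.

```python
from typing import Dict, Iterable


def contig_lengths(vcf_lines: Iterable[str]) -> Dict[str, int]:
    """Collect contig lengths declared in the VCF header.

    Assumes simple, unquoted key=value pairs inside ##contig=<...>.
    """
    lengths: Dict[str, int] = {}
    for line in vcf_lines:
        if not line.startswith("#"):
            break  # header is over; variant records follow
        if line.startswith("##contig=<"):
            body = line.strip()[len("##contig=<"):-1]  # drop trailing '>'
            fields = dict(kv.split("=", 1) for kv in body.split(",") if "=" in kv)
            if "ID" in fields and "length" in fields:
                lengths[fields["ID"]] = int(fields["length"])
    return lengths


def sequence_length(seq_lines: Iterable[str]) -> int:
    """Count sequence characters, skipping any FASTA-style header lines."""
    return sum(len(line.strip()) for line in seq_lines if not line.startswith(">"))


if __name__ == "__main__":
    # Toy data standing in for a real VCF and a real output sequence file.
    vcf = [
        "##fileformat=VCFv4.1\n",
        "##contig=<ID=scaffold1,length=12>\n",
        "#CHROM\tPOS\tID\tREF\tALT\n",
    ]
    seq = ["ACGTNN\n"]
    expected = contig_lengths(vcf)["scaffold1"]
    observed = sequence_length(seq)
    print(f"expected {expected} bp, observed {observed} bp, "
          f"missing {expected - observed} bp")
```

With real files, the two functions can be fed open file handles instead of the toy lists; a large positive difference would correspond to the truncation reported here.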

justin-lack commented 9 years ago

Please send me an example vcf (the input for the script) as well as a sample of the script you are using as input into the IndelShift.pl script. Also, are you running this on a different taxon than D. melanogaster? If so, which taxon and which reference are you mapping to?

