chasewnelson / SNPGenie

Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data
GNU General Public License v3.0

Two products have the same starting position, causing an error. #6

Closed peterdfields closed 8 years ago

peterdfields commented 8 years ago

Hi,

I'm trying to run snpgenie-vcf2revcom.pl on a vcf file generated using GATK HaplotypeCaller, a fasta reference, and a gtf file generated using snpgenie-gff2gtf.pl. When I try to run the script I get the following error:

seq length is 129543483

Two products have the same starting position, causing an error. Please contact script author for a revision.

Please let me know what information you would need to help troubleshoot this error.

singing-scientist commented 8 years ago

Hello,

Thanks very much for contacting me regarding SNPGenie. I'd be happy to troubleshoot this for you. It would be easiest if you could supply a representative SNP report (partial is OK, so long as all variant types are represented) with the corresponding GFF and FASTA. It may be that I have not configured SNPGenie to handle multiple ORFs (genes) starting at the same site, and left this warning as a note-to-self. Do let me know.

Best, Chase
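For anyone who hits the same message, the possibility raised here (two products starting at the same site) can be checked directly in the GTF before running SNPGenie. The following is only a rough Python sketch, not part of SNPGenie: the function name `shared_starts` is hypothetical, and it assumes a standard nine-column, tab-separated GTF with `CDS` features and a `gene_id "..."` attribute. How the Perl script itself detects the collision is not stated in this thread.

```python
import re
from collections import defaultdict


def shared_starts(gtf_lines, feature="CDS"):
    """Return {start: [(seqname, gene_id), ...]} for starts shared by 2+ products.

    Hypothetical helper, not part of SNPGenie. Assumes tab-separated GTF
    lines with the product name stored as gene_id "NAME" in column 9.
    """
    first_start = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        match = re.search(r'gene_id "([^"]+)"', cols[8])
        if not match:
            continue
        key = (cols[0], match.group(1))  # (scaffold, product)
        start = int(cols[3])
        # Keep the leftmost coordinate seen for each product.
        first_start[key] = min(start, first_start.get(key, start))

    by_start = defaultdict(set)
    for key, start in first_start.items():
        by_start[start].add(key)
    # Report only coordinates claimed by more than one product.
    return {s: sorted(keys) for s, keys in by_start.items() if len(keys) > 1}
```

Passing an open GTF file handle to `shared_starts` returns an empty dict when every product has a unique start. Note that the comparison deliberately ignores the scaffold name, so products on different scaffolds that happen to share a coordinate are reported too.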

(In reply to peterdfields's message of Thu, Jul 7, 2016, 5:36 PM.)

Chase W. Nelson http://www.chasewnelson.com PhDc, Bioinformatics & Molecular Evolution, University of South Carolina http://www.biol.sc.edu/graduate_student/nelson Graduate Research Fellow, National Science Foundation http://www.nsfgrfp.org/ Voice / Saxophone / Piano, www.ChaseWNelson.com

"Everybody thinks of changing humanity and nobody thinks of changing himself." —Leo Tolstoy

peterdfields commented 8 years ago

So a vcf in this case?

singing-scientist commented 8 years ago

Absolutely, vcf is one possible SNP report format.

C

(In reply to peterdfields's message of Thu, Jul 7, 2016, 11:41 PM.)


peterdfields commented 8 years ago

Hi Chase,

You can download the necessary files using the following link: https://db.tt/pJLcgiqD

Please let me know if you have any questions about the file.

singing-scientist commented 8 years ago

Thanks very much, Peter. Unfortunately, all input files must conform to the specifications laid out in the SNPGenie documentation here at GitHub. This includes a single (assembled genome) sequence per reference FASTA, with SNPs called relative to that sequence. Please let me know if you have any questions after consulting the documentation, and whether you think SNPGenie is still an appropriate tool for your dataset.
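A quick way to check the single-sequence requirement described above is to count the header lines in the reference FASTA. This is a minimal Python sketch with a hypothetical function name; it only looks at lines beginning with `>` and does not validate the sequence itself.

```python
def fasta_sequence_names(lines):
    """Return the first word of every '>' header line, in file order.

    Hypothetical helper for illustration; a reference that meets the
    single-sequence requirement yields a list of length 1.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names
```

Calling it on an open file handle for the reference and checking that the returned list has exactly one entry is enough to tell a single assembled sequence from a multi-scaffold assembly.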

peterdfields commented 8 years ago

Ah, yes, sorry about that. I suppose I could try splitting the VCF, GFF, and FASTA files by scaffold and running each scaffold iteratively. Thank you again for your responses.

singing-scientist commented 8 years ago

Certainly. Please don't hesitate to write if I can answer any questions. If you do decide to take such a divide-and-conquer approach, I'd be happy to ensure that your VCF style is supported.
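The divide-and-conquer idea discussed above could look roughly like the following Python sketch. It is not part of SNPGenie, and all function names and the output naming scheme are assumptions made for illustration. It relies on two format facts: VCF and GTF records name their sequence in the first tab-separated column, and their coordinates are already relative to that sequence, so no positions need shifting. Whether SNPGenie accepts the resulting VCF style is not confirmed in this thread.

```python
from collections import defaultdict


def split_fasta(lines):
    """Group FASTA lines by record, keyed by the first word of each header."""
    records = defaultdict(list)
    name = None
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
        if name is not None:  # ignore anything before the first header
            records[name].append(line)
    return dict(records)


def split_tabular(lines):
    """Group VCF or GTF lines by column 1; return (header_lines, groups)."""
    header, groups = [], defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            header.append(line)  # '#' lines are kept for every output file
        elif line.strip():
            groups[line.split("\t", 1)[0]].append(line)
    return header, dict(groups)


def write_per_scaffold(fasta_path, vcf_path, gtf_path, out_prefix):
    """Write one FASTA/VCF/GTF trio per scaffold found in the FASTA.

    Assumes scaffold names are safe to use in file names.
    """
    with open(fasta_path) as fh:
        fasta = split_fasta(fh)
    with open(vcf_path) as fh:
        vcf_header, vcf = split_tabular(fh)
    with open(gtf_path) as fh:
        gtf_header, gtf = split_tabular(fh)

    for name, seq_lines in fasta.items():
        with open(f"{out_prefix}{name}.fasta", "w") as out:
            out.writelines(seq_lines)
        with open(f"{out_prefix}{name}.vcf", "w") as out:
            out.writelines(vcf_header + vcf.get(name, []))
        with open(f"{out_prefix}{name}.gtf", "w") as out:
            out.writelines(gtf_header + gtf.get(name, []))
```

Each scaffold's trio can then be run separately. Scaffolds with no variants or no annotated products still get header-only VCF or GTF files, which may be worth filtering out before running anything on them.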