edgardomortiz / vcf2phylip

Convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis
GNU General Public License v3.0

Error from vcf2phylip v2.3 when converting GATK vcf #27

Closed seanimal closed 4 years ago

seanimal commented 4 years ago

Hello. I got the following error when converting a GATK VCF to PHYLIP using vcf2phylip v2.3. The command I typed was:

  ./vcf2phylip.py -i input.vcf -r

Traceback (most recent call last):
  File "./vcf2phylip.py", line 447, in <module>
    main()
  File "./vcf2phylip.py", line 65, in main
    outgroup = args.outgroup.split(",").split(";")[0]
AttributeError: 'list' object has no attribute 'split'

Could you suggest a solution? The error did not occur when I used v2.0 instead of v2.3, but I would like to use the “-r, --resolve-IUPAC” option in v2.3.

In addition, I would like to know how the genotype is determined at heterozygous sites when the “-r, --resolve-IUPAC” option is used.

Thank you very much.

edgardomortiz commented 4 years ago

Try now (please re-clone the repository). I had introduced a silly bug in a safeguard I wrote earlier to allow only a single outgroup.

Edgardo
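For readers who hit the same traceback: `str.split` returns a list, so chaining a second `.split` onto it fails. The snippet below is only a minimal sketch of one way to keep the first name from a comma- or semicolon-separated string. The helper name `first_outgroup` is hypothetical, and this is not necessarily how the fix was made in the repository.

```python
import re

def first_outgroup(raw):
    """Return the first name from a comma- or semicolon-separated string.

    Hypothetical helper for illustration only.
    """
    # "A,B".split(",") returns a list, and lists have no .split method,
    # which is what raised the AttributeError in the traceback above.
    # Splitting on both delimiters in one pass avoids the chained call.
    return re.split(r"[,;]", raw)[0].strip()

print(first_outgroup("sampleA,sampleB"))  # sampleA
```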

seanimal commented 4 years ago

It worked well. Thank you very much.

This may not be the appropriate place to ask, but I have a question. How does this tool choose one allele at a heterozygous site (0/1, 0/2, 1/2…) when the “-r, --resolve-IUPAC” option is used?

edgardomortiz commented 4 years ago

It chooses at random:

  -r, --resolve-IUPAC   Randomly resolve heterozygous genotypes to avoid IUPAC
                        ambiguities in the matrices

edgardomortiz commented 4 years ago

I was thinking of adding another option to resolve heterozygotes according to the reference allele. Is that the behavior you were expecting?
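To make the alternatives discussed here concrete, the following is a rough sketch of the three ways a diploid heterozygous SNP call could be turned into one character: an IUPAC ambiguity code, a random pick, or a preference for the REF allele. The function `resolve_genotype`, its `mode` parameter, and the fallback for 1/2 genotypes are assumptions made for illustration. This is not vcf2phylip's actual code, and the REF-based mode is only the idea proposed above.

```python
import random

# IUPAC ambiguity codes for unordered pairs of distinct bases.
IUPAC = {
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
}

def resolve_genotype(gt, ref, alts, mode="iupac", rng=random):
    """Turn a diploid SNP genotype such as '0/1' into a single character.

    Hypothetical helper; assumes single-base REF and ALT alleles.
    """
    alleles = [ref] + list(alts)
    indices = gt.replace("|", "/").split("/")
    if "." in indices:
        return "N"  # missing genotype
    a, b = (alleles[int(i)] for i in indices)
    if a == b:
        return a  # homozygous: nothing to resolve
    if mode == "iupac":
        return IUPAC[frozenset((a, b))]
    if mode == "random":
        return rng.choice([a, b])
    if mode == "ref":
        # Prefer REF when it is one of the two alleles. For calls such as
        # 1/2 there is no REF allele, so this sketch arbitrarily keeps the
        # first allele listed in the genotype.
        return ref if ref in (a, b) else a
    raise ValueError(f"unknown mode: {mode}")

print(resolve_genotype("0/1", "A", ["G"]))              # R
print(resolve_genotype("0/1", "A", ["G"], mode="ref"))  # A
```

With `mode="random"`, repeated runs can give different matrices unless a seeded `random.Random` instance is passed as `rng`.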

seanimal commented 4 years ago

I had overlooked that. Thank you for your kindness.

Yes. Since different phylogenies were inferred from the IUPAC and the randomly resolved genotypes in my study, I'm looking for a better way to convert SNPs into a sequence.

edgardomortiz commented 4 years ago

Just to be clear, would you benefit from an option that resolves a heterozygote according to the REF allele instead of choosing randomly? Or is the random selection good enough for your purposes?

seanimal commented 4 years ago

I seem to have misunderstood. Your option is different from what I expected. I’m so sorry, and thank you for your kind proposal.

My purpose is to construct sequences from SNPs in resequenced samples and then infer a phylogeny. I guess the difference between the IUPAC and random results was caused by false heterozygous calls, because both methods accept such errors.
So I would like to use some criterion to choose the more plausible allele at a heterozygous site, but I don’t have a good criterion yet. I think that IUPAC is more pragmatic and reasonable than random resolution for my study. Your tool is easier and faster than other tools, so I will keep using it.

I appreciate your help.

edgardomortiz commented 4 years ago

I will close the issue now; please feel free to re-open it.