edgardomortiz / vcf2phylip

Convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis
GNU General Public License v3.0

Phylogenetic tree #28

Closed devenderarora closed 3 years ago

devenderarora commented 4 years ago

Dear Sir, is there any guidance on how we can import the .py-generated file into RAxML, IQ-TREE, and MrBayes? I have not been able to do it successfully. Is there an example for this? Thanks, Devender Arora

edgardomortiz commented 4 years ago

vcf2phylip only transforms VCF files into a matrix (.phy, .fasta, .nexus) that can then be used in any phylogenetic inference software. I guess this is a typo, but what do you mean by importing the .py-generated file? vcf2phylip generates matrices with the extensions .nex, .fasta, or .phy. For example, running a PHYLIP (.phy) matrix of SNPs in IQ-TREE 2 would look something like this: iqtree2 -s example.phy -st DNA -m GTR+ASC

What commands did you try that were not working?

Edgardo

devenderarora commented 4 years ago

I tried iqtree -s my_file.phy and it ended with a failed composition chi2 test. The .phy file looks like it is composed of amino acid sequences. Do I need to add the option -st PROTEIN?

edgardomortiz commented 4 years ago

But you have nucleotides, not amino acids (am I right?). Try using -st DNA.
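A possible reason the matrix looks like protein: the IUPAC ambiguity codes used for heterozygous SNP calls (R, Y, S, W, K, M) are also valid one-letter amino acid codes. The following is a minimal Python sketch, not part of vcf2phylip or IQ-TREE, for counting how many characters per sample are plain bases, ambiguity codes, or missing data. The function name `summarize_phylip` and the one-sample-per-line layout are assumptions made for illustration.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")
AMBIGUOUS = set("RYSWKMBDHV")  # IUPAC codes standing for two or three bases
MISSING = set("N?-")


def summarize_phylip(lines):
    """Count character classes per sample in a sequential PHYLIP matrix.

    Assumes a 'ntax nchar' header followed by one sample per line
    (hypothetical layout for this sketch; interleaved files are not handled).
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    summary = {}
    for fields in rows[1:]:  # skip the header line
        name, seq = fields[0], "".join(fields[1:]).upper()
        counts = Counter()
        for ch in seq:
            if ch in UNAMBIGUOUS:
                counts["unambiguous"] += 1
            elif ch in AMBIGUOUS:
                counts["ambiguous"] += 1
            elif ch in MISSING:
                counts["missing"] += 1
            else:
                counts["other"] += 1
        summary[name] = dict(counts)
    return summary


if __name__ == "__main__":
    # Made-up example data, not taken from any real run.
    example = [
        "2 8",
        "sampleA   ACGTRYNN",
        "sampleB   RWMRRYYY",
    ]
    for name, counts in summarize_phylip(example).items():
        print(name, counts)
```

If most characters fall in the "ambiguous" class, passing -st DNA explicitly, as suggested above, avoids relying on automatic sequence-type detection.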

devenderarora commented 4 years ago

Yes, it is nucleotides. I cross-checked and re-ran the file with -st DNA.

splaisan commented 2 years ago

In case this helps others: I had a very similar issue that came from multiallelic calls resulting in IUPAC letters. I ran the vcf2phylip command again with -r to resolve multiallelic calls to a randomly chosen base, and removed non-SNP calls from my input to simplify the alignment problem. I also required that all my samples be genotyped, using -m <smpl#>. IQ-TREE now computes all possible model combinations, but at least it only reported 1 failing sample, where all failed before.
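To illustrate what resolving an ambiguity code to a randomly chosen base means, here is a small Python sketch. It shows the general idea only and is not vcf2phylip's actual implementation of -r; the function name `resolve_iupac` is hypothetical. Only the two-base codes are resolved, and every other character (including N, ?, and -) is left unchanged.

```python
import random

# Two-base IUPAC ambiguity codes and the bases each one stands for.
IUPAC_TWO_BASE = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def resolve_iupac(seq, rng=None):
    """Replace each two-base ambiguity code with one of its bases at random.

    Pass a seeded random.Random instance as rng for reproducible output.
    """
    rng = rng or random.Random()
    return "".join(
        rng.choice(IUPAC_TWO_BASE[ch]) if ch in IUPAC_TWO_BASE else ch
        for ch in seq.upper()
    )


if __name__ == "__main__":
    print(resolve_iupac("RWMRRYYY", random.Random(42)))
```

Because the choice is random, repeated runs without a fixed seed can yield different matrices from the same input.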

FionaMoon commented 1 year ago

Do you mean the sequence like this?

$ cat Pt-12_ALL.min3.phy  | less -S
3 376745
SRR10874860   RWMRRYYYRRRSRRSKRRMRSKRYRSSKMMRRYSSYWKRRYYYRSMYYYSRRWRRRRYKSRRYSRRYRKRMRRRR
SRR10874861   RWMRRYYYRRRSRRSKRRMRSKRYRSSKMMRRYSSYWKRRYYYRSMYYYSRRWRRRRYKSRRYSRRYRKRMRRRR
SRR10874859   RWMRRYYYRRRSRRSKRRMRSKRYRSSKMMRRYSSYWKRRYYYRSMYYYSRRWRRRRYKSRRYSRRYRKRMRRRR

I ran GATK on my samples first, then converted the vcf.gz to .phy with vcf2phylip. The result seems quite strange.
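In the excerpt above, the visible columns are identical across all three samples and consist entirely of ambiguity codes. One way to check whether that holds for the whole matrix, and not just the first screenful, is to list the columns where the samples actually differ. This is a rough Python sketch; the function name `variable_columns` and the dictionary input format are assumptions for illustration.

```python
def variable_columns(sequences):
    """Return 0-based column indices where the samples do not all share one character.

    sequences maps sample name -> aligned sequence string (hypothetical format).
    """
    lengths = {len(s) for s in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty and all the same length")
    columns = zip(*sequences.values())
    return [i for i, col in enumerate(columns) if len(set(col)) > 1]


if __name__ == "__main__":
    # Made-up example data: only the last column differs among samples.
    example = {"s1": "RWMA", "s2": "RWMC", "s3": "RWMA"}
    print(variable_columns(example))
```

An empty list would mean that every column is constant across samples, in which case the matrix carries no signal for tree inference.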