neherlab / treetime

Maximum likelihood inference of time stamped phylogenies and ancestral reconstruction
MIT License

More than one record found in handle #247

Open ashotmarg opened 1 year ago

ashotmarg commented 1 year ago

Hi,

Thanks a lot for creating treetime. It looks like a very useful tool, and I wanted to use it to reconstruct the ancestral sequences of my yeast genomes.

I ran: treetime ancestral --aln myVCF.vcf.gz --vcf-reference R64-1-1.genome.fa --outdir testOut --tree myTree.newick

but I am getting an error "ValueError: More than one record found in handle" from SeqIO.

After some quick googling, this page came up, which seems to match the error I am getting. Do I understand correctly that this is because my reference fasta file has multiple chromosomes? I am using the R64-1-1 yeast reference genome. If this is the case, do you know how to fix or bypass this error?

Thanks in advance! Ashot

rneher commented 1 year ago

Yes, treetime doesn't handle VCF files with multiple chromosomes at the moment. The only quick fix I can think of is to concatenate the chromosomes and shift the positions in the VCF accordingly -- sorry.
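The concatenation workaround suggested above could be sketched roughly as follows. This is a minimal illustration in plain Python, not part of treetime: the function names and the merged chromosome name `concat` are hypothetical, the VCF is assumed to be uncompressed and tab-delimited, and dropping the `##contig` header lines is an assumption rather than something the thread confirms is needed.

```python
def read_fasta(lines):
    """Return an ordered list of (name, sequence) pairs from FASTA lines."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first whitespace-delimited token as the record name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def chromosome_offsets(records):
    """Map each chromosome name to the number of bases that precede it."""
    offsets, total = {}, 0
    for name, seq in records:
        offsets[name] = total
        total += len(seq)
    return offsets


def concatenated_sequence(records):
    """Join all chromosomes, in file order, into one sequence."""
    return "".join(seq for _, seq in records)


def shift_vcf_lines(lines, offsets, new_chrom="concat"):
    """Rewrite CHROM/POS so every variant refers to the concatenated sequence."""
    for line in lines:
        if line.startswith("##contig"):
            continue  # assumption: old per-chromosome contig lines are dropped
        if line.startswith("#"):
            yield line  # other header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        # VCF positions are 1-based, so adding the offset keeps them 1-based.
        fields[1] = str(int(fields[1]) + offsets[fields[0]])
        fields[0] = new_chrom
        yield "\t".join(fields) + "\n"
```

The concatenated sequence would then be written out as a single-record FASTA and passed as the reference together with the shifted VCF. Chromosome names in the VCF have to match the FASTA record names exactly for the offset lookup to work.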

ashotmarg commented 1 year ago

Thanks for the feedback @rneher!

I was hoping to simply extract one chromosome (e.g., chromosome 1) from both the fasta and the VCF, run treetime on that chromosome alone, and then loop over the remaining chromosomes. Unfortunately, that didn't work either... Do you think it's related to the VCF header?
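For reference, the per-chromosome split described above might look something like the sketch below. It is plain Python with hypothetical function names, assumes uncompressed text input, and assumes the chromosome is named identically in the FASTA header and in the VCF `CHROM` column. It also trims `##contig` header lines to the selected chromosome, which is only a guess at what a header-related problem might involve; the thread does not establish the cause of the second error.

```python
def extract_fasta_record(lines, chrom):
    """Yield only the FASTA lines that belong to the record named `chrom`."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            # Compare against the first whitespace-delimited token of the header.
            keep = bool(parts) and parts[0] == chrom
        if keep:
            yield line


def extract_vcf_chrom(lines, chrom):
    """Yield header lines plus the data lines whose CHROM column equals `chrom`."""
    for line in lines:
        if line.startswith("##contig"):
            # Keep only the contig definition of the selected chromosome.
            if f"ID={chrom}," in line or f"ID={chrom}>" in line:
                yield line
            continue
        if line.startswith("#") or line.split("\t", 1)[0] == chrom:
            yield line
```

Each function takes any iterable of lines, such as an open file handle, so the output can be written straight to a new single-chromosome FASTA or VCF file.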

rneher commented 1 year ago

Did you get the same error, or a different one? If your fasta sequence loads in SeqIO.read, then it should work unless there is a different problem.
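Biopython's `SeqIO.read` expects exactly one record and raises `ValueError` otherwise, which is where the message in the issue title comes from. A quick way to check whether a reference file meets that requirement, without needing Biopython installed, is to count the FASTA headers. The helper names below are hypothetical, and the check only looks at header lines, so it does not validate the sequence content.

```python
def fasta_record_names(lines):
    """Return the name (first header token) of every record in FASTA lines."""
    names = []
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            names.append(parts[0] if parts else "")
    return names


def is_single_record(lines):
    """True when the FASTA contains exactly one record."""
    return len(fasta_record_names(lines)) == 1
```

Passing an open file handle for the reference FASTA to `fasta_record_names` lists every record it contains; more than one name means `SeqIO.read` would reject the file.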

ashotmarg commented 1 year ago

I believe it was a different error. Unfortunately I cannot check right now, but I will get back to you next week! Thanks.