neherlab / treetime

Maximum likelihood inference of time stamped phylogenies and ancestral reconstruction
MIT License
222 stars, 55 forks

ERROR: Cannot read metadata: need at least one column that contains the taxon labels. #264

Closed timeresistance1996 closed 6 months ago

timeresistance1996 commented 7 months ago

Hi, I'm trying to use TreeTime, but I ran into this error:

Attempting to parse dates...
Traceback (most recent call last):
  File "/root/anaconda3/bin/treetime", line 8, in <module>
    sys.exit(main())
  File "/root/anaconda3/lib/python3.8/site-packages/treetime/main.py", line 21, in main
    return_code = params.func(params)
  File "/root/anaconda3/lib/python3.8/site-packages/treetime/argument_parser.py", line 237, in toplevel
    timetree(params)
  File "/root/anaconda3/lib/python3.8/site-packages/treetime/wrappers.py", line 306, in timetree
    dates = utils.parse_dates(params.dates, date_col=params.date_column, name_col=params.name_column)
  File "/root/anaconda3/lib/python3.8/site-packages/treetime/utils.py", line 336, in parse_dates
    raise err
  File "/root/anaconda3/lib/python3.8/site-packages/treetime/utils.py", line 285, in parse_dates
    raise MissingDataError("ERROR: Cannot read metadata: need at least one column that contains the taxon labels."
treetime.MissingDataError: ERROR: Cannot read metadata: need at least one column that contains the taxon labels. Looking for the first column that contains 'name', 'strain', or 'accession' in the header.

This also happens when I test with your example dataset [ebola](https://github.com/neherlab/treetime_examples/tree/master/data/ebola).

How can I deal with it?

corneliusroemer commented 7 months ago

Thanks for reporting! I'm happy to help see what's up. Can you share the full command you used to run on the ebola example that resulted in the error?

timeresistance1996 commented 7 months ago

Thank you for your reply!

I ran "treetime --aln ebola.fasta --tree ebola.nwk --dates ebola.metadata.csv" in the directory containing those files.

Plasmegnago commented 7 months ago

Hi. I'm new to TreeTime, but I had a similar error. I solved it by removing the ">" characters from the ".csv" file.

timeresistance1996 commented 7 months ago

Hi Plasmegnago,

Thanks for that. But are you certain that there is a ">" in the ".csv" file?

As far as I can see, "ebola.metadata.csv" does not contain one.

Plasmegnago commented 7 months ago

I am sorry. I didn't check "ebola.metadata.csv"; clearly the error is not the same. I downloaded the same dataset and tried your command "treetime --aln ebola.fasta --tree ebola.nwk --dates ebola.metadata.csv". It works. Maybe check your ".csv" file. When I open it with a text editor (Notepad++), it looks like this:

    name, date
    EM_COY_2015_015982, 2015.30
    G3676, 2014.40
    EM_COY_2015_015980, 2015.30
    G3670, 2014.40
    CON-10590, 2015.61
    NM042, 2014.42
    EM_079497, 2014.26
    2507_C2_10080_EMLK, 2015.06
    G3677, 2014.40
    LIBR10180, 2014.77
    KG12, 2015.40
    G3770, 2014.44
    Makona-G5114.1, 2014.63
    EM_COY_2015_023747, 2015.69
    EM_FORE_2015_1023, 2015.49
    J0170, 2014.86

I hope this will help.
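
The error message says the parser looks for a header column named 'name', 'strain', or 'accession', and the file shown above starts with the header "name, date". A minimal Python sketch of that kind of header check follows. It is not TreeTime's actual code: the helper `find_label_columns` is hypothetical, and matching the column name exactly (ignoring case and surrounding whitespace) is an assumption made for illustration.

```python
# Hypothetical helper for illustration only; not part of TreeTime's API.
LABEL_NAMES = ("name", "strain", "accession")


def find_label_columns(header_line, delimiter=","):
    """Return the header fields that look like taxon-label columns.

    Assumption: a field qualifies if, after stripping whitespace and
    lowercasing, it is exactly one of LABEL_NAMES.
    """
    fields = [field.strip() for field in header_line.split(delimiter)]
    return [field for field in fields if field.lower() in LABEL_NAMES]


# Comma-separated header, as in the listing above.
print(find_label_columns("name, date"))   # ['name']

# Same header saved with tabs: splitting on "," yields one field,
# "name\tdate", which matches none of the label names.
print(find_label_columns("name\tdate"))   # []
```

If the file is actually tab- or semicolon-separated, splitting on commas leaves the whole header as a single field, so no label column is found even though the word "name" is visibly present.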

rneher commented 7 months ago

@timeresistance1996 and @Plasmegnago

thanks for using TreeTime!

I ran the command you posted, "treetime --aln ebola.fasta --tree ebola.nwk --dates ebola.metadata.csv", in the ebola data directory and it completed without problems. This is with Python 3.12 on Ubuntu. Do you still have this problem? If so, could you provide some more detail?

best, richard

timeresistance1996 commented 6 months ago

@rneher and @Plasmegnago,

This happened because I had copied the wrong delimiter. I have resolved the issue. Thank you so much.
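
Since the cause turned out to be a delimiter mismatch, a quick look at the header line can show which separator a metadata file really uses. The sketch below is a hypothetical helper, not part of TreeTime. It simply counts a few candidate delimiters (an assumed list) in the header and returns the most frequent one.

```python
# Hypothetical helper for illustration only; not part of TreeTime's API.
def guess_delimiter(header_line, candidates=(",", "\t", ";")):
    """Return the candidate delimiter that occurs most often in the header.

    Returns None if none of the candidates appears at all.
    Assumption: the candidate list covers the separators of interest.
    """
    counts = {candidate: header_line.count(candidate) for candidate in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


print(repr(guess_delimiter("name, date")))    # ','
print(repr(guess_delimiter("name\tdate")))    # '\t'
print(repr(guess_delimiter("name")))          # None
```

To apply it to a real file, read the first line (for example with `open(path).readline()`) and compare the result with the separator the file extension suggests.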