Open ShannonDaddy opened 2 years ago
Hi @ShannonDaddy. This is something that I've been considering since I implemented taxid_from_name, but I was worried about the significant increase in memory usage. As far as I know, ETE3 uses a SQLite database, so memory is not really a problem for them.
That said, I can add a load_synonyms parameter (disabled by default) that would allow synonyms and equivalent names to be added to the database. Is this feature urgent for you?
It's not urgent for me. For now, I just create the Taxon object directly from the taxid. You can take your time adding the new feature. Thanks for the quick response.
I would like this feature as well! I have used ete3, but I prefer this library for the LCA functions and because it works in situations where memory is available, but persistent disk space is not.
Thanks for the feedback. This looks like a feature that would be useful to a lot of people (I also ended up needing it recently). I'll think about how to implement it without hugely increasing memory usage as soon as I get some free time.
Hi, when I call the taxid_from_name function to get a taxid, I get a warning.
My code:

    import taxopy

    ncbi_taxdb_dir = "database/ncbi_taxonomy"
    taxdb = taxopy.TaxDb(
        nodes_dmp=f"{ncbi_taxdb_dir}/nodes.dmp",
        names_dmp=f"{ncbi_taxdb_dir}/names.dmp",
        merged_dmp=f"{ncbi_taxdb_dir}/merged.dmp",
        keep_files=True,
    )
    taxid_list = taxopy.taxid_from_name('Lactobacillus fermentum', taxdb)
    print(taxid_list)

The console output:

    []
    C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\taxopy\utilities.py:54: Warning: The input name was not found in the taxonomy database.
      warnings.warn("The input name was not found in the taxonomy database.", Warning)
Then I checked names.dmp and found that 'Lactobacillus fermentum' is a synonym; the scientific name is 'Limosilactobacillus fermentum'. When I use the scientific name in the code, the output is fine.
Would it be possible for taxid_from_name to support synonyms and equivalent names, as the ete3 Python package does?
Thanks a lot!
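In the meantime, a name-to-taxid index that includes synonyms can be built directly from names.dmp. The following is a minimal sketch, not part of taxopy: the load_name_index helper and the SAMPLE text are hypothetical and only illustrate the names.dmp column layout (taxid, name, unique name, name class). It assumes the standard NCBI dump format, where fields are separated by a tab, a pipe, and a tab.

```python
import io
from collections import defaultdict

# Illustrative lines in names.dmp format (not a full dump).
SAMPLE = (
    "1613\t|\tLimosilactobacillus fermentum\t|\t\t|\tscientific name\t|\n"
    "1613\t|\tLactobacillus fermentum\t|\t\t|\tsynonym\t|\n"
    "1613\t|\tLactobacillus fermentum Beijerinck 1901\t|\t\t|\tauthority\t|\n"
)


def load_name_index(handle, name_classes=("scientific name", "synonym", "equivalent name")):
    """Hypothetical helper: map each name to the taxids that carry it.

    Only rows whose name class is listed in `name_classes` are kept.
    """
    index = defaultdict(list)
    for line in handle:
        # Drop the trailing "\t|" terminator, then split on the field separator.
        fields = line.rstrip("\n").rstrip("\t|").split("\t|\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        taxid, name, _unique_name, name_class = fields[:4]
        if name_class in name_classes and int(taxid) not in index[name]:
            index[name].append(int(taxid))
    return dict(index)


index = load_name_index(io.StringIO(SAMPLE))
print(index.get("Lactobacillus fermentum", []))
```

With a real dump, the same function could be called on an open file, for example `with open("database/ncbi_taxonomy/names.dmp") as fh: index = load_name_index(fh)`, and the resulting taxid passed to taxopy.Taxon. Keeping every synonym in a plain dict does increase memory usage, which is the trade-off discussed in this thread.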