Closed slaperriere closed 4 years ago
Thanks for reporting this.
taxonkit name2taxid searches both scientific names and synonyms, and 629395 has a synonym of Bacteria:
...
629395 | Bacteria | Bacteria <stick insect> | synonym |
629395 | Bacteria Latreille et al. 1825 | | scientific name |
629395 | Bacteria Latreille, Peletier de Saint Fargeau, Serville & Guerin, 1825 | | authority |
629395 | Bacteria stick insect | | common name |
A new flag, -s/--sci-name, has been added to search only scientific names.
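To illustrate why "Bacteria" resolves to two taxids, here is a rough sketch of the matching rule described above, run against names.dmp-style records. It is not taxonkit's implementation: match_name and sample_names are hypothetical helpers, and the row for taxid 2 is an assumption inferred from the reported output (its unique-name column is left empty).

```shell
# Hypothetical helpers for illustration only; not part of taxonkit.
# Input: names.dmp-style lines, fields separated by <TAB>|<TAB>,
# each line ending in <TAB>|. Columns: taxid, name, unique name, name class.
# $1 = name to look up; $2 = "sci" to accept only the "scientific name" class.
match_name() {
    awk -F'\t[|]\t' -v q="$1" -v mode="${2:-all}" '
        {
            sub(/\t[|]$/, "", $NF)            # drop the trailing <TAB>|
            if ($2 != q) next                 # name must match exactly
            is_sci = ($4 == "scientific name")
            is_syn = ($4 == "synonym")
            # default: scientific names and synonyms; "sci": scientific names only
            if (is_sci || (mode != "sci" && is_syn)) print $1 "\t" $4
        }'
}

# Sample records: the 629395 rows come from the excerpt above;
# the taxid 2 row is assumed (unique-name column left empty).
sample_names() {
    printf '2\t|\tBacteria\t|\t\t|\tscientific name\t|\n'
    printf '629395\t|\tBacteria\t|\tBacteria <stick insect>\t|\tsynonym\t|\n'
    printf '629395\t|\tBacteria Latreille et al. 1825\t|\t\t|\tscientific name\t|\n'
}

sample_names | match_name Bacteria       # prints taxid 2 and taxid 629395
sample_names | match_name Bacteria sci   # prints taxid 2 only
```

With the default mode both rows named "Bacteria" match, which mirrors the duplicated output in the report; restricting to the scientific-name class leaves only taxid 2.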
Great, thank you! It looks like it solved most of the problem.
However, I am still getting some duplicates. Some examples:
ESP_48538 Paracoccus 265
ESP_48538 Paracoccus 249411
ESP_764 Actinobacteria 1760
ESP_764 Actinobacteria 201174
ESP_17204 Vertebrata 7742
ESP_17204 Vertebrata 1261581
It's not a bug if you have switched on -s. Some taxids really do share the same scientific name; you can check their lineages to tell them apart. For these names, the input line is repeated once per matching taxid. You can deduplicate the lines using awk or csvtk, or I can add a new flag.
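As a minimal sketch of the awk route, assuming the name2taxid output is tab-separated with columns id, name, taxid (as in the examples in this thread): the two helper functions below are hypothetical and not part of taxonkit. The first keeps only the first taxid reported per input line; the second lists the names that map to more than one taxid so their lineages can be checked by hand.

```shell
# Hypothetical helpers, not part of taxonkit.
# Input on stdin: tab-separated lines "id<TAB>name<TAB>taxid".

# Keep only the first taxid reported for each (id, name) pair.
keep_first_taxid() {
    awk -F'\t' '!seen[$1 FS $2]++'
}

# Print each name that maps to more than one distinct taxid,
# followed by a comma-separated list of those taxids.
list_ambiguous_names() {
    awk -F'\t' '
        !pair[$2 FS $3]++ {
            n[$2]++
            ids[$2] = (ids[$2] == "" ? $3 : ids[$2] "," $3)
        }
        END { for (name in n) if (n[name] > 1) print name "\t" ids[name] }
    ' | sort
}

# Example with two of the duplicated names reported above.
sample_hits() {
    printf 'ESP_48538\tParacoccus\t265\n'
    printf 'ESP_48538\tParacoccus\t249411\n'
    printf 'ESP_764\tActinobacteria\t1760\n'
    printf 'ESP_764\tActinobacteria\t201174\n'
}

sample_hits | keep_first_taxid
sample_hits | list_ambiguous_names
```

Note that keeping the first taxid is an arbitrary choice: nothing guarantees that the first hit is the intended taxon, so it is safer to review the names listed by the second helper before discarding rows.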
@slaperriere Can I close this issue?
Yes. Thank you for your help!
Hello,
I am getting duplicate values from name2taxid when running
taxonkit name2taxid -i 2 filename
My input:
ESP_3 Bacteria
ESP_84 Bacteria
ESP_136 Bacteria
ESP_149 Bacteria
ESP_166 Bacteria
ESP_169 Bacteria
ESP_181 Bacteria
ESP_187 Bacteria
ESP_196 Bacteria
Output:
ESP_3 Bacteria 2
ESP_3 Bacteria 629395
ESP_84 Bacteria 2
ESP_84 Bacteria 629395
ESP_136 Bacteria 2
ESP_136 Bacteria 629395
ESP_149 Bacteria 2
ESP_149 Bacteria 629395
ESP_166 Bacteria 2
ESP_166 Bacteria 629395
As seen above, some lines are duplicated with a different taxid. There are no duplicates in the input.
Do you know what could be causing this?
Thank you!