zyxue / ncbitax2lin

🐞 Convert NCBI taxonomy dump into lineages
MIT License
138 stars 29 forks

suggestion: output lineage tax id instead of full name #17

Open mewu3 opened 2 years ago

mewu3 commented 2 years ago

Would it be possible to output lineage tax IDs instead of full names?

taxid   kingdom phylum  class   order   family  genus   species
11138   10239   2732408 2732506 76804   11118   694002  694005
123595  10239   2732408 2732506 76804   11118   694002  694005
11138   10239   2732408 2732506 76804   11118   694002  694005
11138   10239   2732408 2732506 76804   11118   694002  694005
11128   10239   2732408 2732506 76804   11118   694002  694003
160235  10239   2732408 2732506 76804   11118   694013  694014
11120   10239   2732408 2732506 76804   11118   694013  694014
249065  10239   2732408 2732506 76804   11118   694002  694009
249069  10239   2732408 2732506 76804   11118   694002  694009
258508  10239   2732408 2732506 76804   11118   694002  694009
11120   10239   2732408 2732506 76804   11118   694013  694014
270642  10239   2732408 2732506 76804   11118   693996  277944
267385  10239   2732408 2732506 76804   11118   694002  694009
31631   10239   2732408 2732506 76804   11118   694002  694003
11120   10239   2732408 2732506 76804   11118   694013  694014
694009  10239   2732408 2732506 76804   11118   694002  694009

zyxue commented 2 years ago

Feel free to send a pull request. You may add an option like --output-tax_id.

Xueliang24 commented 1 year ago

Where can I find out more about the options for ncbitax2lin? The help information is minimal.

Xueliang24 commented 1 year ago

Feel free to send a pull request. You may add an option like --output=tax_id.

I tried the option --output=tax_id, but it only changed the name of the output file to "tax_id".

Xueliang24 commented 1 year ago

Would it be possible to output lineage tax IDs instead of full names?

taxid   kingdom phylum  class   order   family  genus   species
11138   10239   2732408 2732506 76804   11118   694002  694005
123595  10239   2732408 2732506 76804   11118   694002  694005
11138   10239   2732408 2732506 76804   11118   694002  694005
11138   10239   2732408 2732506 76804   11118   694002  694005
11128   10239   2732408 2732506 76804   11118   694002  694003
160235  10239   2732408 2732506 76804   11118   694013  694014
11120   10239   2732408 2732506 76804   11118   694013  694014
249065  10239   2732408 2732506 76804   11118   694002  694009
249069  10239   2732408 2732506 76804   11118   694002  694009
258508  10239   2732408 2732506 76804   11118   694002  694009
11120   10239   2732408 2732506 76804   11118   694013  694014
270642  10239   2732408 2732506 76804   11118   693996  277944
267385  10239   2732408 2732506 76804   11118   694002  694009
31631   10239   2732408 2732506 76804   11118   694002  694003
11120   10239   2732408 2732506 76804   11118   694013  694014
694009  10239   2732408 2732506 76804   11118   694002  694009

Were you able to get this output using ncbitax2lin --nodes-file taxdump/nodes.dmp --names-file taxdump/names.dmp? If not, could you share your code?

zyxue commented 1 year ago

@Xueliang24, sorry for the confusion. I meant --output-tax-id instead, but it isn't implemented yet, so it isn't expected to work.
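
Until such an option exists, ID-based lineages like the table requested above could be derived from nodes.dmp alone by following parent links. The following is a minimal sketch, not ncbitax2lin's own code. It assumes the standard NCBI taxdump layout, in which the first three `|`-separated fields of each line are the tax ID, the parent tax ID, and the rank. The function names, the `RANKS` list, and the hand-written `SAMPLE` are illustrative only: the sample is a simplified chain built from the IDs in the 694009 row, with assumed rank labels, and a real dump contains intermediate nodes between them.

```python
import io

# Ranks to report, in output-column order (an assumption; adjust as needed).
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]

# Hypothetical, simplified sample in nodes.dmp style (NOT real dump content).
SAMPLE = "\n".join([
    "1\t|\t1\t|\tno rank\t|",
    "10239\t|\t1\t|\tsuperkingdom\t|",
    "2732408\t|\t10239\t|\tphylum\t|",
    "2732506\t|\t2732408\t|\tclass\t|",
    "76804\t|\t2732506\t|\torder\t|",
    "11118\t|\t76804\t|\tfamily\t|",
    "694002\t|\t11118\t|\tgenus\t|",
    "694009\t|\t694002\t|\tspecies\t|",
])


def read_nodes(handle):
    """Parse nodes.dmp-style lines into {tax_id: (parent_tax_id, rank)}."""
    nodes = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|")]
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def lineage_ids(tax_id, nodes, ranks=RANKS):
    """Walk from tax_id up to the root and return the tax ID at each rank.

    Ranks that never appear on the path are reported as None.
    """
    found = {}
    current = tax_id
    while current in nodes:
        parent, rank = nodes[current]
        if rank in ranks:
            found.setdefault(rank, current)  # keep the first (lowest) hit
        if parent == current:  # the root node is its own parent
            break
        current = parent
    return [found.get(rank) for rank in ranks]


nodes = read_nodes(io.StringIO(SAMPLE))
print(lineage_ids(694009, nodes))
```

For a real dump, an open file handle on taxdump/nodes.dmp could be passed to `read_nodes` in place of the sample. Names are not needed for this, so names.dmp is not read at all.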