shenwei356 / taxonkit

A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV
https://bioinf.shenwei.me/taxonkit
MIT License
361 stars, 29 forks

R bindings for taxonkit? #41

Closed johanneswerner closed 3 years ago

johanneswerner commented 3 years ago

Hello,

I am quite happy with taxonkit and discovered yesterday that there is also pytaxonkit. Is there an R package with R bindings to taxonkit as well?

If not, can anyone help me start developing one? (I have developed R packages in the past; I am just not sure how to call taxonkit's functionality from inside R.)

Thank you.

PS: I know an issue is not the best location for this request, unfortunately I did not have a better idea.

standage commented 3 years ago

I wrote Python bindings for TaxonKit recently, if you’d like to use that as a reference. I primarily used the Popen construct in Python’s subprocess library to call TaxonKit, and then loaded the output into dataframes (pandas) for handling in Python.

https://github.com/bioforensics/pytaxonkit/blob/9746225b1c0a9eff708790037e3b53e5d45ac235/pytaxonkit.py#L203-L211
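
To make the pattern concrete, here is a minimal sketch of the approach described above: start the command-line tool with Popen, write the taxids to its stdin, and parse the tab-separated output. It is not the pytaxonkit code itself. The helper names (`run_cli`, `parse_tsv`, `lineage`) are made up for illustration, and the sketch assumes `taxonkit` is on the PATH with its taxonomy data installed and that `taxonkit lineage` prints one `taxid<TAB>lineage` row per input taxid. It returns plain dicts to stay dependency-free; the rows could be passed to a pandas DataFrame instead.

```python
import subprocess


def run_cli(command, stdin_text):
    """Run a command, feed it text on stdin, and return its stdout as text."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    stdout, stderr = proc.communicate(stdin_text)
    if proc.returncode != 0:
        raise RuntimeError(f"{command[0]} failed: {stderr.strip()}")
    return stdout


def parse_tsv(text, columns):
    """Turn tab-separated output into a list of dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(dict(zip(columns, line.split("\t"))))
    return rows


def lineage(taxids, command=("taxonkit", "lineage")):
    """Hypothetical wrapper: one taxid per stdin line, one result row each.

    Assumes the command prints 'taxid<TAB>lineage' for every input line.
    """
    stdin_text = "\n".join(str(t) for t in taxids) + "\n"
    output = run_cli(list(command), stdin_text)
    return parse_tsv(output, ["taxid", "lineage"])


if __name__ == "__main__":
    # Requires a working taxonkit installation.
    for row in lineage([9606]):
        print(row["taxid"], row["lineage"])
```

The same structure should carry over to R: base R's `system2()` can pass text to a command through its `input` argument and capture the output with `stdout = TRUE`, and the captured lines can then be split on tabs into a data frame. I have not tested that side.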

It’s been a long time since I did any serious work in R, so I’m not sure what the best tools are for system calls. But I imagine R’s native dataframes would be suitable for storing most results.

I hope this helps.

johanneswerner commented 3 years ago

Thank you very much. This is indeed helpful; I am just not quite sure yet how to put this information to use. :-) Thank you for the clarification in your package.

shenwei356 commented 3 years ago

Sorry, I'm not familiar with R package development :(

johanneswerner commented 3 years ago

For me, everything is cleared up. Thank you again.