DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0

centrifuge on CGC: custom database and index format #185

Open BioRB opened 4 years ago

BioRB commented 4 years ago

I'm using Centrifuge on the cloud (CGC). CGC has ready-to-use apps to generate the index and to run the Centrifuge classifier, but I'm not able to make them work. I would like to use a custom database of sequences, but it is not clear to me how to do that. I cannot find a clear guide on how to generate the conversion table, the taxonomy tree, and the name table. Furthermore, if I use, for example, the viral database, I'm able to generate an index (not custom), but once I launch the run on CGC I get an empty output. If I generate the index locally (again, not custom), I get *.cf files, whereas the script on CGC requires tar files. Should I convert the .cf files to tar, or is there a way to generate index files in tar format?

khyox commented 4 years ago

Regarding your issues, you may find the Recentrifuge wiki useful, as it contains a section on how to build a custom database for Centrifuge step by step: https://github.com/khyox/recentrifuge/wiki/Centrifuge-nt
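As a rough illustration of the conversion-table step for a custom database, here is a minimal Python sketch. It assumes the table is a two-column, tab-separated file of sequence ID and NCBI taxonomy ID, and that you already know the taxid for each of your sequences. The function names, the sequence IDs, and the taxid mapping below are hypothetical and only serve the example.

```python
import io


def fasta_ids(handle):
    """Yield the sequence ID (first token after '>') of each FASTA record."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:  # skip malformed headers that carry no ID
                yield parts[0]


def write_conversion_table(fasta_handle, taxid_by_seqid, out_handle):
    """Write one 'seqid<TAB>taxid' line per FASTA record.

    Returns the list of sequence IDs that had no taxid in the mapping,
    so they can be fixed before the index is built.
    """
    missing = []
    for seq_id in fasta_ids(fasta_handle):
        taxid = taxid_by_seqid.get(seq_id)
        if taxid is None:
            missing.append(seq_id)
            continue
        out_handle.write(f"{seq_id}\t{taxid}\n")
    return missing


# Hypothetical input: three custom sequences, only two with a known taxid.
fasta = io.StringIO(">seqA my first genome\nACGT\n>seqB\nGGCC\n>seqC\nTTAA\n")
taxids = {"seqA": 10090, "seqB": 9606}

table = io.StringIO()
missing = write_conversion_table(fasta, taxids, table)
print(table.getvalue(), end="")
print("no taxid for:", missing)
```

The resulting file would then be given to `centrifuge-build` through `--conversion-table`, with the NCBI `nodes.dmp` and `names.dmp` files passed through `--taxonomy-tree` and `--name-table`. Any sequence reported as missing needs a taxid before the build.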

danicic7 commented 4 years ago

@BioRB If you are still having issues with building the index, or with getting the right outputs when running Centrifuge on CGC, feel free to contact the support team, and they will help you out.
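On the .cf versus tar question, one thing that could be tried is bundling the locally built index files into a single tar archive. The sketch below uses Python's standard `tarfile` module. It is an assumption, not something confirmed in this thread, that the CGC app accepts a flat, uncompressed tar of the `<prefix>.N.cf` files; the `pack_index` helper and the `viral` prefix are hypothetical, and the demo uses placeholder files instead of a real index.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_index(index_dir, prefix, tar_path):
    """Bundle every '<prefix>.*.cf' file in index_dir into an uncompressed tar.

    Files are stored without directory components, so the archive is flat.
    Returns the names of the files that were archived.
    """
    cf_files = sorted(Path(index_dir).glob(f"{prefix}.*.cf"))
    if not cf_files:
        raise FileNotFoundError(f"no {prefix}.*.cf files in {index_dir}")
    with tarfile.open(tar_path, "w") as tar:
        for cf in cf_files:
            tar.add(cf, arcname=cf.name)
    return [cf.name for cf in cf_files]


# Demo with placeholder files standing in for a real index.
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 2, 3):
        (Path(tmp) / f"viral.{i}.cf").write_bytes(b"placeholder")
    tar_file = Path(tmp) / "viral.tar"
    packed = pack_index(tmp, "viral", tar_file)
    with tarfile.open(tar_file) as tar:
        archived = sorted(tar.getnames())

print(archived)
```

If the app expects a different layout inside the archive (for example a subdirectory or a compressed tar), the `arcname` and the open mode would need to change accordingly; the app description or the support team should be able to confirm the expected format.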