zyxue / ncbitax2lin

🐞 Convert NCBI taxonomy dump into lineages
MIT License

How to use your script? #8

Closed nicolereynolds1 closed 4 years ago

nicolereynolds1 commented 4 years ago

Hi, I am a PhD student with very little coding experience, but I have a large dataset of sequences with NCBI taxIDs for which I would like to get the lineage information. I have two problems:

  1. I created the environment as specified, but when I ran make, I got the following errors:

    /bin/sh: md5sum: command not found
    make[1]: *** [taxdump.tar.gz] Error 127
    make: *** [taxdump] Error 2

    Do you have any suggestions for how I can fix these errors?

  2. I am unsure how to use your code. I have a text file with just a list of the taxIDs, but I also have the output from the BLAST search with other info in it. I tried just Could you please provide an example, or offer me some suggestions of what I need to do?

Thank you!

zyxue commented 4 years ago

You need to install coreutils, which provides md5sum: https://github.com/coreutils/coreutils/blob/master/src/md5sum.c
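For context, md5sum prints the MD5 digest of a file. If installing coreutils is not possible, the same digest can be computed by hand. Below is a minimal Python sketch, assuming you only want to check a downloaded file such as the taxdump.tar.gz named in the error message; the function name `md5_of_file` is hypothetical and is not part of ncbitax2lin, and this does not replace the md5sum call in the Makefile.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large archives do not have to
    fit in memory. `md5_of_file` is a hypothetical helper name.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5_of_file("taxdump.tar.gz")` returns a hex string that can be compared by eye with the published checksum for that file.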

zyxue commented 4 years ago

Is https://gitlab.com/zyxue/ncbitax2lin-lineages/blob/master/lineages-2019-02-20.csv.gz good for your use case? It's pre-generated.
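One way to use the pre-generated file with a plain text list of taxIDs is sketched below. It is only a sketch under assumptions: the gzipped CSV is assumed to have a header row with an ID column named `tax_id` (check the header of the file you download and adjust `id_column` if it differs), the taxID file is assumed to hold one ID per line, and the helper names `load_taxids` and `lookup_lineages` are hypothetical.

```python
import csv
import gzip


def load_taxids(taxid_path):
    """Read one taxID per line from a text file, skipping blank lines."""
    with open(taxid_path) as handle:
        return {line.strip() for line in handle if line.strip()}


def lookup_lineages(lineage_csv_gz, taxids, id_column="tax_id"):
    """Return {taxID: row} for every requested taxID found in the CSV.

    `id_column` is an assumption about the header of the lineage file.
    Each row is a dict mapping column names to values.
    """
    wanted = {str(t) for t in taxids}
    found = {}
    # gzip.open in text mode lets csv read the compressed file directly.
    with gzip.open(lineage_csv_gz, "rt", newline="") as handle:
        for row in csv.DictReader(handle):
            if row[id_column] in wanted:
                found[row[id_column]] = row
    return found
```

For example, `lookup_lineages("lineages-2019-02-20.csv.gz", load_taxids("taxids.txt"))` would return the matching rows, where `taxids.txt` is a placeholder name for your own list. Any taxID missing from the result was not present in the lineage file, which can happen when IDs were added or merged after the file was generated.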