Open linsalrob opened 7 years ago
Rob,
Yes you can run the database build on pre-downloaded files. Have a look in the k-SLAM install script to see what it does with the downloaded files.
You should be able to create a database directory with the correct directory structure. Extract all the gz files. Put names.dmp and nodes.dmp in a folder called "taxonomy", the gbff files for bacteria in a folder called "bacteria", and the virus gbff files in a folder called "viruses".
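The layout described above could be set up roughly like this. It is only a sketch: the function names are made up for illustration, and any source paths you pass in are whatever your existing downloads happen to be. The gz files are decompressed into the new folders, so the compressed originals stay where they are.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of k-SLAM.

# make_slam_layout DB_DIR
# Creates the three folders the database build expects.
make_slam_layout() {
    db_dir=$1
    mkdir -p "$db_dir/taxonomy" "$db_dir/bacteria" "$db_dir/viruses"
}

# extract_gz_into DEST_DIR FILE.gz...
# Decompresses each .gz file into DEST_DIR and leaves the original untouched.
extract_gz_into() {
    dest=$1
    shift
    for gz in "$@"; do
        gunzip -c "$gz" > "$dest/$(basename "$gz" .gz)"
    done
}
```

For example, make_slam_layout slamdb followed by extract_gz_into slamdb/bacteria with your own bacterial .gbff.gz files as arguments. The names.dmp and nodes.dmp files still need to be copied or extracted into the taxonomy folder.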
Once in the database directory, run the following.
To make the taxonomy database:
(path to SLAM executable) --parse-taxonomy taxonomy/names.dmp taxonomy/nodes.dmp --output-file taxDB
To make the genome database:
(path to SLAM executable) --output-file database --parse-genbank bacteria/*.gbff viruses/*.gbff
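Put together, the two build steps might be wrapped like this. This is a sketch rather than anything shipped with k-SLAM: build_slam_db is a hypothetical helper, the flags are the ones quoted in this reply, and it assumes the genome step is meant to take every .gbff file in the bacteria and viruses folders. Pass an absolute path to the SLAM executable, since the function changes into the database directory first.

```shell
#!/bin/sh
# Sketch only: build_slam_db is a hypothetical wrapper, not part of k-SLAM.

# build_slam_db SLAM_EXE DB_DIR
# SLAM_EXE should be an absolute path to the SLAM executable.
# DB_DIR must already contain taxonomy/, bacteria/ and viruses/.
build_slam_db() {
    slam=$1
    db_dir=$2
    (
        # Run in a subshell so the caller's working directory is unchanged.
        cd "$db_dir" || exit 1
        # Taxonomy database, written to taxDB.
        "$slam" --parse-taxonomy taxonomy/names.dmp taxonomy/nodes.dmp --output-file taxDB || exit 1
        # Genome database, assuming all .gbff files in both folders are wanted.
        "$slam" --output-file database --parse-genbank bacteria/*.gbff viruses/*.gbff
    )
}
```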
On 21/06/17 18:48, Rob Edwards wrote:
Hi folks
For k-SLAM installation I need to download the NCBI taxonomy, bacterial, and viral genomes (i.e. using install_slam.sh). If I already have these downloaded somewhere, can I just point to the appropriate locations rather than duplicating all the data again?
Thanks
Rob
Can this be done off the .fna files? Or does the script need the .gbff files?
Also...
Any idea what may be causing this error when trying to build the database from pre-downloaded files as described above?
/home/src/k-SLAM/SLAM: error while loading shared libraries: libboost_program_options.so.1.53.0: cannot open shared object file: No such file or directory
I can't find much about this dependency library. Any help appreciated.
The gbff files are needed because they contain taxonomy information.
That dependency is Boost, which should be fairly easy to install on any Linux machine.
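One way to see exactly which shared libraries the SLAM binary cannot find is to filter the output of ldd. The sketch below is only illustrative: list_missing_libs is a made-up helper name, and the package names in the comments are assumptions about common distributions. Note that the error names a specific version (1.53.0), so a package that installs a different Boost version may not satisfy this binary, and rebuilding k-SLAM against the installed Boost may be needed instead.

```shell
#!/bin/sh
# Sketch only: list_missing_libs is a hypothetical helper.

# list_missing_libs
# Reads ldd output on stdin and prints the name of each unresolved library.
list_missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage (path taken from the error message above):
#   ldd /home/src/k-SLAM/SLAM | list_missing_libs
#
# Possible installs, depending on distribution (assumptions, check your own):
#   Debian/Ubuntu: sudo apt-get install libboost-program-options-dev
#   CentOS/RHEL:   sudo yum install boost-program-options
```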