Open dportik opened 3 years ago
@AlexanderDilthey
I am bumping this, as a prebuilt database for RefSeq would be excellent if available.
I encountered a similar issue and have been unsuccessful so far in generating a RefSeq database. A prebuilt RefSeq database would be very helpful to many users of MetaMaps.
Hello, I have been trying to create a database based on RefSeq archaea, bacteria, and fungi using the command below:
I have encountered two issues so far. The first error occurred sporadically while unpacking `taxdump.tar.gz`. The error contained `tar: Unexpected EOF in archive`. I am running this on HPC and suspected it was due to file latency. I solved it by adding `sleep (20);` to line 78 of `downloadRefSeq.pl`, which may be helpful to others.
The second issue is more problematic and occurs when archaea finishes and bacteria begins:
Note that my script has an additional line inserted, so this corresponds to line 177 of the original script. I am unsure what this is related to, and cannot figure out a solution.
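For the first issue (the sporadic `tar: Unexpected EOF in archive`), a fixed sleep only works if the delay happens to be long enough. A less timing-dependent option is to check that the archive reads cleanly before unpacking it, and to retry if it does not. The sketch below is a standalone Python illustration, not a patch to `downloadRefSeq.pl`; it assumes the error comes from the archive being read before it is fully visible on the shared filesystem, and the function names `archive_is_complete` and `wait_for_archive` are hypothetical.

```python
import tarfile
import time


def archive_is_complete(path):
    """Return True if every member of the tar archive can be read to its end.

    For a gzip-compressed archive such as taxdump.tar.gz, a truncated file
    fails here because the gzip stream ends before its end-of-stream marker.
    """
    try:
        with tarfile.open(path, "r:*") as tar:
            for member in tar:
                if member.isfile():
                    handle = tar.extractfile(member)
                    # Read in 1 MiB chunks so large members are not held in memory.
                    while handle.read(1024 * 1024):
                        pass
        return True
    except (tarfile.TarError, EOFError, OSError):
        return False


def wait_for_archive(path, attempts=5, delay=20.0):
    """Poll until the archive reads cleanly; sleep `delay` seconds between tries.

    Returns False if the archive is still unreadable after `attempts` checks.
    """
    for attempt in range(attempts):
        if archive_is_complete(path):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_archive("taxdump.tar.gz")` before the unpacking step would return `True` as soon as the file is readable, instead of always waiting the full 20 seconds, and `False` if the download itself is truncated and needs to be repeated.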
More importantly, I am not confident that, even if this problem is solved, I can successfully build this database through the other required steps. There appear to be other unsolved database issues that have been posted here by other users (particularly #49). As the database-building steps are time-consuming, I am reluctant to continue the effort unless the process is more robust.
Is it possible to host a prebuilt MetaMaps database for RefSeq archaea, bacteria, and fungi? I imagine this is the database most of your users will be interested in using for their analyses. Using the mini database is not particularly helpful, as most other taxonomic profilers offer access to very large databases (NCBI nt/nr, multiple RefSeq branches). This could solve at least some of the ongoing database issues, and would be a valuable resource for your user base.