Open jolespin opened 2 years ago
Also, one more edit I've found:
--links LINKS Path to a link table generated with bamlinks.py. If suuplied paired reads will be used to refine bins (Recommended)
The script name should be binlinks.py, and "suuplied" is a typo for "supplied".
Thank you, that's a very good comment. Our institute is also stingy with the home folder, so I have a symlink for the ete folder. But you are right, this would make the installation easier.
I will have to look up how to provide a different folder for ete but I think I saw that option somewhere. Then it should be pretty easy to just distribute this database.
It's really easy, what I usually do is this:
DATABASE_TAXA="/usr/local/scratch/CORE/jespinoz/db/ncbi_taxonomy/v2021.08.03/taxa.sqlite"
...
parser.add_argument("-t","--database_taxa", type=str, required=False, default=DATABASE_TAXA, help = "taxa.sqlite [Default: {}]".format(DATABASE_TAXA))
...
ncbi = NCBITaxa(dbfile=opts.database_taxa)
It will also make everything more consistent too b/c if you run it a year from now then it might download a different taxadb.
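For reference, the snippet above can be fleshed out into a minimal, self-contained sketch. The default path below is hypothetical (it assumes taxa.sqlite is shipped inside the eukcc2_db_ver_1.1 folder, which is only the proposal in this thread), and build_parser / load_taxonomy are illustrative names, not part of EukCC. The only ete3 call used is NCBITaxa(dbfile=...), as in the snippet above.

```python
import argparse
import os

# Hypothetical default: assumes taxa.sqlite sits inside the EukCC database folder.
DEFAULT_TAXA_DB = os.path.join("eukcc2_db_ver_1.1", "taxa.sqlite")


def build_parser():
    """Build a parser exposing the taxonomy database path as an option."""
    parser = argparse.ArgumentParser(
        description="Sketch: configurable ete3 taxonomy database"
    )
    parser.add_argument(
        "-t", "--database_taxa",
        type=str,
        required=False,
        default=DEFAULT_TAXA_DB,
        help="Path to taxa.sqlite [Default: %(default)s]",
    )
    return parser


def load_taxonomy(dbfile):
    """Open an existing taxa.sqlite with ete3 instead of the home-directory copy."""
    # Fail fast on a missing file rather than risk an implicit download.
    if not os.path.isfile(dbfile):
        raise FileNotFoundError("taxonomy database not found: {}".format(dbfile))
    # Imported lazily so the parser itself works without ete3 installed.
    from ete3 import NCBITaxa
    return NCBITaxa(dbfile=dbfile)


if __name__ == "__main__":
    opts = build_parser().parse_args()
    ncbi = load_taxonomy(opts.database_taxa)
```

Pinning the path this way also means every run uses the same taxonomy snapshot, which is the consistency point made above.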
In the same vein, it might be useful to provide a --taxa_sqlite option or something for the ncbi_update submodule:
but make the default the taxadump.tar.gz file (the prospective one in the updated EukCC database).
Looking forward to integrating this into an essential pipeline at JCVI but this part is a limiting factor.
One more question: does EukCC save the MetaEuk genes, proteins, and GFF file?
Here's my command:
Here's my log:
This database should be in eukcc2_db_ver_1.1 and called directly from there instead of being downloaded to the home directory on the first run. My institute only allocates 5GB to our home directories, so mine is pretty much full already.