BinPro / CONCOCT

Clustering cONtigs with COverage and ComposiTion

Question about Validation using single-copy core genes #210

Closed franciscozorrilla closed 5 years ago

franciscozorrilla commented 5 years ago

Hi,

It is not clear to me where the COGSDB_DIR path should point. The tutorial sets it to /proj/b2010008/nobackup/database/cog_le/, but it does not say what is in that location. Could you clarify this for me?

Also, I was wondering whether it is appropriate to use the files scg_cogs_min0.97_max1.03_unique_genera.txt and cdd_to_cog.tsv when running CONCOCT on a real dataset?

Thanks in advance!

alneberg commented 5 years ago

Hi @franciscozorrilla,

As you have already noticed in #198, the tutorial is not very well maintained ;). I would actually not use the COG evaluation at all. We currently recommend using CheckM, or possibly anvi'o, to evaluate the clusters. CheckM is really simple to use and is of higher quality than our COG-based SCG analysis.

But to answer your question: that directory would contain the rpsblast databases for COG as downloaded from the public source, which I believe is NCBI.
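
For readers following the CheckM recommendation above, here is a minimal sketch of how the evaluation step might be scripted. It assumes CheckM is installed and on the PATH, and that each CONCOCT cluster has already been written out as its own FASTA file. The helper name `build_checkm_command` and the directory names `fasta_bins` and `checkm_output` are hypothetical placeholders, not something taken from this thread or the tutorial.

```python
import shlex


def build_checkm_command(bins_dir, out_dir, extension="fa", threads=1):
    """Return the argument list for CheckM's `lineage_wf` workflow.

    bins_dir:  directory holding one FASTA file per bin (assumed layout)
    out_dir:   directory where CheckM should write its results
    extension: file extension of the bin FASTA files
    threads:   number of threads CheckM may use
    """
    return [
        "checkm", "lineage_wf",
        "-x", extension,
        "-t", str(threads),
        str(bins_dir),
        str(out_dir),
    ]


# Hypothetical directory names; adjust to your own layout.
cmd = build_checkm_command("fasta_bins", "checkm_output", threads=4)

# Print the command instead of running it, so it can be inspected first.
print(shlex.join(cmd))
```

To actually run the evaluation, the list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a single string avoids shell-quoting problems with unusual directory names.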

franciscozorrilla commented 5 years ago

@alneberg thanks again for the helpful information!