EBI-Metagenomics / EukCC

Tool to estimate genome quality of microbial eukaryotes
GNU General Public License v3.0

Clarifications for EukCC outputs and contamination identification #42

Open Edouard94 opened 1 year ago

Edouard94 commented 1 year ago

Dear @openpaul and EukCC team,

I am contacting you because I would like more information on the EukCC outputs, and to find out how I can recover which contigs are classified as contaminants.

I successfully ran EukCC on my assembled protist genome (Illumina sequencing, SPAdes assembly) with the command: sudo docker run -v /home/edouard/eukcc/:/eukcc/ quay.io/microbiome-informatics/eukcc single --out /eukcc --threads 8 --extra --db /eukcc/eukccdb/eukcc2_db_ver_1.2/ /eukcc/trimmed-1kb_contigs.fasta

EukCC ran fine and produced two outputs: the eukcc.csv file and scmg_marker_table. Here is the result from the CSV file: (image)

My genome has a completeness of 88.54% and a contamination of 5.1%, and I would like to find out which contigs account for that 5.1% contamination. Can I do that with the scmg_marker_table, or is there a more straightforward way?

Thank you for your insights on this.

Best wishes, Edouard