Closed anacarolsoares closed 1 year ago
Hi Ana,
Sorry for the slow response. I think this might be due to finding more than one match for a given template, but let's investigate. You are using contigs (not paired-end reads) I assume? Which flags did you use with KMA?
Vanessa
Hi!
Yes, we are using contigs.
I'm using the following command:
kma -i contigs.fasta -o outFasta -t_db databasePath -ca -1t1 -mem_mode -ef
Thanks for your reply. Ana Carolina.
Hi!
Okay, a few more questions: Could you also tell us the command you used with CCMetagen? Which database did you use, the NCBI nt?
Not sure if this is your case but the -1t1 flag may be tricky to use with contigs (unless you are using a ref. database of complete genomes): you are telling KMA to find only one match for that scaffold, but there might be multiple genes (and therefore multiple equally good matches) in the database.
Hi!
ccmetagen command:
CCMetagen.py -i inputFileFasta -o outFasta --depth_unit fr --map inputMapFasta --depth 1 --query_identity 80 -ef y
Yes. NCBI nt.
I see your point. We are going to test without the -1t1 flag.
Thanks for your reply.
Closing issue due to inactivity. Feel free to open it again if you need help.
Hi! I have noticed a discrepancy in the number of fragments (in this case scaffolds) classified as a certain species between the output files _scaffolds.ccmetagen.ccm.csv and _scaffolds.kmaout.frag.
In the CCMetagen file _scaffolds.ccmetagen.ccm.csv, the sum of the Depth column for rows classified as a certain species (e.g. Taenia solium) is 5. But I could find only 4 fragments in the KMA output _scaffolds.kmaout.frag.
I matched the two files on these key columns: [Closest_match] in _scaffolds.ccmetagen.ccm.csv and [template_name] in _scaffolds.kmaout.frag.
Is there any explanation for this?
Looking forward to your reply. Thanks.
ed_scaffolds.ccmetagen.ccm.csv
*_scaffolds.kmaout.frag.
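For anyone wanting to reproduce the comparison described above, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that the .frag file is tab-separated with the template name in the sixth column, and that the .ccm.csv file is comma-separated with a header row containing Closest_match and Depth. The function names are hypothetical. It compares per template rather than per species, which may help narrow down where the extra count comes from.

```python
import csv
from collections import Counter, defaultdict

# Assumption: the template name is the 6th tab-separated column of the .frag file.
FRAG_TEMPLATE_COL = 5


def count_fragments(frag_lines):
    """Count how many fragments were assigned to each template name."""
    counts = Counter()
    for line in frag_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > FRAG_TEMPLATE_COL:
            counts[fields[FRAG_TEMPLATE_COL]] += 1
    return counts


def sum_depth(ccm_lines):
    """Sum the Depth column for each Closest_match value."""
    totals = defaultdict(float)
    for row in csv.DictReader(ccm_lines):
        totals[row["Closest_match"]] += float(row["Depth"])
    return dict(totals)


def find_discrepancies(frag_lines, ccm_lines):
    """Return {template: (summed_depth, fragment_count)} where the two differ."""
    frag_counts = count_fragments(frag_lines)
    result = {}
    for name, depth in sum_depth(ccm_lines).items():
        n_frags = frag_counts.get(name, 0)
        if depth != n_frags:
            result[name] = (depth, n_frags)
    return result
```

To run it on real output, pass open file handles, for example `find_discrepancies(open(frag_path), open(ccm_path, newline=""))`, where both paths are placeholders for your own files. If the .frag file is gzip-compressed, open it with `gzip.open(frag_path, "rt")` instead.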