steineggerlab / conterminator

Detection of incorrectly labeled sequences across kingdoms
GNU General Public License v3.0

corrected contig length in {RESULT_PREFIX}_conterm_prediction file #13

Open casolp opened 3 years ago

casolp commented 3 years ago

Hi, I am searching for contaminant sequences in a genome assembly (using the NT database) and am a bit confused about the values in the "Corrected contig length" column of the {RESULT_PREFIX}_conterm_prediction output file. I was expecting every value in this column to be below 20 kb, but some are above 20 kb. Example below:

125736 LC484010.1 2 Mus musculus 13409 13766 33867 CP056483.1 0 Klebsiella sp. RHBSTW-00464 6331260 1934
125736 JN947498.1 2 Mus musculus 18036 18393 38918 CP056483.1 0 Klebsiella sp. RHBSTW-00464 6331260 1934

Am I understanding the output correctly? Thanks!
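
For reference, this is roughly how the rows above 20 kb can be picked out of the file. It is only a minimal sketch and rests on two assumptions that are mine, not taken from the tool's documentation: that the file is tab-separated, and that the seventh column is the "Corrected contig length" (33867 and 38918 in the rows above). The function name `rows_over_threshold` and the constants are hypothetical.

```python
import csv
import io

# Assumptions (not confirmed here): the prediction file is tab-separated and
# the 7th column (zero-based index 6) holds the corrected contig length.
LENGTH_COLUMN = 6
THRESHOLD = 20_000  # 20 kb


def rows_over_threshold(handle, threshold=THRESHOLD, column=LENGTH_COLUMN):
    """Return (contaminated_id, length) for rows whose length exceeds threshold."""
    hits = []
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) <= column:
            continue  # skip short or empty lines
        try:
            length = int(fields[column])
        except ValueError:
            continue  # skip rows where the column is not an integer
        if length > threshold:
            hits.append((fields[1], length))
    return hits


# The two example rows from this post, re-joined with tabs for illustration.
SAMPLE = (
    "125736\tLC484010.1\t2\tMus musculus\t13409\t13766\t33867\t"
    "CP056483.1\t0\tKlebsiella sp. RHBSTW-00464\t6331260\t1934\n"
    "125736\tJN947498.1\t2\tMus musculus\t18036\t18393\t38918\t"
    "CP056483.1\t0\tKlebsiella sp. RHBSTW-00464\t6331260\t1934\n"
)

hits = rows_over_threshold(io.StringIO(SAMPLE))
for seq_id, length in hits:
    print(seq_id, length)
```

To run it on a real file, an open file handle can be passed instead of the `io.StringIO` sample, for example `rows_over_threshold(open(path, newline=""))`, where `path` points at the prediction file.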