Currently, species detection can miss major contamination because it only reports the top five hits for the read set. If at least five of those hits are to strains of the same species, a contaminating species may not appear at all.
We should use 'mash screen', probably via 'refseq_masher contains', to identify the species present in the read set and flag possible contamination. This will involve rewriting parts of organism_detection.py.
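As a rough sketch of the flagging step only: the idea is to collapse per-strain hits to one entry per species before deciding what to report, so that many strains of one species cannot crowd out a second species. The function names, the `species` and `identity` keys, and the 0.9 threshold below are all assumptions made for illustration. They are not taken from organism_detection.py, and the real 'refseq_masher contains' output would need to be mapped onto this shape.

```python
def summarize_species(hits, min_identity=0.9):
    """Collapse per-strain hits to one entry per species.

    `hits` is an iterable of dicts with hypothetical keys "species" and
    "identity". Hits below `min_identity` (an arbitrary placeholder
    threshold) are ignored. Returns (species, best_identity) pairs sorted
    by identity, highest first.
    """
    best = {}
    for hit in hits:
        species = hit["species"]
        identity = float(hit["identity"])
        if identity < min_identity:
            continue
        # Keep only the best-scoring strain for each species.
        if species not in best or identity > best[species]:
            best[species] = identity
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


def flag_contamination(hits, min_identity=0.9):
    """Return (is_contaminated, species_summary).

    A read set is flagged when more than one species passes the threshold.
    """
    species = summarize_species(hits, min_identity)
    return len(species) > 1, species


# Five strain hits for one species plus a single hit for another:
# a plain "top five hits" view would show only the first species.
example_hits = [
    {"species": "Escherichia coli", "identity": 0.999},
    {"species": "Escherichia coli", "identity": 0.998},
    {"species": "Escherichia coli", "identity": 0.997},
    {"species": "Escherichia coli", "identity": 0.996},
    {"species": "Escherichia coli", "identity": 0.995},
    {"species": "Salmonella enterica", "identity": 0.97},
]
contaminated, species_summary = flag_contamination(example_hits)
```

Closely related species may both pass a single identity cutoff, so the threshold would likely need tuning against real read sets before this is used to flag samples.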
Consider leveraging: