Closed jolespin closed 4 years ago
I don't think this is the right tool if you want to scan reads directly.
I think --reject means it won't be in the GFF file output, and --lencutoff is to still keep it but label it as partial. You would need to bring those down dramatically, yes. 16S is ~1542 bp long, so 0.1 might work. Or just set them to zero (0).
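To see why 0.1 is a plausible value for 200 bp reads, here is a rough sketch of the arithmetic in Python. It assumes both thresholds are compared against the fraction of the full-length gene that a hit covers, which is how the two options are described above. The function names and the input file reads.fasta are made up for illustration; the command string only uses barrnap's --reject, --lencutoff and --evalue options and is not executed.

```python
import shlex

GENE_LEN_16S = 1542  # approximate 16S length quoted above, in bp


def max_covered_fraction(read_len, gene_len=GENE_LEN_16S):
    """Largest fraction of the gene that one read of read_len bp can span."""
    return min(1.0, read_len / gene_len)


def barrnap_command(fasta, reject, lencutoff, evalue=None):
    """Build a barrnap command line as a string (hypothetical helper, runs nothing)."""
    args = ["barrnap", "--reject", str(reject), "--lencutoff", str(lencutoff)]
    if evalue is not None:
        args += ["--evalue", str(evalue)]
    args.append(fasta)
    return " ".join(shlex.quote(a) for a in args)


frac = max_covered_fraction(200)
print(f"A 200 bp read spans at most {frac:.2f} of a 16S gene")
print(barrnap_command("reads.fasta", reject=0.1, lencutoff=0.1))
```

Under that assumption, a 200 bp read can cover at most about 0.13 of a 16S gene, so any threshold above that would exclude every 16S hit from reads of this length, while 0.1 would just let a fully aligned read through.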
You could just take the HMM model files from the db folder and run hmmer yourself?
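A minimal sketch of what running hmmer directly could look like, in Python. It assumes the bacterial models sit at db/bac.hmm inside the barrnap checkout and that the reads are in a hypothetical reads.fasta; both paths, and the helper names, are illustrative. The parser expects the column layout that nhmmer writes with --tblout (target, accession, query, accession, hmm from/to, ali from/to, env from/to, sequence length, strand, E-value, score, bias, description).

```python
def nhmmer_args(hmm_db, fasta, tblout, evalue=1e-6, cpus=1):
    """Argument list for an nhmmer search of `fasta` against the models in `hmm_db`."""
    return [
        "nhmmer",
        "--cpu", str(cpus),
        "-E", str(evalue),
        "--tblout", tblout,   # per-hit table, one line per hit
        "-o", "/dev/null",    # discard the verbose alignment output
        hmm_db,
        fasta,
    ]


def parse_tblout(lines):
    """Yield one dict per hit from nhmmer --tblout lines, skipping comments."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split(None, 15)
        yield {
            "read": f[0],
            "model": f[2],
            "hmm_from": int(f[4]),
            "hmm_to": int(f[5]),
            "ali_from": int(f[6]),
            "ali_to": int(f[7]),
            "strand": f[11],
            "evalue": float(f[12]),
        }


# Example use (requires HMMER to be installed):
#   import subprocess
#   subprocess.run(nhmmer_args("db/bac.hmm", "reads.fasta", "hits.tbl"), check=True)
#   with open("hits.tbl") as fh:
#       hits = list(parse_tblout(fh))
```

Working from the raw table sidesteps the length-based filtering entirely, so short partial hits from reads are kept and can be filtered on E-value or model coordinates afterwards.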
Maybe even bwa/minimap2 first to bait all the reads against a DB of rRNA genes, then assemble those, or scan those directly.
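As a rough illustration of the baiting step, the Python sketch below picks out the reads that aligned to an rRNA reference from SAM output. It assumes the reads were already mapped with something like `minimap2 -ax sr rrna_db.fasta reads.fasta > hits.sam`, where rrna_db.fasta is a hypothetical FASTA of rRNA genes; the function name is also made up. A read counts as baited when the "unmapped" bit (0x4) of its SAM flag is not set.

```python
def mapped_read_names(sam_lines):
    """Return the set of read names with at least one mapped alignment record."""
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            names.add(fields[0])
    return names


# Example use:
#   with open("hits.sam") as fh:
#       baited = mapped_read_names(fh)
```

The resulting names can then be used to subset the original FASTA before assembling the baited reads or scanning them directly.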
Thanks for the suggestions here! I realized that I was initially supposed to identify ribosomal proteins instead of rRNA. Had some trouble using phylosift with some weird dependency issues, so I thought this would have been a good alternative, but they do two different things. In the future, I'll definitely continue to use barrnap for identifying rRNA sequences.
Ah ... yes there are 20-30 ribosomal proteins. They are quite conserved and easy to find in assemblies. Good luck.
@tseemann what would you recommend as "relaxed" and "strict" settings in pulling out rRNA from metagenome-assembled genomes (MAG)?
I've quality-trimmed my HiSeq reads and converted them to FASTA. I want to run these through barrnap, but I'm not getting any hits with default settings. I was wondering how I could adjust these parameters to properly utilize barrnap while casting a wide net.
Is lencutoff the proportion of the target rRNA gene that is covered by the query sequence? If so, should I drop this down to something like 0.01?
I'm confused about how reject is different from lencutoff. How would you adjust this for properly incorporating reads?
For evalue, I was going to relax it to 0.1 to cast a wide net. Do you think this is too permissive?
My sequences are around 200 bp long.