Open sarah872 opened 2 years ago
Hi Sarah,
Thanks for bringing this to our attention. It looks like an issue with running prodigal on an input fasta shorter than 20000 bp. Prodigal is normally used to predict coding sequences from genomes, and since snapt only called 3 ncRNAs in the previous steps, the input fasta for re-running prodigal is <20000 bp.
@ursky, what do you think? I think this issue hasn't arisen for us before because we usually get 100+ ncRNA calls, which yield 20000+ bp of sequence. Found a similar issue here: https://github.com/hyattpd/Prodigal/issues/51
Looks like a quick fix is to (a) run prodigal separately in anonymous mode or meta mode (-p option). If you want to complete the snapt pipeline with prodigal, then it looks like you can (b) edit prodigal itself by changing line 32:
Not sure I recommend this route though since you could break prodigal that way. Will see if I can update snapt to include -p option for prodigal and see if that solves it.
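For reference, option (a) could be sketched roughly like this in Python. This is only an illustration, not how snapt calls prodigal: the helper names `total_fasta_length` and `prodigal_command` and the output file names are made up here, and the 20000 bp cutoff is simply the figure mentioned above. It counts the bases in the fasta and only switches to `-p meta` when the input is too short for the default mode.

```python
# Assumption: 20000 bp is the minimum input length for Prodigal's default
# (single-genome) mode, as discussed in this thread.
MIN_SINGLE_MODE_BP = 20000


def total_fasta_length(fasta_path):
    """Sum the sequence lengths of all records in a FASTA file."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and header lines; count everything else as sequence.
            if line and not line.startswith(">"):
                total += len(line)
    return total


def prodigal_command(fasta_path, out_prefix):
    """Build a prodigal command, falling back to meta mode for short inputs."""
    if total_fasta_length(fasta_path) >= MIN_SINGLE_MODE_BP:
        mode = "single"
    else:
        mode = "meta"
    return [
        "prodigal",
        "-i", str(fasta_path),        # input fasta
        "-a", f"{out_prefix}.faa",    # protein translations (hypothetical file name)
        "-o", f"{out_prefix}.gbk",    # gene coordinates (hypothetical file name)
        "-p", mode,                   # "single" or "meta"
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command separately from running it keeps the mode choice easy to inspect before anything is executed.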
Hi, thanks for suggesting a quick fix. I modified the script to include the -p meta flag with prodigal.
However, there's another issue now: No antisense nc transcripts have been found, which leads to this:
------------------------------------------------------------------------------------------------------------------------
----- running DIAMOND blastx on /tmp/slurm-5401156/SNAPT_OUT/blastx_search/intergenic.fa -----
----- against the /tmp/slurm-5401156/nr database -----
------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
----- running DIAMOND blastx on /tmp/slurm-5401156/SNAPT_OUT/blastx_search/antisense.fa against -----
----- the /tmp/slurm-5401156/nr database -----
------------------------------------------------------------------------------------------------------------------------
Error: Error detecting input file format. First line seems to be blank.
************************************************************************************************************************
***** Failed DIAMOND Blastx search against the database. Exiting... *****
************************************************************************************************************************
The file antisense.fa was indeed empty. I suppose a simple if to check whether the file is empty or not before executing diamond would suffice? What do you think?
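Something along these lines is what I have in mind, written as a rough Python sketch rather than a patch to the actual snapt script. The function names `has_fasta_records` and `blastx_command_or_skip` are hypothetical, and I'm assuming the search is a plain `diamond blastx` call with query, database and output arguments.

```python
import os


def has_fasta_records(fasta_path):
    """Return True if the file exists and contains at least one FASTA header."""
    if not os.path.isfile(fasta_path) or os.path.getsize(fasta_path) == 0:
        return False
    with open(fasta_path) as handle:
        return any(line.startswith(">") for line in handle)


def blastx_command_or_skip(query, database, output):
    """Return a DIAMOND blastx command, or None when the query has no sequences."""
    if not has_fasta_records(query):
        # Nothing to search (e.g. no antisense transcripts were found), so skip
        # this search instead of letting DIAMOND fail on a blank first line.
        print(f"Skipping DIAMOND blastx: {query} has no sequences")
        return None
    return ["diamond", "blastx", "--query", query, "--db", database, "--out", output]
```

The caller would run the command only when it is not `None`, so an empty antisense.fa would no longer abort the pipeline. Any downstream step that expects the blastx output would still need to cope with it being absent.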
Hi, I'm running snapt on a closed bacterial genome. It's running fine until the CURATE NON_CODING TRANSCRIPTS step, for which prodigal complains about a sequence being too short. Do you have any idea how to troubleshoot? Thank you!