Closed mickey-spongebob closed 3 years ago
@mickey-spongebob:
Hi Kevin,
Could you try if you can reproduce this with the first 10 lines of collapsed.fa?
Best, Marcel
Hi Marcel,
Thanks for the reply. I've tried it with the first 10 and 20 lines of collapsed.fa, and it still isn't working :-( I get the exact same error as above.
Here are the first 10 lines of the 'collapsed.fa' file:
seq_0_x11922267
TGACTAGATCCACACTCATCC
seq_11922267_x10117130
AACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAAT
seq_22039397_x9348851
AATGGCACTGGTAGAATTCACGG
seq_31388248_x4446651
TGGAATGTAAAGAAGTATGTAG
seq_35834899_x2568266
CAACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAA
Could it be in how I'm pre-processing my reads? To summarise: we receive "*fq.gz" files from the small RNA sequencing. I first concatenate all read files (approx. 94 GB in total), then remove the adapters using AdapterRemoval (https://adapterremoval.readthedocs.io/en/stable/). I then remove an extra 4 base pairs that were used for barcoding, which gives the 'smallRNAs.fq' file I described above. After that, bowtie-build and the mapping run seemingly fine, and only the prediction fails :-(
Sorry for all the trouble and thank you once more for the response :-) Any more advice would be super helpful :-)
Best, kevin
At first look this seems fine to me, except for the missing > for the FASTA header lines, but my guess is these just got mixed up with markdown quotations?
What genome are you running this on?
If I can reproduce your genome.fa and collapsed_genome.arf, I could try running miRDeep2.pl collapsed.fa genome.fa collapsed_genome.arf none none none with collapsed.fa containing
>seq_0_x11922267
TGACTAGATCCACACTCATCC
>seq_11922267_x10117130
AACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAAT
>seq_22039397_x9348851
AATGGCACTGGTAGAATTCACGG
>seq_31388248_x4446651
TGGAATGTAAAGAAGTATGTAG
>seq_35834899_x2568266
CAACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAA
myself. Without reproducing your problem, I'll have a hard time helping any further.
Also, can you confirm that the tutorial included with miRDeep2 runs fine using your installation?
And could you pipe the first 10 lines through od -c and check if there are any unusual non-printable characters? For example, if you see \r\n line breaks instead of simple \n ones, dos2unix might fix your issue.
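The check suggested above can also be scripted. The following is a minimal sketch, not part of miRDeep2: the function name `find_suspicious_bytes` is hypothetical, and it assumes that a clean reads file contains only printable ASCII plus `\n` line breaks.

```python
def find_suspicious_bytes(data: bytes, max_lines: int = 10):
    """Return (line_number, column, byte_value) for each unexpected byte.

    Assumption: only printable ASCII (0x20..0x7e) is expected within a line,
    so a carriage return (0x0d) left over from "\r\n" line breaks is reported.
    """
    findings = []
    # Splitting on "\n" keeps any "\r" attached to the end of its line.
    for line_no, line in enumerate(data.split(b"\n")[:max_lines], start=1):
        for col, byte in enumerate(line, start=1):
            if not 0x20 <= byte <= 0x7E:
                findings.append((line_no, col, byte))
    return findings


def check_file(path: str, max_lines: int = 10):
    """Read a file in binary mode and scan its first lines."""
    with open(path, "rb") as handle:
        return find_suspicious_bytes(handle.read(), max_lines)
```

Calling `check_file("collapsed.fa")` returns an empty list when nothing unusual is found in the first 10 lines. A byte value of 13 in the results points to `\r\n` line breaks, the case where dos2unix might help. For a file as large as the one described here, it would be more sensible to read only the first few lines rather than the whole file.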
Looks like the things you posted are either not from a mirdeep2 installation from the GitHub repo, or you copy-pasted just some lines and left out some others.
However, I don't get any errors using the '5' reads from above.
If it's not a secret then please post the full screen output here including the command you are using to call miRDeep2.
Hi
At first look this seems fine to me, except for the missing > for the FASTA header lines, but my guess is these just got mixed up with markdown quotations?
Yup, the files actually do have the '>' so it is a mix up with the markdown :-)
What genome are you running this on? If I can reproduce your genome.fa and collapsed_genome.arf, I could try running miRDeep2.pl collapsed.fa genome.fa collapsed_genome.arf none none none with collapsed.fa containing
I'm running this on a Platynereis dumerilii genome for which we have recently assembled and are annotating :-)
>seq_0_x11922267
TGACTAGATCCACACTCATCC
>seq_11922267_x10117130
AACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAAT
>seq_22039397_x9348851
AATGGCACTGGTAGAATTCACGG
>seq_31388248_x4446651
TGGAATGTAAAGAAGTATGTAG
>seq_35834899_x2568266
CAACTCTGAGCGGTGGATCACTCGGCTCGTGCGTCGATGAAGAGCGCAGCCAGCTGCGAGAAGTGATGTGAA
myself. Without reproducing your problem, I'll have a hard time helping any further.
Also, can you confirm that the tutorial included with miRDeep2 runs fine using your installation? And could you pipe the first 10 lines through od -c and check if there are any unusual non-printable characters? For example, if you see \r\n line breaks instead of simple \n ones, dos2unix might fix your issue.
I just checked the installation using the tutorial dataset, and it indeed fails with "Error: problem with mature_ref_this_species.fa". I tried running 'sanity_check_mature_ref.pl mature_ref_this_species.fa', and it doesn't output an error, which is confusing.
So far I've tried two installations: 1. the Conda version, which I quickly abandoned due to several errors, and 2. the one installed on our local cluster, which I assumed worked well, but perhaps I should ask them to re-install the software.
Maybe I'll get them to re-install before I bother you again on this issue so I'll close it for now and re-open it if the issue persists :-)
Thank you for your patience and I'll let you know how it went!
Best, kevin
Looks like the things you posted are either not from a mirdeep2 installation from the GitHub repo, or you copy-pasted just some lines and left out some others.
However, I don't get any errors using the '5' reads from above.
If it's not a secret then please post the full screen output here including the command you are using to call miRDeep2.
Not a secret, here is the output, and the exact commands are as follows:
module load miRDeep2/0.1.3-foss-2019b-Python-3.7.4
bowtie-build pdumv2.fa pdumv2
mapper.pl pdum_trim_clip_smallRNAs.fq -e -h -i -j -m -l 18 -p pdumv2 \
  -s pdum_all_filt_collapsed.fa -t pdum_all_collapsed_genome.arf -v
miRDeep2.pl pdum_all_filt_collapsed.fa pdumv2.fa pdum_all_collapsed_genome.arf \
  none cte.fas none 2>report.log
Here is the report.log output:
/g/easybuild/x86_64/CentOS/7/rome/software/miRDeep2/0.1.3-foss-2019b-Python-3.7.4/bin/miRDeep2.pl pdum_all_filt_collapsed.fa pdumv2.fa pdum_all_collapsed_genome.arf none cte.fas none
miRDeep2 started at 14:13:28
mkdir mirdeep_runs/run_23_11_2021_t_14_13_28
started: 14:14:17 sanity_check_mature_ref.pl cte.fas
ESC[1;31mError: ESC[0mproblem with cte.fas
But as I mentioned above, I will re-install myself and then get back to the thread, should the issue persist :-)
If it's ok with everyone, I shall close this thread and keep you updated :-)
Thank you for the help :-)
Hi all,
So I just confirmed that neither of the two installations worked: not the Conda one, and not the one installed on the cluster (which I can't comment on). However, an installation on a local computer is working just fine (it also works on a virtual machine with enough RAM and storage).
Sorry for the trouble and thank you @mschilli87 and @Drmirdeep for the patience and help :-)
Re-closing again :-)
Best, kevin
Hi 👋 ,
Hope this finds you well! I am writing to ask for help with my current miRDeep2 analysis. The installation seems to work fine: all commands run as expected, including the initial tests, and all the required software is installed correctly. When I run the following scripts:
1. build the index
bowtie-build genome.fa genome
2. process and map reads to reference
mapper.pl smallRNAs.fq -e -h -m -l 18 -p genome \
  -s collapsed.fa -t collapsed_genome.arf -v -o 100
3. predict miRNA
miRDeep2.pl collapsed.fa genome.fa collapsed_genome.arf \
  none none none 2>report.log
Steps 1 and 2 work fine, with no errors and all the output files that one should get exist. However, step 3 gives an error:
started: 16:40:39 ESC[1;31mError: ESC[0mproblem with collapsed.fa
Strangely, when I do check the file using 'sanity_check_reads_ready_file.pl collapsed.fa', it doesn't give an error.
Any ideas on what I could be doing wrong? I apologise if this is redundant or something silly, but I can't seem to figure it out :-(
Any advice would be kindly appreciated and thank you for the really nice tool!
Best, kevin
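For readers who hit the same "problem with collapsed.fa" message, an independent look at the file layout can help rule out formatting before suspecting the installation. This is only a rough sketch and not a re-implementation of sanity_check_reads_ready_file.pl. The header pattern is an assumption inferred from the reads posted in this thread (for example `>seq_0_x11922267`), and the function name `check_collapsed_fasta` is hypothetical.

```python
import re

# Assumed ID layout, inferred from the reads in this thread:
# ">" + three-character prefix + "_" + running index + "_x" + read count.
HEADER = re.compile(r">\S{3}_\d+_x\d+")
# Assumed alphabet for the sequence lines.
SEQUENCE = re.compile(r"[ACGTNacgtn]+")


def check_collapsed_fasta(lines):
    """Return a list of (line_number, message) problems; empty if none found.

    Assumes strictly alternating header and sequence lines (no wrapped
    sequences). A trailing "\r" is left in place on purpose so that "\r\n"
    line breaks show up as problems.
    """
    problems = []
    expect_header = True
    line_no = 0
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if expect_header and not HEADER.fullmatch(line):
            problems.append((line_no, f"unexpected header: {line!r}"))
        elif not expect_header and not SEQUENCE.fullmatch(line):
            problems.append((line_no, f"unexpected sequence: {line!r}"))
        expect_header = not expect_header
    if line_no and not expect_header:
        problems.append((line_no, "file ends with a header but no sequence"))
    return problems
```

Passing an open file works directly, for example `check_collapsed_fasta(open("collapsed.fa"))`. An empty result only means the file matches the layout assumed here; the script shipped with miRDeep2 remains the reference check.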