rajewsky-lab / mirdeep2

Discovering known and novel miRNAs from small RNA sequencing data
GNU General Public License v3.0

Mapping file name.arf is not in arf format #109

Closed Chenmi0820 closed 1 year ago

Chenmi0820 commented 1 year ago

Dear all,

I got the following error when running miRDeep2.pl. I installed miRDeep2 using Option 1.

miRDeep2 started at 10:58:50

mkdir mirdeep_runs/run_26_01_2023_t_10_58_50

readline() on closed filehandle IN at /depot/leizhang/data/chenmi/software/mirdeep2/bin/miRDeep2.pl line 363.

Error: Mapping file J2_1_collapsed_vs_genome.arf is not in arf format

Each line of the mapping file must consist of the following fields
readID_wo_whitespaces length start end read_sequence genomicID_wo_whitspaces length start end genomic_sequence strand #mismatches editstring
The editstring is optional and must not be contained
The readID must end with _xNumber and is not allowed to contain whitespaces.
The genomeID is not allowed to contain whitespaces.

The command for generating the .arf file is:

mapper.pl /depot/leizhang/data/chenmi/J2_1.fa -c -j -l 18 -m -p /depot/leizhang/data/chenmi/meloidogyne_incognita.PRJEB8714.WBPS17.genomic -s J2_1_collapsed.fa -t J2_1_collapsed_vs_genome.arf

Below is the head of the .arf file:

seq_0_x390149 23 1 23 aacccgtagatccgaactagtct FXSY01000458.1 23 42582 42604 aacccgtagatccgaactagtct - 0 mmmmmmmmmmmmmmmmmmmmmmm
seq_0_x390149 23 1 23 aacccgtagatccgaactagtct FXSY01010669.1 23 1148 1170 aacccgtagatccgaactagtct - 0 mmmmmmmmmmmmmmmmmmmmmmm
seq_0_x390149 23 1 23 aacccgtagatccgaactagtct FXSY01000980.1 23 22674 22696 aacccgtagatccgaactagtct + 0 mmmmmmmmmmmmmmmmmmmmmmm
seq_390149_x156044 29 1 29 ctgcccagttacaactacttgaccgtcgc FXSY01000400.1 29 63239 63267 ctgcccagttacaactacttgaccgtcgc + 0 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
seq_390149_x156044 29 1 29 ctgcccagttacaactacttgaccgtcgc FXSY01000973.1 29 5720 5748 ctgcccagttacaactacttgaccgtcgc + 0 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
seq_546193_x134559 29 1 29 ctacccagttgcatctacttgaccgtcgc FXSY01000973.1 29 5173 5201 ctacccagttgcatctacttgaccgtcgc + 0 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
seq_546193_x134559 29 1 29 ctacccagttgcatctacttgaccgtcgc FXSY01000400.1 29 61918 61946 ctacccagttgcatctacttgaccgtcgc + 0 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
seq_546193_x134559 29 1 29 ctacccagttgcatctacttgaccgtcgc FXSY01001495.1 29 3585 3613 ctacccagttgcatctacttgaccgtcgc - 0 mmmmmmmmmmmmmmmmmmmmmmmmmmmmm
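The field rules quoted in the error message can be checked mechanically. The following is a minimal Python sketch, not miRDeep2's own validation code: the function name `arf_line_problems` is hypothetical, and the checks cover only what the message states (12 or 13 fields, a read ID ending in `_x<number>`, numeric length/start/end/mismatch fields), plus one assumption of mine that the strand is `+` or `-`. It splits on any whitespace, so an ID containing whitespace shows up as a wrong field count.

```python
import re

# Read IDs must contain no whitespace and end with _x<number> (per the error message).
READ_ID = re.compile(r"^\S+_x\d+$")

def arf_line_problems(line):
    """Return a list of rule violations for one ARF line (empty if none found)."""
    fields = line.split()  # any whitespace; stray spaces inside IDs change the count
    if len(fields) not in (12, 13):  # the editstring (13th field) is optional
        return [f"expected 12 or 13 fields, found {len(fields)}"]
    problems = []
    if not READ_ID.match(fields[0]):
        problems.append("read ID does not end with _x<number>")
    # length/start/end for read and genome, then the mismatch count
    for idx in (1, 2, 3, 6, 7, 8, 11):
        if not fields[idx].isdigit():
            problems.append(f"field {idx + 1} is not a non-negative integer")
    if fields[10] not in ("+", "-"):  # assumption: strand is '+' or '-'
        problems.append("strand is not '+' or '-'")
    return problems

# First line of the .arf head pasted above.
sample = (
    "seq_0_x390149 23 1 23 aacccgtagatccgaactagtct FXSY01000458.1 23 "
    "42582 42604 aacccgtagatccgaactagtct - 0 mmmmmmmmmmmmmmmmmmmmmmm"
)
print(arf_line_problems(sample))  # prints []
```

The first pasted line passes these checks, so a sketch like this is mainly useful for scanning the whole file for an odd line further down.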

Could you please help me check what is wrong?

Chenmi0820 commented 1 year ago

The miRDeep2.pl command I used is below:

miRDeep2.pl /depot/leizhang/data/chenmi/J2_1_collapsed.fa /depot/leizhang/data/chenmi/meloidogyne_incognita.PRJEB8714.WBPS17.genomic.fa J2_1_collapsed_vs_genome.arf none none none 2>report.log

Drmirdeep commented 1 year ago

If the mapper module didn’t finish then the arf file will be incomplete

Chenmi0820 commented 1 year ago

> If the mapper module didn’t finish then the arf file will be incomplete

I am sure the mapper module finished running. I also checked the head and tail of the arf file. Is there any other possible reason for this issue?

Chenmi0820 commented 1 year ago

I finally solved this issue. The key is to use the fastq2fasta tool bundled with miRDeep2 (../mirdeep2/bin/fastq2fasta) instead of another FASTQ-to-FASTA converter; after that, all the results are generated successfully.

mschilli87 commented 1 year ago

Thx for following up.

Drmirdeep commented 1 year ago

What do you mean by other fastq2fasta tools? Did you not create the arf file with mapper.pl?

Chenmi0820 commented 1 year ago

> What do you mean by other fastq2fasta tools? Did you not create the arf file with mapper.pl?

When I obtained my sRNA-seq data, I used another fastq2fasta tool, which I had downloaded myself, to convert the FASTQ files into FASTA format, and then used those FASTA files in the mapper.pl step to create the .arf files. However, the miRDeep2.pl step then failed with the ".arf is not in arf format" error. This bothered me for a long time. Finally, I found that when I used the built-in fastq2fasta tool located at ../mirdeep2/bin/fastq2fasta, I could successfully generate the results.

If others face the same ".arf is not in arf format" issue, you may want to try the built-in fastq2fasta tool. Thank you!
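One way to compare the output of two FASTQ-to-FASTA converters before running mapper.pl is to inspect their header lines. The sketch below flags FASTA headers that contain whitespace, which the error message quoted earlier says read IDs must not contain. Whether this was the actual cause in this thread is not established. The function name `headers_with_whitespace` is hypothetical and the example records are made up for illustration.

```python
def headers_with_whitespace(fasta_lines):
    """Yield (line_number, header) for FASTA headers that contain whitespace."""
    for number, line in enumerate(fasta_lines, start=1):
        if line.startswith(">"):
            header = line[1:].rstrip("\r\n")  # drop '>' and the line ending
            if any(ch.isspace() for ch in header):
                yield number, header

# Made-up records: one plain ID and one header with extra annotation.
example = [">seq_0\n", "ACGT\n", ">SRR123.1 1 length=23\n", "ACGT\n"]
print(list(headers_with_whitespace(example)))
# prints [(3, 'SRR123.1 1 length=23')]
```

To scan a real file, pass an open file handle instead of the list, for example `headers_with_whitespace(open("reads.fa"))`, where `reads.fa` is a placeholder name.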