rajewsky-lab / mirdeep2

Discovering known and novel miRNAs from small RNA sequencing data
GNU General Public License v3.0

mirdeep2 aborts when having non-canonical chromosome identifiers in genome file #81

Closed biogerman closed 3 years ago

biogerman commented 3 years ago

Hi all,

I encountered a problem when using mirdeep2.

miRDeep2.pl \
  C2_trimmed_collapsed.fa \
  genome_nowhitespace.fa \
  C2_trimmed_reads_vs_refdb.arf \
  mature.fa \
  none \
  hairpin_ok.fa \
  -d \
  -z _C2_trimmed_collapsed
  #####################################                                                                                                                                                                              
  #                                   #                                                                                                                                                                              
  # miRDeep2.0.1.2                    #                                                                                                                                                                              
  #                                   #                                                                                                                                                                              
  # last change: 22/01/2019           #                                                                                                                                                                              
  #                                   #                                                                                                                                                                              
  #####################################                                                                                                                                                                              

  miRDeep2 started at 20:8:59                                                                                                                                                                                        

  #Starting miRDeep2                                                                                                                                                                                                 

Command error:                                                                                                                                                                                                       
  #Starting miRDeep2                                                                                                                                                                                                 
  /opt/nf-core/work/conda/nf-core-smrnaseq-1.1.0-f6c2597d5709083f0a5f98235a2d2650/bin/miRDeep2.pl C2_trimmed_collapsed.fa genome_nowhitespace.fa C2_trimmed_reads_vs_refdb.arf mature.fa none hairpin_ok.fa -d -z _C2_trimmed_collapsed                                                                                                                                                                                                   

  miRDeep2 started at 20:8:59                                                                                                                                                                                        

mkdir mirdeep_runs/run_09_08_2021_t_20_08_59_C2_trimmed_collapsed

The mapped reference id chr11_KI270721v1_random from file C2_trimmed_reads_vs_refdb.arf is not an id of the genome file genome_nowhitespace.fa

Work dir:
/opt/nf-core/work/c6/728ab7f1e40ee5edf344ba06d752f8

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

biogerman commented 3 years ago

To fix this issue, I modified miRDeep2.pl at line 375:

    ## get ids from arf file and compare them with ids from the genome file
    $tmps = `cut -f6 $file_reads_vs_genome|sort -u`;
    foreach my $s(split("\n",$tmps)){
        $s =~ s/_//g;
        if(not $genomeids{">$s"}){ die "The mapped reference id $s from file $file_reads_vs_genome is not an id of the genome file $file_genome\n\n";}
    }
Drmirdeep commented 3 years ago

It is unclear why this fixes your issue, since the error message given doesn’t fit. Moreover, if this were the issue, the fix would belong not at line 375 but somewhere upstream, where the arf file or genome file is parsed and the ids are collected.
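
One way to narrow down where the ids diverge is to compare the two files directly, outside of miRDeep2. The following is only a minimal sketch, not part of miRDeep2: the function name `missing_ids` is hypothetical, and it assumes that a genome id is the whole header line after `>` (plausible for a file with whitespace already removed, but not confirmed in this thread). It reads the reference id from column 6 of the arf file, the same column the check above takes with `cut -f6`.

```shell
# Sketch: print every reference id used in an arf file that has no
# matching header in the genome FASTA. "missing_ids" is a hypothetical
# helper name, not a miRDeep2 command.
# Assumption: the genome id is the full header line without the leading ">".
# Note: the NR == FNR idiom assumes the genome file is not empty.
missing_ids() {
    # $1 = arf file (tab-separated), $2 = genome FASTA
    awk -F'\t' '
        # First file (genome): remember each header line minus the ">".
        NR == FNR { if (/^>/) ids[substr($0, 2)] = 1; next }
        # Second file (arf): report each unknown column-6 id once.
        NF >= 6 && !($6 in ids) && !seen[$6]++ { print $6 }
    ' "$2" "$1"
}
```

Called as `missing_ids C2_trimmed_reads_vs_refdb.arf genome_nowhitespace.fa`, empty output would mean every mapped reference id has a matching header. Any id that is printed shows exactly how the arf spelling differs from the genome headers, which helps decide whether the mismatch comes from the mapping step or from the genome preprocessing.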