Dfam-consortium / RepeatModeler

De-Novo Repeat Discovery Tool

recoveryDir issue #223

Open rhysf opened 7 months ago


Hi, I'm using RepeatModeler 2.0.3. One of my long runs ended with the following (slightly redacted) output:

RepeatModeler Round # 6

Searching for Repeats -- Sampling from the database...

I saw that that file was present but empty. I therefore deleted that directory and attempted to re-run with the following parameter:

-recoverDir ./RM_74892.WedNov82013162023/

This failed with the message "Oops...the ./RM_74892.WedNov82013162023/ run did not get passed round-1.".

However, when I look in each of my round-* directories, I see a consensi.fa, and it is empty only in round-2.

I looked in the code and I see a potential issue (provided I understand what the code is trying to do): it counts from 1 to 100, and when it finds a round directory without the file, or with a zero-size file, it ends the loop, so it never identifies later directories that do have a valid one. Assuming the largest round number is the one to work from, I found that the following code finds it (while otherwise retaining the previous logic):

my @list_of_dirs = `ls -d $recoverDir/round-*`;
foreach my $dir(@list_of_dirs) {
    chomp $dir;
    my @dir_parts = split /-/, $dir;
    my $round = $dir_parts[scalar(@dir_parts) - 1];
    #warn "dir $dir = $round\n";

    # round 1
    if(($round eq 1) && (-s "$recoverDir/round-1/consensi-refined.fa")) {
        $highestGoodRound = $round;
    }
    if(($round > 1) && ($round > $highestGoodRound) && (-s "$recoverDir/round-$round/consensi.fa" )) {
        $highestGoodRound = $round;
    }
}
warn "highest good round = $highestGoodRound\n";

With this code, I am now able to begin recovery from round 6 correctly.
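
For anyone who wants to reproduce the difference without a real run, here is a minimal standalone sketch. It builds a throwaway directory with rounds 1 to 5, leaves the round-2 consensi.fa empty, and compares an early-stopping scan with a full scan. The early-stopping loop is only my reconstruction of the behaviour described above, not the actual RepeatModeler source, and the helper names (`round_file`, `highest_round_early_stop`, `highest_round_full_scan`) are hypothetical. It reads the directory with `opendir` rather than shelling out to `ls`, which is a design choice of the sketch, not something RepeatModeler does.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper: the file whose non-zero size marks a round as good.
# As in the snippet above, round 1 uses consensi-refined.fa and later
# rounds use consensi.fa.
sub round_file {
    my ($dir, $round) = @_;
    my $name = $round == 1 ? 'consensi-refined.fa' : 'consensi.fa';
    return "$dir/round-$round/$name";
}

# Reconstruction (an assumption) of the described behaviour: count upward
# and stop at the first round whose file is missing or empty.
sub highest_round_early_stop {
    my ($dir) = @_;
    my $highest = 0;
    for my $round (1 .. 100) {
        my $file = round_file($dir, $round);
        last unless -s $file;
        $highest = $round;
    }
    return $highest;
}

# Alternative: look at every round-N directory and keep the largest N
# whose file exists and is non-empty.
sub highest_round_full_scan {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @rounds = map { /^round-(\d+)$/ ? $1 : () } readdir($dh);
    closedir($dh);

    my $highest = 0;
    for my $round (@rounds) {
        my $file = round_file($dir, $round);
        $highest = $round if $round > $highest && -s $file;
    }
    return $highest;
}

# Demo layout: rounds 1..5 exist, and only round-2 has an empty file.
my $run = tempdir(CLEANUP => 1);
for my $round (1 .. 5) {
    make_path("$run/round-$round");
    my $file = round_file($run, $round);
    open(my $fh, '>', $file) or die "Cannot write $file: $!";
    print {$fh} ">family-1\nACGT\n" unless $round == 2;
    close($fh);
}

print "early stop: ", highest_round_early_stop($run), "\n";   # prints 1
print "full scan:  ", highest_round_full_scan($run), "\n";    # prints 5
```

With this layout the early-stopping scan gives up at round 1 because of the empty file in round-2, while the full scan still finds round 5 as the highest good round.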