rmhubley / RepeatMasker

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.

RepeatLandscape: Could not open for reading: No such file or directory #227

Closed 44314474 closed 7 months ago

44314474 commented 11 months ago

Hi! I ran into a problem when using calcDivergenceFromAlign.pl to calculate divergence from an align file generated by RepeatMasker with the -a parameter. My command is "perl calcDivergenceFromAlign.pl -a fastafa.align" and the error is "RepeatLandscape: Could not open for reading: No such file or directory". Thanks for your help!

Nicholas-Kron commented 11 months ago

Hi,

I had the same issue when using the command in that form, because that is not the intended input. Run the command as calcDivergenceFromAlign.pl -s xyz.divsum xyz.align.gz and it works as intended, generating the divsum file you need for the createRepeatLandscape.pl script.

If you are using RepeatMasker version 4.0.4 or newer, the man page states:

"On newer RepeatMasker dataset that already contains the Kimura divergence line following each alignment:

     ./calcDivergenceFromAlign.pl -s example.divsum example.align.gz

     ./createRepeatLandscape.pl -div example.divsum > 
                                /home/user/public_html/example.html

"
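As a rough sketch of the two steps quoted above, the helper below only prints the two commands for a given align file rather than running them. The function name `landscape_cmds` and the output naming (`<base>.divsum`, `<base>.html` in the current directory) are assumptions for illustration, not part of RepeatMasker; only the `-s` and `-div` options come from the quoted man page.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the two commands from the quoted man page
# for one align file. Nothing is executed; the output names are assumptions.
landscape_cmds() {
  local align=$1 base
  base=${align%.gz}      # drop an optional .gz suffix
  base=${base%.align}    # drop the .align suffix
  # Step 1: the align file is passed as the last, positional argument.
  printf './calcDivergenceFromAlign.pl -s %s.divsum %s\n' "$base" "$align"
  # Step 2: build the landscape page from the divsum file.
  printf './createRepeatLandscape.pl -div %s.divsum > %s.html\n' "$base" "$base"
}

landscape_cmds example.align.gz
```

Check the printed commands against the usage text of your installed version before running them.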

Hope that helps!

sadikmz commented 9 months ago

Hi,

I am getting the same error message when using calcDivergenceFromAlign.pl to calculate divergence from alignment.

RepeatMasker version 4.1.5

calcDivergenceFromAlign.pl -s genom.divsum -a genome.fna.align

Error: RepeatLandscape: Could not open for reading: No such file or directory

Any suggestions, please?

Nicholas-Kron commented 9 months ago

TL;DR: run it without the -a flag: calcDivergenceFromAlign.pl -s genom.divsum genome.fna.align

Looking into the Perl script, that error message should include a file name (lines 178-179):

open $searchResultsFH, "<$alignFile"
      or die "RepeatLandscape: Could not open $alignFile for reading: $!\n";

Since your error message has no file name in it, this suggests to me that your command is not actually giving an input .align file.

alignFile is set on line 158:

my $alignFile      = $ARGV[ 0 ];

In the SYNOPSIS usage statement (lines 36-37) we see the problem:

  calcDivergenceFromAlign.pl [-version] [-s <summary_file>] [-noCpGMod]
                             [-a <new_align_file>] *.align[.gz]

So the -a flag is actually for generating a new .align file; the input align file is a positional argument, not a flag.
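To make that concrete, here is a minimal shell sketch of the same kind of option handling. It is not the script's own Perl code: the function name `show_input_file` is made up, it uses bash `getopts`, and it models only -s and -a. It shows that when -a swallows the align file name, nothing is left over as the positional input, which matches the empty file name in the error message.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the script's option handling (models only -s and -a).
show_input_file() {
  local OPTIND=1 opt summary="" new_align=""
  while getopts "s:a:" opt; do
    case "$opt" in
      s) summary=$OPTARG ;;    # -s <summary_file>: an output file
      a) new_align=$OPTARG ;;  # -a <new_align_file>: also an output file
      *) return 2 ;;
    esac
  done
  shift $((OPTIND - 1))
  # Whatever remains is positional; the first item plays the role of $ARGV[0].
  echo "input='${1:-}'"
}

show_input_file -s genom.divsum -a genome.fna.align   # prints: input=''
show_input_file -s genom.divsum genome.fna.align      # prints: input='genome.fna.align'
```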

Hope that helps!

sadikmu commented 9 months ago

My bad, I was following the documentation. Got it working!

It might be good to update the documentation for options like -a.

Thanks a lot for the instant response.

rmhubley commented 7 months ago

Thanks for the good pointers @Nicholas-Kron! Sorry for the sub-par documentation for this tool; I have updated it for the next release (4.1.6) to avoid this confusion. Indeed, "-a" is an infrequently used option that generates an additional output file in which the new divergence value is printed alongside each alignment. Most users run this tool to generate a divergence summary file (the "-s" option) for creating repeat landscapes and do not need this option.