Dfam-consortium / RepeatMasker

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.

RepeatMasker apparently requires the input fasta file allow write access, even though it won't write to it. #184

Closed rsharris closed 1 year ago

rsharris commented 1 year ago

Describe the issue

The command "RepeatMasker -parallel 63 -xsmall -species Primates temp/chrY.fasta" failed with the error message "FastaDB::_cleanIndexAndCompact() - Error could not open file some_path/RM_49156.WedOct191733472022/chrY.fasta: Permission denied at someotherpath/.conda/envs/repeatmasking.env/bin/RepeatMasker line 739." I was baffled because I had 7 such jobs to run, on different fasta files, and only this one failed. And RM_49156.WedOct191733472022/chrY.fasta definitely existed.

I eventually noticed that RM_49156.WedOct191733472022/chrY.fasta had read-only permissions. That file is a copy of my input fasta. So RepeatMasker made a copy but kept the original access permissions, then (apparently) objected that it was unable to write to the copy.
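The symptom described above can be illustrated outside RepeatMasker. The following is a minimal sketch, assuming a POSIX filesystem; it uses Python's shutil.copy as a stand-in for a copy that carries over permission bits, and does not invoke RepeatMasker or reproduce its code. The file name, the "RM_work" directory, and both function names are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def copy_is_owner_writable(src_path, work_dir):
    """Copy src_path into work_dir, keeping its permission bits, and
    report whether the copy has the owner-write bit set."""
    dst_path = os.path.join(work_dir, os.path.basename(src_path))
    shutil.copy(src_path, dst_path)  # copies file contents and permission bits
    mode = stat.S_IMODE(os.stat(dst_path).st_mode)
    return bool(mode & stat.S_IWUSR)


def symptom_demo():
    """Write-protect a small stand-in FASTA file, copy it, and check the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "chrY.fasta")    # hypothetical stand-in input
        with open(src, "w") as fh:
            fh.write(">chrY\nACGT\n")
        work_dir = os.path.join(tmp, "RM_work")  # hypothetical working directory
        os.mkdir(work_dir)
        os.chmod(src, 0o444)                     # disable write access on the input
        return copy_is_owner_writable(src, work_dir)
```

In this sketch the copy inherits the read-only mode of the input, so any later step that needs to rewrite the copy in place would be refused for an ordinary (non-root) user.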

Reproduction steps

chmod whatever temp/chrY.fasta    (do something to disable write privs)
RepeatMasker -parallel 63 -xsmall -species Primates temp/chrY.fasta

Log output

FastaDB::_cleanIndexAndCompact() - Error could not open file some_path/RM_49156.WedOct191733472022/chrY.fasta: Permission denied

Environment (please include as much of the following information as you can find out):

It's a conda environment, created Apr/1/2022 with this command

conda create --name repeatmasking.env \
  --channel bioconda \
  RepeatMasker trf

RepeatMasker version 4.1.2-p1
Search Engine: NCBI/RMBLAST [ 2.10.0+ ]

Unsure ... this was on a slurm cluster node. I don't have direct access to the nodes. Hypothetically I could submit a job just to run uname -a but I'd have to wait for the job to run. And then I'd have to try to figure out if the node used for the new job was of the same type as the node used for the job that failed.

Additional context

I ran the same command on 6 other files and it didn't have this problem. Eventually I figured out that the file for which it failed was write-protected, while the other 6 were not.

I then made a copy of the file for which RepeatMasker failed, allowing write access, ran RepeatMasker with that new file, and it did not have the problem.
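When several inputs are queued, as with the 7 files mentioned above, the write-protected ones could be spotted before any job is submitted. This is a rough sketch of such a pre-flight check, not part of RepeatMasker; the function names and the example file names are hypothetical, and it only inspects the owner-write bit, which is meaningful when the files belong to the user running the job.

```python
import os
import stat
import tempfile


def find_write_protected(paths):
    """Return the paths whose owner-write permission bit is not set."""
    protected = []
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if not mode & stat.S_IWUSR:
            protected.append(path)
    return protected


def preflight_demo():
    """Create three stand-in FASTA files, write-protect one, and run the check."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in ["chr1.fasta", "chrX.fasta", "chrY.fasta"]:  # hypothetical names
            path = os.path.join(tmp, name)
            with open(path, "w") as fh:
                fh.write(">seq\nACGT\n")
            os.chmod(path, 0o644)      # readable and writable by the owner
            paths.append(path)
        os.chmod(paths[2], 0o444)      # write-protect only chrY.fasta
        return [os.path.basename(p) for p in find_write_protected(paths)]
```

Any path this returns would be a candidate for the workaround described above: run RepeatMasker on a writable copy instead.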

rmhubley commented 1 year ago

Actually, the message doesn't indicate that it's attempting to write to your original file "temp/chrY.fasta", but rather to a copy of it stored in "some_path/RM_49156.WedOct191733472022/chrY.fasta". The code simply uses a "cp" command, which evidently maintains the original file permissions. I'll fix this in the next release; for now you could simply back up the original input file and make sure the one you hand to RepeatMasker has read-write permissions.

Thanks for reporting this, I will leave this open until the next release.

rsharris commented 1 year ago

Cool, thanks.

I did realize that it was a copy it was failing to overwrite. I was unsure how RM was making that copy. I wonder if the "cp" command behaves differently, w.r.t. permissions, on different versions of *nix.

rmhubley commented 1 year ago

That's a good point. To deal with this I simply added a "chmod 0700" after the copy. I should have a new release out in the next week -- containing this change and a major new release of RMBlast.
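The idea of resetting permissions right after the copy can be sketched as follows. This is only an illustration in Python under POSIX assumptions, not the actual RepeatMasker change, whose code is not shown in this thread; shutil.copy stands in for "cp", and the function names, file name, and "RM_work" directory are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def copy_then_make_writable(src_path, dst_path):
    """Copy a file, then force mode 0700 on the copy so that it is
    writable regardless of the permissions on the original."""
    shutil.copy(src_path, dst_path)  # the copy inherits the source permission bits
    os.chmod(dst_path, 0o700)        # same mode as the "chmod 0700" mentioned above
    return dst_path


def fix_demo():
    """Return (mode of the original, mode of the working copy)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "chrY.fasta")    # hypothetical stand-in input
        with open(src, "w") as fh:
            fh.write(">chrY\nACGT\n")
        os.chmod(src, 0o444)                     # write-protected original
        work_dir = os.path.join(tmp, "RM_work")  # hypothetical working directory
        os.mkdir(work_dir)
        dst = copy_then_make_writable(src, os.path.join(work_dir, "chrY.fasta"))
        with open(dst, "a") as fh:               # the copy can now be modified
            fh.write(">extra\nTTTT\n")
        src_mode = stat.S_IMODE(os.stat(src).st_mode)
        dst_mode = stat.S_IMODE(os.stat(dst).st_mode)
        return src_mode, dst_mode
```

Setting the mode explicitly after the copy sidesteps the question of whether a given copy tool preserves permissions, since the result no longer depends on that behavior; the original file keeps its read-only mode.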

rmhubley commented 1 year ago

Fixed in 4.1.4