sigven / cpsr

Cancer Predisposition Sequencing Reporter (CPSR)
https://sigven.github.io/cpsr/
Other
56 stars 12 forks source link

incorrect permissions when copying templates dir with file.copy() #61

Closed MareikeJaniak closed 1 month ago

MareikeJaniak commented 1 month ago

Hi again,

Thanks for your help with the quarto issue, we are working on a solution for our installation, but in the meantime, we have come across another issue. I'm opening a separate issue here, since they are not related.

The problem we're seeing is again somewhat specific to working on an HPC. We have strict disk quotas that are organized by users and groups to which users belong. In some areas of the HPC, users have very small personal quotas, while one of the overarching groups has a large allocation for storage, so it is important that a file's group ownership is maintained to avoid disk quota errors.

During cpsr, the cpsr templates directory and its contents are copied to the quarto tmp directory with file.copy(), which creates a new directory templates within the quarto tmp directory. This new directory and the files that are copied with it have the correct group ownerships. However, the new templates directory is missing the -s- flag, so any new file created within it (like cpsr_report_sample.html and the files in cpsr_report_sample_files) has the $USER as the group, so the user's personal quota is quickly reached and the program fails with a disk quota error.

We tested the copy.mode setting within file.copy() but it did not work. (The documentation also suggests this: "security attributes such as ACLs are not copied").

A solution that did work for us was replacing the file.copy() command with the following:

cmd <- sprintf("cp -r --preserve=all %s %s", shQuote(cpsr_rep_template_path), shQuote(tmp_quarto_dir1))
system(cmd, intern = TRUE, ignore.stderr = TRUE)

Another solution that would work is to create the additional files ('cpsr_report_sample_files', 'cpsr_report_sample.html', 'cpsr_report_sample.qmd', and 'cpsr_report.rds') not within templates but in the quarto tmp parent directory, which has the correct -s- flag.

I realize that this is a somewhat unique problem, but I'm hoping that we can find a solution! Thanks in advance!

Best, Mareike

sigven commented 1 month ago

Thanks for reporting, Mareike. I see your needs here. I'll try experimenting a bit with your suggestions, it may actually be that your last solution is the easiest one to implement, to essentially lift the processing up one folder.

best, Sigve

MareikeJaniak commented 1 month ago

Thank you, Sigve!

mboisson commented 1 month ago

I don't know R, but in Python, shutil.copy does the right thing. What you want to look for is to honour the setgid bit on directories.

mboisson commented 1 month ago

I would note that --preserve=all is not needed in the system call. It is in fact harmful. You do not want to preserve the ownership (user and group).

cp -r is sufficient and will copy permissions, but not group.

If you need to preserve other attributes, you may do cp -r --preserve=all --no-preserve=ownership ...

sbathgate commented 1 month ago

You could potentially explore something along the lines of:

system2("cp", args = c("-r", shQuote("/path/to/source"), shQuote("/path/to/destination")))

I believe system2 is the more modern approach, however, I haven't worked with R for a bit, so worth confirmation.

pdiakumis commented 1 month ago

Hi folks, Thanks for all your input. The issue with cp is that it depends on the user's operating system. On my M1 Mac, for instance, technically there isn't even a -r option:

COMPATIBILITY
     Historic versions of the cp utility had a -r option.  This implementation supports that option, however, its behavior
     is different from historical FreeBSD behavior.  Use of this option is strongly discouraged as the behavior is
     implementation-dependent.  In FreeBSD, -r is a synonym for -RL and works the same unless modified by other flags.
     Historical implementations of -r differ as they copy special files as normal files while recreating a hierarchy.

That's the reason we changed the cp (and rm) calls to R language-specific (e.g. file.copy) calls. But anyway, I think cp -r is pretty standard so I've gone ahead and opened #62 (I've followed your example above, @sbathgate). If Sigve is okay with it, we can bump the version for both CPSR and PCGR and get your team to test it out.

Cheers, Peter

MareikeJaniak commented 1 month ago

Thanks, Peter!

MareikeJaniak commented 3 weeks ago

Hi Peter, hi Sigve,

I just wanted to confirm that this issue is resolved with the new release. Thanks so much!

Best, Mareike

sigven commented 3 weeks ago

Hi Mareike,

Thanks a lot for letting us know, highly appreciated :-)

best, Sigve