ropensci / nlrx

nlrx NetLogo R
https://docs.ropensci.org/nlrx
GNU General Public License v3.0

run_nl_all(): cannot access temporary file #40

Closed kiranaw closed 4 years ago

kiranaw commented 4 years ago

Dear @nldoc ,

I am setting up an experiment on Linux (a Slurm cluster) with the wolf-sheep example. I use R version 3.4.4 and NetLogo 6.0.4. Calling run_nl_all() produced this error:

run_nl_all(nl)

sh: /lustre/ssd/ws/kw/tmp/netlogo-headless2c6f268a317f.sh: Permission denied
Error in util_gather_results(nl, outfile, seed, siminputrow) :
   Temporary output file /lustre/ssd/ws/kw/tmp/nlrx4440_12c6f77e97517.csvnot found. On unix systems this can happen if the default system temp folder is used.
       Try reassigning the default temp folder for this R session (unixtools package).

It seems that permission was denied for the file netlogo-headless2c6f268a317f.sh. However, when I ran this particular file manually, I did not get this error, so I assume that I do have access to the file:

sh /lustre/ssd/ws/kw/tmp/netlogo-headless2c6f268a317f.sh

you must specify --model

This is the example code I use to set up the experiment:

library(nlrx)

netlogopath <- file.path("/lustre/ssd/ws/kw/NetLogo-6.0.4/")
modelpath <- file.path("/lustre/ssd/ws/kw/NetLogo-6.0.4/app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("/lustre/ssd/ws/kw/out/")

# define Netlogo version used
nl <- nl(nlversion = "6.0.4",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 1024)

nl@experiment <- experiment(expname="wolf-sheep",
                            outpath=outpath,
                            repetition=1,
                            tickmetrics="true",
                            idsetup="setup",
                            idgo="go",
                            runtime=50,
                            evalticks=seq(40,50),
                            metrics=c("count sheep", "count wolves", "count patches with [pcolor = green]"),
                            variables = list('initial-number-sheep' = list(min=50, max=150, qfun="qunif"),
                                             'initial-number-wolves' = list(min=50, max=150, qfun="qunif")),
                            constants = list("model-version" = "\"sheep-wolves-grass\"",
                                             "grass-regrowth-time" = 30,
                                             "sheep-gain-from-food" = 4,
                                             "wolf-gain-from-food" = 20,
                                             "sheep-reproduce" = 4,
                                             "wolf-reproduce" = 5,
                                             "show-energy?" = "false"))

nl@simdesign <- simdesign_lhs(nl=nl,
                              samples=100,
                              nseeds=1,
                              precision=3)

The output when checking with print(nl):

supported nlversion: ✓
nlpath exists on local system: ✓
modelpath exists on local system: ✓
valid jvm memory: ✓
valid experiment name: ✓
outpath exists on local system: ✓
setup and go defined: ✓
variables defined: ✓
variables present in model: ✓
constants defined: ✓
constants present in model: ✓
metrics defined: ✓
spatial Metrics defined: ✓
simdesign attached: ✓
siminput parameter matrix: ✓
number of siminputrows: 100
number of random seeds: 1
estimated number of runs: 100
simoutput results attached: ✗
number of runs calculated: ✗

I would really appreciate any help on this issue. Thanks!

nldoc commented 4 years ago

Dear @kiranaw, unfortunately it is quite difficult for me to help with this issue without knowing more details of your system. I have not run into this error before, but I suspect that the user rights for your temporary directory are not set properly. However, I still don't know why executing the sh file works outside of R. Some people who had problems with user rights were successful when R was executed as root, but that may not be an option for you on the cluster.

From your file path definitions I see that you have mounted a Lustre file system, is that right? While I have no experience with Lustre, I could imagine that the mount does not set the user rights in these folders correctly. Do you have any options for defining user rights when you mount these folders? Did you set the /tmp/ folder on the Lustre system manually, or is it R's default behavior to use the Lustre tmp folder for temporary files? Do you still have a personal home directory on the cluster that does not depend on the Lustre file service?

You could try to set up all folders (tmp, nlpath, modelpath, out) within your personal home folder and see if that works. If it does, you can be quite sure that the Lustre mount causes the problems. Sorry that I cannot come up with an easy fix, but hopefully we will figure it out step by step. Cheers, nldoc

bitbacchus commented 4 years ago

Did you have a look at the file permissions?

$ ls -al /lustre/ssd/ws/kw/tmp/netlogo-headless2c6f268a317f.sh

I'm asking because when the permissions of a file are incorrect, like:

$ touch script.sh
$ ls -al script.sh
-rw-rw-r--   1 user group    0 Jun 22 17:36 script.sh

(the execute bit is missing here; the owner permissions should be rwx, not rw-), you get "Permission denied" when calling the file directly, but you can still run it by passing it explicitly to sh as a shell script:

$ ./script.sh
sh: 6: ./script.sh: Permission denied
$ sh script.sh
$

kiranaw commented 4 years ago

Hi @nldoc ,

Thank you for the hints. I have now moved everything (tmp, nlpath, modelpath, out) to my personal home folder:

unixtools::set.tempdir("/home/h4/kw/tmp")

netlogopath <- file.path("/home/h4/kw/NetLogo-6.0.4/")
modelpath <- file.path("/home/h4/kw/NetLogo-6.0.4/app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("/home/h4/kw/out/")

and got the same error:

sh: /home/h4/kw/tmp/netlogo-headless24753d76d05.sh: Permission denied
Error in util_gather_results(nl, outfile, seed, siminputrow) :
  Temporary output file /home/h4/kw/tmp/nlrx8413_1247775718d9.csvnot found. On unix systems this can happen if the default system temp folder is used.
                Try reassigning the default temp folder for this R session (unixtools package).

As @bitbacchus suspected, the permissions of the file are indeed incorrect in my case:

-rw-r--r-- 1 kw 1111111 2310 Jun 23 08:19 /home/h4/kw/tmp/netlogo-headless24753d76d05.sh

How should I fix this permission issue?

best regards, kiranaw

nldoc commented 4 years ago

Hi @kiranaw, I did some tests on my Linux server and was able to reproduce the error. The temporary sh files in the tmp folder have the same user rights as the original netlogo-headless.sh file in your NetLogo folder. You reported that your temporary sh files have the -rw-r--r-- permission, so I assume this is also the case for the original netlogo-headless.sh. Could you please check the user rights of the original netlogo-headless.sh file in your "/home/h4/kw/NetLogo-6.0.4/" folder? If they are set to -rw-r--r-- as well, you can add read and execute rights with:

chmod a+rx netlogo-headless.sh
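The inheritance described above can be reproduced in isolation. The following is a minimal sketch, assuming the temporary script is created by a plain file copy (my reading of the comment, not confirmed nlrx internals). It uses a scratch directory from mktemp in place of the real NetLogo and tmp folders, and the copy names are hypothetical.

```shell
#!/bin/sh
# Scratch directory standing in for the NetLogo folder and the R temp folder
# (hypothetical layout, not the real paths from this thread).
work=$(mktemp -d)
mkdir "$work/netlogo" "$work/tmp"

# A launcher script without the execute bit, like the uploaded netlogo-headless.sh.
printf '#!/bin/sh\necho "launcher ran"\n' > "$work/netlogo/netlogo-headless.sh"
chmod 644 "$work/netlogo/netlogo-headless.sh"

# A plain copy takes its mode bits from the source, so this copy is not executable.
cp "$work/netlogo/netlogo-headless.sh" "$work/tmp/copy-before.sh"
ls -l "$work/tmp/copy-before.sh"

# Add read and execute rights to the original, then copy to a new file name.
# (Copying over an existing file would keep the old file's mode.)
chmod a+rx "$work/netlogo/netlogo-headless.sh"
cp "$work/netlogo/netlogo-headless.sh" "$work/tmp/copy-after.sh"
ls -l "$work/tmp/copy-after.sh"
```

The first listing should show no x in the permission string and the second should show it, which matches the behavior reported for the temporary netlogo-headless files.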

I still don't know why your NetLogo installation does not have execution rights on the sh file by default. Did you use the download_netlogo() function from the nlrx package to load and unpack NetLogo?

I will also check whether the package code can make sure that the temporary sh files are always executable, even if the original one is not.

kiranaw commented 4 years ago

It worked!

You are right: in my case the original netlogo-headless.sh file in my home folder had the -rw-r--r-- permission. I wasn't aware of the download_netlogo() function, so I simply copied the downloaded NetLogo folder from my local PC and uploaded it to the cluster.
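After uploading a NetLogo folder by hand, a quick check for launcher scripts that lost their execute bit could look like the sketch below. It builds a small scratch tree rather than touching a real installation; the directory layout and the file other.sh are hypothetical.

```shell
#!/bin/sh
# Scratch tree standing in for an uploaded NetLogo folder (hypothetical layout).
root=$(mktemp -d)
printf '#!/bin/sh\n' > "$root/netlogo-headless.sh"
printf '#!/bin/sh\n' > "$root/other.sh"
chmod 644 "$root/netlogo-headless.sh"   # execute bit lost, as in this issue
chmod 755 "$root/other.sh"              # execute bit intact

# List .sh files whose owner execute bit (octal 100) is not set.
missing=$(find "$root" -type f -name '*.sh' ! -perm -100)
echo "$missing"
```

Pointing find at the actual NetLogo directory instead of the scratch tree would list the scripts that need chmod a+rx.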

Thank you so much.

best regards, kiranaw

nldoc commented 4 years ago

That's great! :) Thank you for pointing out this issue. While using download_netlogo() is recommended and seems to avoid this issue, I still added a small fix to the package. The user rights for the temporary sh files are now always set before execution. This fix will be pushed to CRAN soon with nlrx version 0.4.2. Cheers, nldoc
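The idea of setting the rights before execution can be illustrated in shell. This is only a sketch of the general approach, not the actual nlrx change, which lives in the R package; the helper name run_with_exec_bit and the launcher file are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make a script executable for its owner, then run it directly.
run_with_exec_bit() {
    script=$1
    shift
    chmod u+x "$script" || return 1
    "$script" "$@"
}

# Scratch launcher created without the execute bit.
work=$(mktemp -d)
printf '#!/bin/sh\necho "ran with args: $*"\n' > "$work/launcher.sh"
chmod 644 "$work/launcher.sh"

# Calling it directly would fail with "Permission denied"; the helper fixes the mode first.
output=$(run_with_exec_bit "$work/launcher.sh" --model demo)
echo "$output"
```

Setting the bit on the temporary copy only means the original file in the NetLogo folder is left as it was, so the fix works even when the user cannot modify the installation.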