ropensci / tiler

Generate geographic and non-geographic map tiles from R
https://docs.ropensci.org/tiler
Other
64 stars 8 forks source link

temp file race conditions when invoking tile() in parallel #18

Closed achubaty closed 1 year ago

achubaty commented 1 year ago

Calling tile() in a parallel (i.e., multiple concurrent tile calls) produces the following problems:

  1. python 'missing file' errors on the nodes due to disappearance of the g2ttmp directory.

    g2t_tmp_dir <- file.path(tempdir(), "g2ttmp")
    ## and later
    unlink(g2t_tmp_dir, recursive = TRUE, force = TRUE)
    ...

    In order to make this work, each nodes needs its own tmpdir. The following change does the trick, provided the rasters are all of the supported type:

    g2t_tmp_dir <- tempfile("g2ttmp_")
  2. attempts to use the same temp file when rasters are not one of the supported filetypes:

    ## this fliename is hardcoded to be the same for all nodes :(
    file <- file.path(tempdir(), "tmp_raster.tif")

    Unfortunately this is a more involved fix because that file name is hardcoded in multiple places outside of tile(), so would need to pass the temp file name through every other function that checks for that file. Otherwise, I'd use tempfile("tmp_raster_", fileext = ".tif") here too.

leonawicz commented 1 year ago

This request I agree it would be great for parallel processing. I don't have time to dig into this one myself very soon. But if you can make a PR, I can review it, at least to get the process started.

Are you suggesting append a random string of characters to g2t_tmp_dir <- file.path(tempdir(), "g2ttmp_")?

It sounds like (2) can be addressed independently from (1).

Regards, Matt

achubaty commented 1 year ago

Yes, sorry I just edited my OP.

For 1) I suggest g2t_tmp_dir <- tempfile("g2ttmp_") to add the random characters; 2) can be done separately.

I have (1) implemented in my fork and I am testing with some of my projects. Will PR when I'm happy with it (though note my fork also implements #19).

leonawicz commented 1 year ago

Thank you! That sounds good to me, feel free to combine it all in one PR.

achubaty commented 1 year ago

update: changes for number 1 seem fine -- it's number 2 that's biting me still.

leonawicz commented 1 year ago

@achubaty Hi Alex, it's been a while; I know this was merged, but are we good to close the issue or is there still something outstanding? Regards, Matt

achubaty commented 1 year ago

Hi Matt, the first part is completed, but I haven't come back to deal with the second part of this issue ('attempts to use the same temp file when rasters are not one of the supported filetypes') - so this piece is still outstanding.

achubaty commented 1 year ago

@leonawicz Hi again, I started working on this but encountered an issue when running tests using the latest raster package. So this is currently a work in progress.

leonawicz commented 1 year ago

Hi, yeah I encountered a similar problem with a couple unit tests involving raster stacks last week and had to skip them for now.

There may be an option to remove raster from the package, exclusively in favor of terra, but I don't have the bandwidth to look into that and it would be a whole separate issue. Even the packaged test files created in data-raw/data.R use raster.

My next goal is to at least republish to CRAN the current version on master, to address critical dependency issues #22

If we can fold in completion of this issue with the next release, that would be a bonus, but it's also fine if we don't.

leonawicz commented 1 year ago

CRAN submission is offline until August 7. I will submit afterward. But everything looks to be passing. :)

achubaty commented 11 months ago

Hi, yeah I encountered a similar problem with a couple unit tests involving raster stacks last week and had to skip them for now.

There may be an option to remove raster from the package, exclusively in favor of terra, but I don't have the bandwidth to look into that and it would be a whole separate issue. Even the packaged test files created in data-raw/data.R use raster.

My next goal is to at least republish to CRAN the current version on master, to address critical dependency issues #22

If we can fold in completion of this issue with the next release, that would be a bonus, but it's also fine if we don't.

@leonawicz this has been fixed upstream in raster