Nowosad / supercells

The goal of supercells is to utilize the concept of superpixels to a variety of spatial data.
https://jakubnowosad.com/supercells/
GNU General Public License v3.0
66 stars 5 forks source link

Chunks get "lost" at coords (0,0) to (1,1) #17

Closed micha-silver closed 2 years ago

micha-silver commented 2 years ago

I am using the new chunks option of supercells() on a somewhat large raster to overcome out of memory problems. I have found that after the function completes sucessfully, some of the chunks have "disappeared" from the extent of the original raster stack. The lost chunks are actually left at coordinates (0,0) to (1,1), so if I zoom way down to scale 1:2 and to coord (0,0) I can see them, all overlayed.

What's more, if I run supercells() repeated with the same parameters, and save the result each time, the set of correctly placed chunks changes each time! See image below with three sample runs. The light colored areas are the superpixel boundaries in each case. And all three runs produced the same number of polygons in the result sf object.

Three_supercell_runs

This phenomena occurred with two different satellite images, and on two separate computers (both Debian, R versions 4.0.5 and 4.1.2).

I did some tests with various parameter values to try to isolate what is causing this. I ran supercells() both as a single process and parallelized (future=TRUE and future=FALSE) but the problem persisted in both cases (With future enabled the run completed in 1/3 the time). Disabling the connectivity (clean=FALSE) did not change anything. And various values of minarea and step also did not solve the problem.

However I did notice that when I choose a small value for chunks then usually the superpixels vector is created correctly, with all chunks in the right place. "Small" chunks were less than 1/2 of the smallest dimension of the raster. i.e. cluster_size <- min(dim(stk)[1:2])/2 But if I do a repeat run with the small chunk size, sometimes the issue with chunks "left behind" in the (0,0) to (1,1) region happens again. :-(

Any suggestions where to go from here? I don't have any knowledge of C++, but it looks like something weird is happening in the run_slic function.

Nowosad commented 2 years ago

@micha-silver could you share your code and data with me? (email is fine)

micha-silver commented 2 years ago

Sure, I'll send the data privately. Here's the test code I used:

library(supercells)
library(terra)
library(sf)
library(future)

# Set path to data directory
Data_dir = "/media/micha/Storage_8TB/Data/EO/Test"
tif_file <- file.path(Data_dir, "namibia_full.tif")
#tif_file <- file.path(Data_dir, "israel_cropped.tif")
stk <-  rast(tif_file)
# Suggested chunk size
chunk_size <- ceiling(min(dim(stk)[1:2]) / 2) + 1
print(paste("Suggested chunk size:", chunk_size))

#---------------------------
# Change here to use future
#---------------------------
use_future = FALSE
if (use_future == TRUE) {
  library(future.apply)
  library(parallel)
  # Use 1/2 of the cores
  ncores <- detectCores() / 2
  cl <- makeCluster(ncores)
}
#-------------------------
# Repeat supercells() three times, save each sf output
#---------------------------
sp_list <- lapply(1:3, function(ii) {
  t1 = Sys.time()
  cat("\n")
  print(paste(t1, "-- Starting, ", "chunks =", chunk_size ))
  if (use_future == TRUE) {future::plan(cluster, workers=cl)}
  capture.output(superpixels <- supercells(x=stk, step=20,
                                           compactness=2, dist_fun="jsd",
                                           minarea=24, chunks=chunk_size,
                                           future = use_future),
                 file = "/dev/null")
  future::plan(sequential)

  # Print some details to console 
  print(st_bbox(superpixels))
  print(paste("Num features:", length(superpixels$geometry)))
  t2 = Sys.time()
  elapsed = difftime(t2, t1, units = "mins")
  print(paste(t2, "-- Finished in", round(elapsed, 2), "mins"))

  # cleanup /tmp before next loop
  tmp_dir <- tempdir()
  tmp_files <- list.files(tmp_dir, full.names = T, pattern = "^spat")
  file.remove(tmp_files)

  # Save geopackage
  gpkg_file <- paste0("superpixels_", as.character(ii), ".gpkg")
  gpkg_path <- file.path(Data_dir, gpkg_file)
  st_write(superpixels, gpkg_path, append=FALSE)

  return(superpixels)
})
if (use_future == TRUE) {stopCluster(cl)}
Nowosad commented 2 years ago

Could you also send me one of the output superpixels vectors (e.g., as a gpkg file)?

Nowosad commented 2 years ago

Hi @micha-silver - thanks a lot for this discussion and the code.

I ran your code on two computers (Red Hat and Fedora) with and without the future option, but (sadly) was unable to reproduce your issue. My overall comments:

  1. Parallelization works -> it resulted in the decrease of the calculation time from 206 minutes to 59 minutes in this case.
  2. Your issue does not seem to be related to parallelization - therefore, you can do any further tests with future=TRUE to make them faster.
  3. This issue also does not seem to be related to (my) C++ code. The internal C++ code treats the input data as a matrix, without any spatial information (e.g., about the extent), and this information is added later in the R code. See https://github.com/Nowosad/supercells/blob/754c83ebf19652d2f2eb9a75117590990079d377/R/supercells.R#L147-L149. In short - slic[[1]] is a matrix output of the C++ code, I convert it into SpatRaster, next I replace all -1 values with NA, and finally force this object to have the same extent as the input chunk. My best guess is that the last line above is the source of your problem. Somehow (?), the proper extent is not added (extent 0-1, 0-1 is the terra default).
  4. Your issue is also very interesting, because I also use the terra::ext(x) code later - see https://github.com/Nowosad/supercells/blob/754c83ebf19652d2f2eb9a75117590990079d377/R/supercells.R#L156-L157. The role of this code is to update coordinates of the superpixels centroids, from column and row numbers to coordinates. Strangely, x and y columns were correct in your .gpkg files with incorrect extensions...
  5. To conclude the above points, the culprit seems to be the following line https://github.com/Nowosad/supercells/blob/754c83ebf19652d2f2eb9a75117590990079d377/R/supercells.R#L149. However, I have no clue why it happens there (pointer is getting lost?; a terra bug in writing extent info?).
  6. If this issue only happens on one computer, then maybe it would be good to try to find all of the differences between this computer and the rest (GDAL?, terra?, RAM size?).
  7. The last thing I can still try is to move the code extracting extent a couple of lines earlier... I will add a message here when it is ready.
  8. Let me know if you have any other suggestions.
Nowosad commented 2 years ago

Ad 7 -> I have added a tiny update to the chunks branch of the repo.

micha-silver commented 2 years ago

Again thanks for the thorough testing and explanations. I'll get your latest update and redo some testing. I want to also investigate some changes to virtual memory settings. Maybe some stuff is getting swapped out and then lost somehow. (That might explain the random nature of these missed extents). I'll also raise an issue on the terra github page with Robert Hijmans. Maybe he will have some insight. Cheers, Micha

rhijmans commented 2 years ago

I believe I can reproduce the issue:

library(supercells)
library(terra)
#terra version 1.4.21
library(sf)
#Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1

vol = rast(system.file("raster/volcano.tif", package = "supercells"))
s1 = supercells(vol, k = 50, compactness = 1)

terraOptions(todisk=TRUE)
s2 = supercells(vol, k = 50, compactness = 1)

st_bbox(s1)
#   xmin    ymin    xmax    ymax 
#2667400 6478705 2668010 6479575 
t_bbox(s2)
#xmin ymin xmax ymax 
#   0    0    1    1 

That would explain why this happens with large files only, and cannot always be reproduced between computers as the amount of RAM available varies. terraOptions(todisk=TRUE) essentially emulates the behavior you get with SpatRaster objects that are deemed too large to entirely load the cell values in RAM.

Nowosad commented 2 years ago

Thank you @rhijmans - this would be very helpful in debugging the problem.

rhijmans commented 2 years ago

Let me know if you want me to dig further, but it is probably easier for you to look at this now than it is for me.

Nowosad commented 2 years ago

Yes - now as I should be able to reproduce this issue - I will look into this issue in the next few days.

Nowosad commented 2 years ago

Hi @micha-silver, this bug seems to be fixed:

library(supercells)
library(terra)
#> terra version 1.4.21
library(sf)
#> Linking to GEOS 3.9.2, GDAL 3.3.2, PROJ 8.2.0
vol = rast(system.file("raster/volcano.tif", package = "supercells"))
s1 = supercells(vol, k = 50, compactness = 1)

terraOptions(todisk=TRUE)
s2 = supercells(vol, k = 50, compactness = 1)

st_bbox(s1)
#>    xmin    ymin    xmax    ymax 
#> 2667400 6478705 2668010 6479575
st_bbox(s2)
#>    xmin    ymin    xmax    ymax 
#> 2667400 6478705 2668010 6479575

Could you now try to reinstall {supercells} from the master branch and rerun your calculations to see if the {supercells} problem is fixed?

micha-silver commented 2 years ago

Yes, I'll rerun the test tomorrow.

On Mon, Nov 22, 2021, 11:45 Jakub Nowosad @.***> wrote:

Hi @micha-silver https://github.com/micha-silver, this bug seems to be fixed:

library(supercells) library(terra)#> terra version 1.4.21 library(sf)#> Linking to GEOS 3.9.2, GDAL 3.3.2, PROJ 8.2.0vol = rast(system.file("raster/volcano.tif", package = "supercells"))s1 = supercells(vol, k = 50, compactness = 1)

terraOptions(todisk=TRUE)s2 = supercells(vol, k = 50, compactness = 1)

st_bbox(s1)#> xmin ymin xmax ymax #> 2667400 6478705 2668010 6479575 st_bbox(s2)#> xmin ymin xmax ymax #> 2667400 6478705 2668010 6479575

Could you now try to reinstall {supercells} from the master branch and rerun your calculations to see if the {supercells} problem is fixed?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Nowosad/supercells/issues/17#issuecomment-975338465, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA5J52FTCSFDPIXZ7B24IH3UNIGL7ANCNFSM5H6TDOCA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

micha-silver commented 2 years ago

Pleased to report: the problem seems to be solved. I tested 4 cycles on two different machines, with results as expected. Many thanks.