ncsu-landscape-dynamics / rpops

PoPS (Pest or Pathogen Spread) R Package
https://ncsu-landscape-dynamics.github.io/rpops/
GNU General Public License v3.0

segfault with large files (type vrt, maybe others?) #115

Closed by nkruskamp 3 years ago

nkruskamp commented 3 years ago

I'm opening this ticket so others can track it if they hit a similar issue, although I think this may be an issue with terra. When dealing with a really large weather coefficient .vrt file [2100 x 4650 pixels, 520 layers], I get the error below when running pops_validate. I've encountered it on both a Linux cluster and a Windows workstation. Here's a traceback:

Traceback:
 1: .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: 0x560f65f4d4f0>,     dll = list(name = "Rcpp", path = "/usr/local/usrapps/jmgray2/rpops_env/lib/R/library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x560f6601de90>,         info = <pointer: 0x560f63ce2770>), numParameters = -1L),     <pointer: 0x560f69a8c240>, <pointer: 0x560f6df67350>, .pointer,     ...)
 2: x@ptr$classify(as.vector(rcl), NCOL(rcl), right, include.lowest,     othersNA, opt)
 3: .local(x, ...)
 4: terra::classify(r, matrix(c(NA, 0), ncol = 2, byrow = TRUE),     right = NA)
 5: terra::classify(r, matrix(c(NA, 0), ncol = 2, byrow = TRUE),     right = NA)
 6: secondary_raster_checks(config$precipitation_coefficient_file,     infected)
 7: configuration(config)
 8: validate(infected_file = ffIn(paste("infection_rasters/", init_inf_year,     "_infections.tif", sep = "")), exposed_file = ffRun("exposure.tif"),     infected_years_file = ffRun("infection.vrt"), mask = ffIn("nlcd_mask.tif"),     host_file = ffIn("total_host_comp.tif"), total_populations_file = ffIn("max_host_comp.tif"),     precip = TRUE, precipitation_coefficient_file = ffRun("weather.vrt"),     number_of_iterations = 100, number_of_cores = n_cores, parameter_means = c(4.4,         20.57, 0.9947, 9504, 0, 0), parameter_cov_matrix = matrix(ncol = 6,         nrow = 6, 0), model_type = "SEI", latency_period = 51,     time_step = "week", start_date = paste(start_year, "-01-01",         sep = ""), end_date = paste(end_year, "-12-31", sep = ""),     mortality_on = TRUE, mortality_rate = 0.05, mortality_time_lag = 2,     success_metric = "quantity and configuration", start_exposed = TRUE,     )
An irrecoverable exception occurred. R is aborting now ...
/home/nfkruska/.lsbatch/1621362850.167374.shell: line 17:  6711 Segmentation fault      (core dumped) Rscript --vanilla ./7b_pops_validation.R "/rsstu/users/j/jmgray2/SEAL/nfkruska/rpops_calibration" 0200 10_year_mask 2002 10 TRUE 1

This appears to happen when terra tries to reclassify NA pixels to 0. I am checking whether this also affects other raster file types such as GTiff.

There is a core dump file generated from the cluster, but I don't know how to examine it to understand the problem better.

nkruskamp commented 3 years ago

This also occurs with .tif files. I am still unable to identify why it is happening.

nkruskamp commented 3 years ago

Submitted an issue with terra:

rspatial/terra#237

ChrisJones687 commented 3 years ago

I am assuming this happens when just opening the file and classifying NAs to 0s?

nkruskamp commented 3 years ago

Yes, exactly. The sample code I shared in the terra issue creates a random noise raster and then tries to reclassify it. Did we add reclassify to rpops recently? I assumed a terra update caused this.
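
The sample code itself lives in the terra issue and is not reproduced in this thread. A scaled-down sketch of the same idea might look like the following. The raster dimensions, the seed, and the NA positions are illustrative assumptions; only the `classify` call is taken from the traceback above. At this size it should run fine, since the crash was only reported for very large files.

```r
library(terra)

# Small stand-in for the weather coefficient raster. The real file was
# 2100 x 4650 pixels with 520 layers; these sizes are assumptions chosen
# so the sketch runs quickly.
r <- rast(nrows = 21, ncols = 46, nlyrs = 5)

# Random noise, one column per layer, with a few NA cells in layer 1.
set.seed(1)
v <- matrix(runif(ncell(r) * nlyr(r)), ncol = nlyr(r))
v[1:10, 1] <- NA
values(r) <- v

# The call from the traceback: map NA to 0, leave other values alone.
rcl <- matrix(c(NA, 0), ncol = 2, byrow = TRUE)
r0 <- classify(r, rcl, right = NA)

# Should be FALSE if every NA cell was replaced.
anyNA(values(r0))
```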

ChrisJones687 commented 3 years ago

We moved from raster::reclassify to terra::classify, which hasn't caused issues until now. Originally I was checking for NA values in cells and then setting those to 0, but that took about 2x longer than reclassify. That might be an option, but it is significantly slower.
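
For reference, a minimal sketch of what a "check for NA, then set to 0" fallback could look like in terra. The thread does not show the original rpops code, so both variants below are assumptions about the general approach rather than what the package actually did, and the small raster is made up for illustration.

```r
library(terra)

# Hypothetical single-layer raster with a few NA cells.
r <- rast(nrows = 10, ncols = 10)
v <- runif(ncell(r))
v[c(3, 7, 42)] <- NA
values(r) <- v

# Variant A: logical-index replacement.
a <- r
a[is.na(a)] <- 0

# Variant B: cell-wise conditional with ifel().
b <- ifel(is.na(r), 0, r)

# Both variants should agree and contain no NA cells.
all.equal(as.vector(values(a)), as.vector(values(b)))
```

Either variant avoids the classify code path named in the traceback, at the cost of the extra runtime mentioned above.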

nkruskamp commented 3 years ago

This was an issue with using the TILED creation option with very large tif files.
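
A rough sketch of how a GeoTIFF can be written from terra with an explicit GDAL creation option, assuming the workaround is to rewrite the input without tiling. The thread does not say how the affected files were created or which setting resolved the crash, so the `TILED=NO` value, the file path, and the raster size here are illustrative assumptions.

```r
library(terra)

# Hypothetical small raster standing in for the large input file.
r <- rast(nrows = 20, ncols = 20)
values(r) <- 1:ncell(r)

# Write a striped (non-tiled) GeoTIFF by passing the GDAL creation
# option through writeRaster's `gdal` argument.
f <- file.path(tempdir(), "untiled_example.tif")
writeRaster(r, f, overwrite = TRUE, gdal = c("TILED=NO"))

# Read it back to confirm the values survived the round trip.
r2 <- rast(f)
all(values(r2) == values(r))
```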