e-sensing / sits

Satellite image time series in R
https://e-sensing.github.io/sitsbook/
GNU General Public License v2.0

sits_apply issue: invalid merge_files variable - unable to merge blocks to write to output file #1216

Closed etiennelalechere closed 2 weeks ago

etiennelalechere commented 2 months ago

Hello,

I have an issue when computing a vegetation index with the sits_apply function. Here is the error:

<error/purrr_error_indexed>
Error in purrr::map():
ℹ In index: 1.
Caused by error in .check_remote_errors():
! 7 nodes produced errors; first error: .raster_merge_blocks: invalid merge_files variable - unable to merge blocks to write to output file
Backtrace:
1. ├─sits::sits_apply(...)
2. ├─sits:::sits_apply.raster_cube(...)
3. │ └─sits:::.jobs_map_parallel_dfr(...)
4. │ └─sits:::.jobs_map_parallel(jobs, fn, ..., progress = progress)
5. │ ├─base::unlist(...)
6. │ └─purrr::map(...)
7. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
8. │ ├─purrr:::with_indexed_errors(...)
9. │ │ └─base::withCallingHandlers(...)
10. │ ├─purrr:::call_with_cleanup(...)
11. │ └─sits (local) .f(.x[[i]], ...)
12. │ └─sits:::.parallel_map(round, fn, ..., progress = progress)
13. │ └─sits:::.parallel_cluster_apply(x, fn, ..., pb = pb)
14. │ └─parallel (local) .check_remote_errors(val)
15. │ └─base::stop(...)
16. └─base::.handleSimpleError(...)
17. └─purrr (local) h(simpleError(msg, call))
18. └─cli::cli_abort(...)
19. └─rlang::abort(...)

And here is the code to reproduce the error:

# Define roi
roi <- c(
    lon_min = 5.558055, lat_min = 45.845646,
    lon_max = 5.784589, lat_max = 46.152492
)

# Retrieve a deep time series for a few tiles
local_dir <- "D:/SITS/Test_HLS_cube/"
start_dates <- c("1984-06-01", "1985-06-01", "1986-06-01", "1987-06-01", "1988-06-01", "1989-06-01", "1990-06-01", "1991-06-01", "1992-06-01", "1993-06-01", "1994-06-01", "1995-06-01", "1996-06-01", "1997-06-01", "1998-06-01", "1999-06-01",
                 "2000-06-01", "2006-06-01", "2002-06-01", "2003-06-01", "2004-06-01", "2005-06-01", "2006-06-01", "2007-06-01", "2008-06-01", "2009-06-01", "2010-06-01", "2011-06-01", "2012-06-01", "2013-06-01", "2014-06-01", "2015-06-01", "2016-06-01", "2017-06-01",
                 "2018-06-01", "2019-06-01", "2020-06-01", "2021-06-01", "2022-06-01", "2023-06-01", "2024-06-01")
end_dates <- c("1984-08-31", "1985-08-31", "1986-08-31", "1987-08-31", "1988-08-31", "1989-08-31", "1990-08-31", "1991-08-31", "1992-08-31", "1993-08-31", "1994-08-31", "1995-08-31", "1996-08-31", "1997-08-31", "1998-08-31", "1999-08-31",
               "2000-08-31", "2008-08-31", "2002-08-31", "2003-08-31", "2004-08-31", "2005-08-31", "2006-08-31", "2007-08-31", "2008-08-31", "2009-08-31", "2010-08-31", "2011-08-31", "2012-08-31", "2013-08-31", "2014-08-31", "2015-08-31", "2016-08-31", "2017-08-31",
               "2018-08-31", "2019-08-31", "2020-08-31", "2021-08-31", "2022-08-31", "2023-08-31", "2024-08-31")

# Build a cube for each year and save it locally
length(start_dates)
for (i in 1:length(start_dates)){
    print(i)
    MPC_cube_year <- sits_cube(
        source = "MPC",
        collection = "LANDSAT-C2-L2",
        roi = roi,
        bands = c("RED", "NIR08", "CLOUD"),
        start_date = start_dates[i],
        end_date = end_dates[i]
    )

    # copy the yearly cube to a local directory
    MPC_cube_year <- sits_cube_copy(
        cube = MPC_cube_year,
        output_dir = local_dir
        # multicores = 16
    )
}

# Remove tile with only 5 images
str(MPC_cube)
MPC_cube <- MPC_cube[c(1,2),]

# Regularize the cube to 1-year intervals
MPC_cube_reg <- sits_regularize(
    cube = MPC_cube,
    output_dir = "D:/SITS/Test_MPC_cube/MPC_cube_reg",
    res = 30,
    period = "P1Y",
    multicores = 1,
    progress = T
)
summary(MPC_cube_reg)
print(MPC_cube_reg$file_info[[1]] , n = 100) # the cloud band is lost
sits_timeline(MPC_cube_reg)

Temporal extent of the cube does not align with dt and has been extended to 1984-01-01/1984-12-31

--> Also, I am wondering what the reason for this warning could be: does it mean that there is a lack of images for the period? <--

# Vegetation index for the regularized cube
MPC_cube_NDVI <- sits_apply(MPC_cube_reg,
    NDVI = (NIR08 - RED) / (NIR08 + RED),
    normalized = F,
    output_dir = "D:/SITS/Test_MPC_cube/MPC_cube_NDVI/",
    # multicores = 2,
    progress = T
)
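As a side note on the date vectors in the code above: in the posted vectors, the 18th entries read "2006-06-01" and "2008-08-31" where the 2001 values would be expected, which may be a typo. A minimal base R sketch that generates one June-to-August window per year instead, assuming the intended range is every year from 1984 to 2024:

```r
# Build one June-to-August window per year, 1984 through 2024.
# Assumption: the intended windows are identical for every year.
years <- 1984:2024
start_dates <- sprintf("%d-06-01", years)
end_dates <- sprintf("%d-08-31", years)

# Sanity checks: one window per year, no repeated years,
# and every window starts before it ends.
stopifnot(
    length(start_dates) == length(end_dates),
    anyDuplicated(start_dates) == 0,
    all(as.Date(start_dates) < as.Date(end_dates))
)
```

Generating the vectors this way avoids hand-typed entries, so a duplicated or skipped year cannot slip in.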

Here is the output of sessionInfo():

sessionInfo()
R version 4.3.3 (2024-02-29 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8 LC_MONETARY=French_France.utf8 LC_NUMERIC=C
[5] LC_TIME=French_France.utf8

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] leaflet_2.2.2 dplyr_1.1.4 sits_1.5.1

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5 cli_3.6.2 rlang_1.1.3 DBI_1.2.3 KernSmooth_2.23-22 purrr_1.0.2 generics_0.1.3
 [8] sf_1.0-16 glue_1.7.0 htmltools_0.5.7 e1071_1.7-14 fansi_1.0.6 grid_4.3.3 crosstalk_1.2.1
[15] classInt_0.4-10 tibble_3.2.1 fastmap_1.1.1 yaml_2.3.8 lifecycle_1.0.4 compiler_4.3.3 htmlwidgets_1.6.4
[22] timechange_0.3.0 Rcpp_1.0.12 pkgconfig_2.0.3 rstudioapi_0.15.0 digest_0.6.34 R6_2.5.1 class_7.3-22
[29] tidyselect_1.2.0 utf8_1.2.4 pillar_1.9.0 parallel_4.3.3 magrittr_2.0.3 tools_4.3.3 proxy_0.4-27
[36] lubridate_1.9.3 units_0.8-5

Do you have any ideas on how to solve this bug?

Thank you in advance for your help.

Cheers,

Etienne

OldLipe commented 1 month ago

Dear @etiennelalechere,

Thank you for your contribution and for providing a reproducible example.

We will investigate the reported issue and get back to you as soon as possible.

Thank you! Felipe Carvalho

OldLipe commented 1 month ago

Dear @etiennelalechere,

Thank you for reporting the issue!
I was unable to reproduce the error. However, I have some suggestions on how to resolve it.

"--> Also, I am wondering what the reason for this warning could be: does it mean that there is a lack of images for the period? <--"

This message is produced by the gdalcubes package. Because of the aggregation period, gdalcubes treats the entire year, from January to December, as the valid timeline.
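To illustrate the alignment described above, here is a minimal base R sketch. It assumes that with period = "P1Y" the interval is simply the calendar year containing the first date. The helper year_interval is hypothetical and is not how gdalcubes implements it.

```r
# Illustration only: this is not the gdalcubes implementation.
# Snap a date to the calendar year that contains it and return
# the interval as a "start/end" string.
year_interval <- function(date) {
    year <- format(as.Date(date), "%Y")
    paste0(year, "-01-01/", year, "-12-31")
}

year_interval("1984-06-01")
```

Under that assumption, a cube whose first image falls in June 1984 gets the interval 1984-01-01/1984-12-31, which matches the extent quoted in the warning.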

This error generally occurs when some images are corrupted after download. This can happen due to various factors, such as network instability or too many consecutive requests to the provider.
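One quick way to look for failed downloads is to list files that are empty on disk. The sketch below uses base R only and assumes the copied images are stored as .tif files directly inside the output directory. The helper find_empty_tifs is hypothetical, and a zero-byte file is only one symptom: a file can be non-empty and still be unreadable.

```r
# Hypothetical helper: return the GeoTIFF files in `dir` that have
# zero size (or whose size cannot be read).
# Assumption: images are stored as .tif files directly in `dir`.
find_empty_tifs <- function(dir) {
    files <- list.files(dir, pattern = "\\.tif$", full.names = TRUE,
                        ignore.case = TRUE)
    sizes <- file.size(files)
    files[is.na(sizes) | sizes == 0]
}
```

For example, find_empty_tifs(local_dir) would return the paths of any empty files, which could then be deleted and downloaded again.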

One way to handle consecutive requests is by adding a sleep between each iteration in your loop.
Below, I have shared the same code with the sleep suggestion included.
In this code, we use only the first two years (1984 and 1985), but I believe it applies to all other years as well.

#
# Defining ROI region
#
roi <- c(
    lon_min = 5.558055, lat_min = 45.845646,
    lon_max = 5.784589, lat_max = 46.152492
)

#
# Directory to store cube images
#
local_dir <- "~/issues/1216/data/" # CHANGE ME

#
# Start date vector
#
start_dates <- c("1984-06-01", "1985-06-01", "1986-06-01", "1987-06-01", "1988-06-01", "1989-06-01", "1990-06-01", "1991-06-01", "1992-06-01", "1993-06-01", "1994-06-01", "1995-06-01", "1996-06-01", "1997-06-01", "1998-06-01", "1999-06-01",
                 "2000-06-01", "2006-06-01", "2002-06-01", "2003-06-01", "2004-06-01", "2005-06-01", "2006-06-01", "2007-06-01", "2008-06-01", "2009-06-01", "2010-06-01", "2011-06-01", "2012-06-01", "2013-06-01", "2014-06-01", "2015-06-01", "2016-06-01", "2017-06-01",
                 "2018-06-01", "2019-06-01", "2020-06-01", "2021-06-01", "2022-06-01", "2023-06-01", "2024-06-01")
#
# End date vector
#
end_dates <- c("1984-08-31", "1985-08-31", "1986-08-31", "1987-08-31", "1988-08-31", "1989-08-31", "1990-08-31", "1991-08-31", "1992-08-31", "1993-08-31", "1994-08-31", "1995-08-31", "1996-08-31", "1997-08-31", "1998-08-31", "1999-08-31",
               "2000-08-31", "2008-08-31", "2002-08-31", "2003-08-31", "2004-08-31", "2005-08-31", "2006-08-31", "2007-08-31", "2008-08-31", "2009-08-31", "2010-08-31", "2011-08-31", "2012-08-31", "2013-08-31", "2014-08-31", "2015-08-31", "2016-08-31", "2017-08-31",
               "2018-08-31", "2019-08-31", "2020-08-31", "2021-08-31", "2022-08-31", "2023-08-31", "2024-08-31")

#
# Download MPC images from 1984 and 1985
#
for (i in seq_len(2)){
    print(i)
    #
    # Create an MPC cube
    #
    MPC_cube_year <- sits_cube(
        source = "MPC",
        collection = "LANDSAT-C2-L2",
        roi = roi,
        bands = c("RED", "NIR08", "CLOUD"),
        start_date = start_dates[i],
        end_date = end_dates[i]
    )

    #
    # Download images
    #
    MPC_cube_year <- sits_cube_copy(
        cube = MPC_cube_year,
        output_dir = local_dir,
        multicores = 4
    )
    #
    # Sleep for 10 seconds
    #
    Sys.sleep(10)
}

#
# Create a local cube
#
MPC_cube <- sits_cube(
    source = "MPC",
    collection = "LANDSAT-C2-L2",
    data_dir = local_dir
)

#
# Select two tiles
#
MPC_cube <- MPC_cube[c(1,2),]

#
# Directory to store regularized images
#
reg_dir <- "~/issues/1216/reg/"  # CHANGE ME

#
# Regularize local cubes
#
MPC_cube_reg <- sits_regularize(
    cube = MPC_cube,
    output_dir = reg_dir,
    res = 30,
    period = "P1Y",
    multicores = 4,
    progress = T
)
sits_timeline(MPC_cube_reg)
# > [1] "1984-01-01" "1985-01-01"

#
# Create NDVI Index
#
MPC_cube_NDVI <- sits_apply(MPC_cube_reg,
                            NDVI = (NIR08 - RED) / (NIR08 + RED),
                            normalized = F,
                            output_dir = reg_dir,
                            multicores = 4,
                            progress = T
)

Please let us know if the error persists. Thank you very much!
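As a variation on the fixed sleep used in the loop above, a download step could also be retried when it raises an error. This is a different technique from the one in the code above, sketched in base R under the assumption that a failed request surfaces as an R error. The helper with_retry is hypothetical and is not part of sits. It does not detect files that download without an error but are corrupted.

```r
# Hypothetical helper: call `fn` until it succeeds or `max_tries`
# attempts have been made, sleeping `wait` seconds between failures.
with_retry <- function(fn, max_tries = 3, wait = 10) {
    for (attempt in seq_len(max_tries)) {
        # Capture an error as a condition object instead of stopping.
        result <- tryCatch(fn(), error = function(e) e)
        if (!inherits(result, "error")) {
            return(result)
        }
        message("Attempt ", attempt, " failed: ", conditionMessage(result))
        if (attempt < max_tries) {
            Sys.sleep(wait)
        }
    }
    stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Possible usage inside the loop (not run here):
# MPC_cube_year <- with_retry(function() {
#     sits_cube_copy(
#         cube = MPC_cube_year,
#         output_dir = local_dir,
#         multicores = 4
#     )
# })
```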

etiennelalechere commented 3 weeks ago

Dear Felipe,

Yes, the error persists, probably because it comes from the sits_apply function, so the change in the sits_cube loop has no effect. Do you have any other ideas about this issue?

You wrote that:

"--> Also, I am wondering what the reason for this warning could be: does it mean that there is a lack of images for the period? <-- This message is produced by the gdalcubes package. gdalcubes considers the entire year as a valid timeline, from January to December, due to the aggregation period."

Does this mean that the entire year, from 01/01 to 31/12, is used for each of the years from 1984 to 2024?

Cheers, Etienne

gilbertocamara commented 3 weeks ago

Dear @etiennelalechere we are working on a more long-lasting solution to your problem. Please see issue https://github.com/e-sensing/sits/issues/1231

gilbertocamara commented 2 weeks ago

Dear @etiennelalechere, we have solved your problem. The correction is currently in the development version of sits, and we will submit the corrections to CRAN on November 20th. To get the latest version of sits, please run the following command:

devtools::install_github("e-sensing/sits@dev")
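After installing, the version that R actually loads can be checked with a short base R sketch. The thread does not say which version number carries the fix, so the comparison below only tests whether the installed version is newer than 1.5.1, the version listed in the sessionInfo() output above.

```r
# Report the installed sits version, if any, and compare it with 1.5.1,
# the version from the sessionInfo() output earlier in this thread.
# Note: the version number that contains the fix is not stated here.
if (requireNamespace("sits", quietly = TRUE)) {
    installed <- utils::packageVersion("sits")
    message("Installed sits version: ", as.character(installed))
    message("Newer than 1.5.1: ", installed > "1.5.1")
} else {
    message("sits is not installed")
}
```

Restarting the R session before running this check avoids reporting a previously loaded copy of the package.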