Closed Yehrol closed 1 year ago
Hi @LorisDeBiasi,
Thank you for providing a detailed description of the issue. I was able to reproduce the error.
It appears that the Sentinel-2 collection in the AWS V0 catalog is missing some images. For this tile and period, only images from late March were retrieved. However, when the same cube is created from the Microsoft Planetary Computer (MPC) catalog, all images for the period are retrieved.
# AWS cube
s2_cube_aws <- sits_cube(
source = "AWS",
collection = "SENTINEL-S2-L2A-COGS",
tiles = c("31TGM"),
bands = c("B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12", "CLOUD"),
start_date = as.Date("2018-01-01"),
end_date = as.Date("2018-03-31")
)
# Get cube timeline
sits_timeline(s2_cube_aws)
#> [1] "2018-03-21" "2018-03-26" "2018-03-29" "2018-03-31"
# MPC cube
s2_cube_mpc <- sits_cube(
source = "MPC",
collection = "SENTINEL-2-L2A",
tiles = c("31TGM"),
bands = c("B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12", "CLOUD"),
start_date = as.Date("2018-01-01"),
end_date = as.Date("2018-03-31")
)
# Get cube timeline
sits_timeline(s2_cube_mpc)
#> [1] "2018-01-03" "2018-01-05" "2018-01-08" "2018-01-10" "2018-01-13" "2018-01-15"
#> [7] "2018-01-18" "2018-01-20" "2018-01-23" "2018-01-25" "2018-01-28" "2018-01-30"
#> [13] "2018-02-02" "2018-02-04" "2018-02-07" "2018-02-09" "2018-02-12" "2018-02-14"
#> [19] "2018-02-17" "2018-02-19" "2018-02-22" "2018-02-24" "2018-02-27" "2018-03-01"
#> [25] "2018-03-04" "2018-03-06" "2018-03-14" "2018-03-16" "2018-03-19" "2018-03-21"
#> [31] "2018-03-24" "2018-03-26" "2018-03-29"
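To make the gap easier to see, the two timelines can be counted per month. This is a minimal base R sketch that only reuses the dates printed above; `count_by_month()` is a hypothetical helper, not a sits function.

```r
# Dates copied from the sits_timeline() outputs above.
aws_dates <- as.Date(c("2018-03-21", "2018-03-26", "2018-03-29", "2018-03-31"))
mpc_dates <- as.Date(c(
    "2018-01-03", "2018-01-05", "2018-01-08", "2018-01-10", "2018-01-13", "2018-01-15",
    "2018-01-18", "2018-01-20", "2018-01-23", "2018-01-25", "2018-01-28", "2018-01-30",
    "2018-02-02", "2018-02-04", "2018-02-07", "2018-02-09", "2018-02-12", "2018-02-14",
    "2018-02-17", "2018-02-19", "2018-02-22", "2018-02-24", "2018-02-27", "2018-03-01",
    "2018-03-04", "2018-03-06", "2018-03-14", "2018-03-16", "2018-03-19", "2018-03-21",
    "2018-03-24", "2018-03-26", "2018-03-29"
))

# Months covered by the requested period (start_date to end_date).
months <- c("2018-01", "2018-02", "2018-03")

# Hypothetical helper: number of acquisitions in each month.
count_by_month <- function(dates) {
    vapply(months, function(m) sum(format(dates, "%Y-%m") == m), integer(1))
}

# One row per catalog, one column per month.
coverage <- rbind(AWS = count_by_month(aws_dates), MPC = count_by_month(mpc_dates))
print(coverage)
```

The AWS row has no acquisitions in January or February, while the MPC row has images in all three months.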
After regularizing the AWS cube:
s2_regular_cube_aws <- sits_regularize(
cube = s2_cube_aws,
output_dir = "./regular_cube/aws/",
res = 10,
period = "P5D",
progress = TRUE
)
# Get cube timeline
sits_timeline(s2_regular_cube_aws)
#> [1] "2018-03-21" "2018-03-26" "2018-03-31"
MPC cube regularization:
s2_regular_cube_mpc <- sits_regularize(
cube = s2_cube_mpc,
output_dir = "./regular_cube/mpc/",
res = 10,
period = "P5D",
progress = TRUE
)
# Get cube timeline
sits_timeline(s2_regular_cube_mpc)
#> [1] "2018-01-03" "2018-01-08" "2018-01-13" "2018-01-18" "2018-01-23" "2018-01-28"
#> [7] "2018-02-02" "2018-02-07" "2018-02-12" "2018-02-17" "2018-02-22" "2018-02-27"
#> [13] "2018-03-04" "2018-03-09" "2018-03-14" "2018-03-19" "2018-03-24" "2018-03-29"
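Both regularized timelines above are consistent with a fixed 5-day grid that starts at the first acquisition of each cube. The base R sketch below reproduces the two grids from the first and last dates of the original timelines. `regular_grid()` is a hypothetical helper, and anchoring the grid at the first date is an assumption inferred from the printed output, not a statement about how sits_regularize() works internally.

```r
# Hypothetical helper: a fixed-step date grid between two dates.
regular_grid <- function(first_date, last_date, step_days = 5) {
    seq(as.Date(first_date), as.Date(last_date), by = paste(step_days, "days"))
}

# First and last dates taken from the non-regularized timelines above.
grid_aws <- regular_grid("2018-03-21", "2018-03-31")
grid_mpc <- regular_grid("2018-01-03", "2018-03-29")

print(grid_aws)
print(length(grid_mpc))
```

With `period = "P5D"`, the AWS cube can only ever produce three time steps for this query, whereas the MPC cube produces eighteen.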
The images retrieved for the AWS cube have high cloud cover and span a short time interval. The sits package interpolates cloudy pixels to avoid null values in the time series. However, because there are so few cloud-free observations (see the cloud cover table below), the interpolation produces repeated values within the time series.
# Get cloud cover for each date
unique(s2_cube_aws$file_info[[1]][, c("date", "cloud_cover")])
#> # A tibble: 4 × 2
#> date cloud_cover
#> <date> <dbl>
#> 1 2018-03-21 47.7
#> 2 2018-03-26 81.9
#> 3 2018-03-29 98.3
#> 4 2018-03-31 89.7
points$time_series[[1]]
#> # A tibble: 3 × 9
#> Index B02 B03 B04 B05 B06 B07 B08 B11
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2018-03-21 0.0529 0.0719 0.072 0.129 0.215 0.244 0.277 0.221
#> 2 2018-03-26 0.0529 0.0719 0.072 0.129 0.215 0.244 0.277 0.221
#> 3 2018-03-31 0.0529 0.0719 0.072 0.129 0.215 0.244 0.277 0.221
For a reason I have not pinned down yet, the deep learning models cannot handle this sequence of repeated values and fail with the reported error.
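A quick way to spot this situation before training is to check whether any band is constant over time. This is a minimal base R sketch using the values printed above; `constant_bands()` is a hypothetical helper, not part of sits, and it only flags the symptom rather than explaining the error.

```r
# Time series for one sample point, copied from the output above.
series_aws <- data.frame(
    Index = as.Date(c("2018-03-21", "2018-03-26", "2018-03-31")),
    B02 = rep(0.0529, 3),
    B03 = rep(0.0719, 3),
    B04 = rep(0.072, 3),
    B05 = rep(0.129, 3),
    B06 = rep(0.215, 3),
    B07 = rep(0.244, 3),
    B08 = rep(0.277, 3),
    B11 = rep(0.221, 3)
)

# Hypothetical helper: names of the bands whose values never change over time.
constant_bands <- function(series, index_col = "Index") {
    bands <- setdiff(names(series), index_col)
    is_constant <- vapply(
        series[bands],
        function(values) length(unique(values)) <= 1,
        logical(1)
    )
    bands[is_constant]
}

print(constant_bands(series_aws))
```

Here every band is flagged, which means the series carries no temporal information for the model to learn from.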
I recommend using the Microsoft Planetary Computer (MPC) catalog to obtain more images for regularization. I tested this approach in my environment using the same sits version (1.3), and everything worked as expected. With the MPC cube regularization, the timeline is more extensive and contains additional temporal information.
samples_csv_file <- read.csv(file = "./samplesOFS_gva_train_test.csv")
points_mpc <- sits_get_data(
cube = s2_regular_cube_mpc,
samples = samples_csv_file,
bands = c("B02", "B03", "B04", "B05", "B06", "B07", "B08", "B11"),
output_dir = "./samples/",
progress = TRUE
)
sits_timeline(points_mpc)
#> [1] "2018-01-03" "2018-01-08" "2018-01-13" "2018-01-18"
#> [5] "2018-01-23" "2018-01-28" "2018-02-02" "2018-02-07"
#> [9] "2018-02-12" "2018-02-17" "2018-02-22" "2018-02-27"
#> [13] "2018-03-04" "2018-03-09" "2018-03-14" "2018-03-19"
#> [17] "2018-03-24" "2018-03-29"
I hope this information is helpful in resolving the issue you reported.
Hi @OldLipe
Thank you for the quick reply.
I tried with MPC, but unfortunately I'm getting an error when calling sits_regularize.
From what I've read, MPC should by default provide a short-lived token but it looks like there's an issue with it.
> s2_cube <- sits_cube(
source = "MPC",
collection = "SENTINEL-2-L2A",
tiles = c("31TGM"),
bands = c("B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12", "CLOUD"),
start_date = as.Date("2018-01-01"),
end_date = as.Date("2018-03-31")
)
|======================================================================| 100%
> s2_regular_cube <- sits_regularize(
cube = s2_cube,
output_dir = here("regular_cube"),
res = 10,
period = "P5D",
progress = TRUE
)
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
|======================================================================| 100%
Tiles 31TGM (B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A) are missing or malformed and will be reprocessed.
Error: .gc_regularize: invalid mpc token. (!is.null(res_content) is not TRUE)
Hi @LorisDeBiasi,
Sorry for the delay in responding.
As you mentioned, MPC cubes rely on a short-lived access token, so I think it is more manageable to download the images first (with the sits_cube_copy() function) and regularize the local copy afterwards.
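The two steps could be wrapped roughly as follows. This is a sketch under assumptions: `copy_then_regularize()` is a hypothetical wrapper, and it assumes that sits_cube_copy() accepts `cube` and `output_dir` and returns a cube pointing at the local files, so please check `?sits_cube_copy` for your sits version. The sits_regularize() arguments are the same ones used earlier in this thread.

```r
# Hypothetical wrapper: copy a remote cube to disk, then regularize the local copy.
# The two sits functions are default arguments, so they are only looked up when
# the wrapper is called without replacements.
copy_then_regularize <- function(cube, copy_dir, regular_dir,
                                 copy_fn = sits::sits_cube_copy,
                                 regularize_fn = sits::sits_regularize) {
    # Make sure both output directories exist.
    dir.create(copy_dir, recursive = TRUE, showWarnings = FALSE)
    dir.create(regular_dir, recursive = TRUE, showWarnings = FALSE)

    # Step 1: download the images while the MPC token is still valid.
    local_cube <- copy_fn(cube = cube, output_dir = copy_dir)

    # Step 2: regularize from local files, which needs no remote token.
    regularize_fn(
        cube = local_cube,
        output_dir = regular_dir,
        res = 10,
        period = "P5D",
        progress = TRUE
    )
}
```

With the `s2_cube` object from your message, the call would look like `s2_regular_cube <- copy_then_regularize(s2_cube, "./local_cube/mpc/", "./regular_cube/mpc/")`, where both directory names are placeholders.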
Describe the bug
Since I updated sits to 1.3, sits_train crashes at the end of the 2nd epoch.
sits 1.3
sits 1.2
To Reproduce
The following code is the same for sits 1.3 and 1.2.
I've attached the file samplesOFS_gva_train_test.csv, which is a sample of the file I use.
Additional context
sessionInfo() result
1.3
1.2