e-sensing / sits

Satellite image time series in R
https://e-sensing.github.io/sitsbook/
GNU General Public License v2.0

Error in model training stage #399

Closed firmanhadi closed 3 years ago

firmanhadi commented 3 years ago

Dear all,

I have been trying to adapt the demo files to my own location and samples. The structure of my samples can be seen in the attachment.

I get an error when I try to train the model with mlp_model <- sits_train(samples_s2_3bands, ml_method = sits_mlp())

The error is shown below:

Error: sits_train: Samples have different timeline lengths Use.sits_tibble_prune or sits_fix_timeline (.sits_timeline_check(data) == TRUE is not TRUE)

I have looked through the documentation and the sits book but could not find anything that resolves the issue.

Any help would be very much appreciated.

Thank you.

Regards,

Firman.

[Screenshot from 2021-09-30 12-02-21]

gilbertocamara commented 3 years ago

Hi, @firmanhadi could you please send us your entire script?

firmanhadi commented 3 years ago

Hi @gilbertocamara,

Please find the script below. It is based on _classify_sentinel2rfor.R. I also tried adding an id column at the beginning of the samples, but it still failed at the step that creates mlp_model. Thanks.

Firman.

Sys.setenv("AWS_ACCESS_KEY_ID"     = "",
           "AWS_SECRET_ACCESS_KEY" = "",
           "AWS_DEFAULT_REGION"    = "ap-southeast-1",
           "AWS_ENDPOINT"          = "ec2-southeast-1.amazonaws.com")

library(sits)

s2_cube_2 <- sits_cube(source = "AWS",
                     name = "49MDM_49MEM_2021",
                     collection = "sentinel-s2-l2a",
                     tiles = c("49MDMLKP", "49MEM"),
                     bands = c( "B04", "B08", "B11", "SCL"),
                     roi = "roi.shp" ,
                     start_date = as.Date("2021-05-03"),
                     end_date = as.Date("2021-06-27"),
                     s2_resolution = 60
)

samples_S2_49MDM_2021 <- sits_get_data(s2_cube_2, file = "sampel3.csv", multicores=8)

samples_s2_3bands <- sits_select(samples_S2_49MDM_2021,
                                 bands = c("B04", "B08", "B11"))

mlp_model <- sits_train(samples_s2_3bands,
                        ml_method = sits_mlp()
)

s2_probs <- sits_classify(s2_cube,
                          mlp_model,
                          memsize = 6,
                          multicores = 4,
                          output_dir = tempdir()
)

plot(s2_probs)

s2_bayes <- sits_smooth(s2_probs, output_dir = tempdir())

s2_label <- sits_label_classification(s2_bayes, output_dir = tempdir())

plot(s2_label)

gilbertocamara commented 3 years ago

Dear @firmanhadi, the problem is that you are creating a single data cube from two Sentinel-2 tiles ("49MDMLKP" and "49MEM"), and these tiles have different timelines:

sits_timeline(s2_cube_2[1,])
 [1] "2021-05-03" "2021-05-06" "2021-05-08" "2021-05-11" "2021-05-13" "2021-05-16"
 [7] "2021-05-18" "2021-05-21" "2021-05-23" "2021-05-26" "2021-05-28" "2021-05-31"
[13] "2021-06-02" "2021-06-05" "2021-06-07" "2021-06-10" "2021-06-12" "2021-06-15"
[19] "2021-06-17" "2021-06-20" "2021-06-22" "2021-06-25" "2021-06-27"

sits_timeline(s2_cube_2[2,])
 [1] "2021-05-03" "2021-05-08" "2021-05-13" "2021-05-18" "2021-05-23" "2021-05-28"
 [7] "2021-06-02" "2021-06-07" "2021-06-12" "2021-06-17" "2021-06-22" "2021-06-27"
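
The same mismatch shows up in the samples extracted from such a cube, which is what the training error reports. Below is a minimal base R sketch of how one might count observations per sample. It uses mock data: the labels and band values are hypothetical, and the list-column name time_series is an assumption about the samples table layout. Only the dates follow the two timelines above.

```r
# Rebuild the two timelines listed above (23 and 12 dates).
dates_12 <- seq(as.Date("2021-05-03"), as.Date("2021-06-27"), by = "5 days")
dates_23 <- sort(c(
    dates_12,
    seq(as.Date("2021-05-06"), as.Date("2021-06-25"), by = "5 days")
))

# Helper (hypothetical): one time series per sample, with dummy band values.
make_series <- function(dates) {
    data.frame(Index = dates, B04 = seq_along(dates))
}

# Mock samples table: one row per sample plus a list-column of series.
# The labels are placeholders, not taken from the real sample file.
samples <- data.frame(label = c("ClassA", "ClassA", "ClassB"))
samples$time_series <- list(
    make_series(dates_23),  # sample located in the first tile
    make_series(dates_12),  # samples located in the second tile
    make_series(dates_12)
)

# Number of observations in each sample's time series.
n_obs <- vapply(samples$time_series, nrow, integer(1))
print(table(n_obs))

# Training needs every sample to have the same number of observations.
timelines_match <- length(unique(n_obs)) == 1
print(timelines_match)
```

If timelines_match is FALSE, as it is for this mock table, the samples come from sources with different timelines and the cube needs to be regularized before extracting them.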

The first tile ("49MDMLKP") has 23 temporal instances in the chosen period ("2021-05-03" to "2021-06-27"), but the second tile ("49MEM") has only 12. To use both tiles together for classification, you need to produce a regular cube with the "sits_regularize()" function. For example:

s2_regular_cube <- sits_regularize(
    cube = s2_cube_2,
    name = "49MDM_49MEM_2021_regular",
    period = "P15D",
    res = 60,
    agg_method = "bilinear",
    cloud_mask = TRUE,
    multicores = 4
)

Please see the documentation for "sits_regularize" or the corresponding section of the sits book: https://e-sensing.github.io/sitsbook/earth-observation-data-cubes.html#regularizing-data-cubes
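
To illustrate why a fixed period makes the two tiles compatible, here is a conceptual sketch in base R. It is not how sits_regularize() is implemented (that function aggregates the images themselves). It only shows that assigning each acquisition date to a 15-day interval yields the same timeline for both tiles. The helper period_start() is hypothetical and written for this example.

```r
# Timelines of the two tiles, rebuilt from the dates listed in this thread.
origin <- as.Date("2021-05-03")
tile_2 <- seq(origin, as.Date("2021-06-27"), by = "5 days")
tile_1 <- sort(c(
    tile_2,
    seq(as.Date("2021-05-06"), as.Date("2021-06-25"), by = "5 days")
))

# Hypothetical helper: map each date to the start of its fixed-length period,
# counting periods of `days` days from `origin`.
period_start <- function(dates, origin, days = 15) {
    offset <- as.numeric(dates - origin, units = "days")
    origin + (offset %/% days) * days
}

# One entry per period that contains at least one acquisition.
regular_1 <- unique(period_start(tile_1, origin))
regular_2 <- unique(period_start(tile_2, origin))

print(length(tile_1))   # irregular timeline of the first tile
print(length(tile_2))   # irregular timeline of the second tile
print(regular_1)
print(identical(regular_1, regular_2))
```

After binning, both tiles share the same four period start dates, so samples extracted from either tile would have time series of equal length.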

firmanhadi commented 3 years ago

Dear @gilbertocamara,

Thank you very much for the explanation. I now have a better understanding of sits_regularize.

Based on your input, I used the code below but ran into a new error. FYI, I have loaded gdalcubes and defined the roi variable.

s2_regular_cube <- sits_regularize(
    cube = s2_cube_2,
    name = "49MDM_49MEM_2021_regular",
    dir_images = tempdir(),
    roi = roi,
    period = "P15D",
    res = 60,
    agg_method = "bilinear",
    cloud_mask = TRUE
)

Creating database of images...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----| **|

Error in gdalcubes::cube_view(extent = list(left = roi$left, right = roi$right, : is.character(resampling) is not TRUE

firmanhadi commented 3 years ago

Dear @gilbertocamara,

That error was resolved after I updated the package to the latest version. However, a new error message appeared when I ran the call below:

s2_cube_regular <- sits_regularize(
    cube = s2_cube,
    name = "T49MDM_MEM_2021",
    output_dir = tempdir(),
    period = "P15D",
    res = 60,
    agg_method = "median",
    cloud_mask = TRUE
)

Error: sits_cube: invalid 'source' parameter (length should be == 1)

gilbertocamara commented 3 years ago

Dear @firmanhadi, there was a bug in the sits_regularize() function, which has now been corrected. Please download the new version (sits 0.14.1-2) from GitHub. Sorry for the inconvenience.

firmanhadi commented 3 years ago

Dear @gilbertocamara,

Thank you very much again for the help. I have finally produced the classification result.

Regards,

Firman.