mc_agg() doesn't work and I can't figure out why

MarijkeThoonen commented 7 months ago

Hi, It's my first issue so pardon my perhaps unconventional way of charing this issue. Thanx in advance for treating this issue. I have a data-set with TMS4-loggers and can't generate an output with mc_agg. Here the output of my script:

> ft <- read.table("./files_table_2023.csv", sep=";", header = T)
> lt <- read.csv2("./localities_table_2023.csv", sep=";", dec = ",")
> tms.m <- mc_read_data(files_table = ft,
+                       localities_table =lt,
+                       silent = T,
+                       clean = T) #constant time-step; dupes removed; missing values NA.
Error in `purrr::pmap()`:
ℹ In index: 1.
Caused by error:
! assignment of an object of class “integer” is not valid for @‘locality_id’ in an object of class “mc_LocalityMetadata”; is(value, "character") is not TRUE
Run `rlang::last_error()` to see where the error occurred.
> tms.m_filter <- mc_filter(tms.m, localities = "1", sensors = "TMS_T1")
> mc_save(tms.m_filter, "./afgeleide_data/example")
> info <- mc_info(tms.m_filter)
> head(info)
  locality_id serial_number sensor_id sensor_name          start_date            end_date
1           1      94253063    TMS_T1      TMS_T1 2023-03-16 13:30:00 2023-11-08 14:45:00
  step period min_value max_value count_values count_na
1  300   <NA>     6.125   28.6875        43759    24513
> test <- myClim::mc_agg(tms.m_filter,
+                fun = "min",
+                period = "day")
> info_2 <- mc_info(test)
> head(info_2)
  locality_id serial_number sensor_id sensor_name start_date   end_date  step period
1           1          <NA>    TMS_T1  TMS_T1_min 2023-03-16 2023-11-08 86400    day
  min_value max_value count_values count_na
1        NA        NA            0      238

Jules- commented 7 months ago

I can't provide a definitive diagnosis without access to your source files. It seems that the values in the locality_id column are entirely numerical, which might be causing it to be parsed as an integer type. Since mc_read_data function expects the locality_id to be a character vector, you can try explicitly converting the values to characters.

ft$locality_id <- as.character(ft$locality_id)
# or
lt$locality_id <- as.character(lt$locality_id)

manmatej commented 7 months ago

Next thing could be in NAs. From the output of mc_info() it seems you have a lot of missing values in your data. Step 300s is something I would be careful about with TOMST loggers. This step is not impossible, but definitely not default. It would be good to double check the NAs first.

Then in mc_agg() default parameter min_coverage = 1, that means if you aggregate to days and within the day you have one or more missing values you will get NA from aggregation. You can use lower min_coverage if this is suitable for your analysis e.g. min_coverage = 0.9 to allow 10 % missing values within each day.

You can upload here the example part of you TOMST file and we can investigate deeper.

MarijkeThoonen commented 7 months ago

Hi there, thaks for the tips. It works now. I provided step = 900 in the files table wich rules out the NA's. I allso noticed that the sensor name changes when you use mc_agg. i.e. tms.day <- mc_agg(tms, fun = "max", min_coverage = 0.9, period = "day") TMS_T1 becomes TMS_T1_max so you have to provide this when you plot p <- mc_plot_line(tms_day_max, filename = "lines.pdf", sensors = c("TMS_T3_max", "TMS_T2_max"), #watch out: sensor naam changes by using mc_agg )

martin-kopecky commented 7 months ago

Hi Marijke,

Good to see that it helps. Regarding the issue with the "renaming" - it was our deliberate choice to include the aggregation statistics into the variable name after the aggregation. We think that this way, the name of the aggregated variable is more informative (and it also serves as a warning for the user that she/he works with already aggregated variable).

MarijkeThoonen commented 7 months ago

Thanks. Maybe it's informative to write in this in the 'Discription' of mc_agg in the help?

martin-kopecky commented 7 months ago

Ok, good suggestion. We will do that. Thanks!

manmatej commented 6 months ago

Actually it is already quite clearly written in help of mc_agg

mc_agg returns new sensors on the localities putting aggregation function in its name (TMS_T1 -> TMS_T1_max),

ibot-geoecology / myClim

mc_agg() doesn't work and I can't figure out why #9