robwschlegel / heatwaveR

This GitHub repo contains all of the code for the heatwaveR package.
https://robwschlegel.github.io/heatwaveR/

Problem loading OISST_dat in parallel #40

Closed EnoLec closed 1 month ago

EnoLec commented 1 month ago

Hi,

I am running into an issue while trying to execute your code:

OISST_dat <- plyr::ldply(.data = OISST_files, .fun = OISST_load, .parallel = T, lon1 = 270, lon2 = 320, lat1 = 30, lat2 = 50)
Error in do.ply(i) : task 1 failed - "could not find function "%>%""
In addition: Warning messages:
1: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’
2: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

The error suggests a problem with the function "%>%", which should not be an issue here...

I also tried .parallel = F and got a different error, which suggests a separate problem:

Error in OISST_files$file_name : $ operator is invalid for atomic vectors

I'm getting a little bit confused there and don't really know what to try to fix this. Any idea would be of great help!

Many thanks, Enora
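For readers hitting the second error: R raises "$ operator is invalid for atomic vectors" whenever `$` is applied to a plain vector rather than a list or data frame. A minimal sketch, assuming (this is not confirmed in the thread) that `OISST_files` was a character vector of file paths at that point; `file_vec` and `file_df` are made-up names for illustration:

```r
# Hypothetical inputs: the same file names held two different ways.
file_vec <- c("a.nc", "b.nc")                  # atomic character vector
file_df  <- data.frame(file_name = file_vec)   # data frame with a file_name column

# `$` on an atomic vector is an error; capture the message instead of stopping.
vec_result <- tryCatch(file_vec$file_name, error = function(e) conditionMessage(e))

# `$` on a data frame returns the column as expected.
df_result <- file_df$file_name

print(vec_result)
print(df_result)
```

Checking `is.data.frame(OISST_files)` before the call would show which of the two cases applies.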

EnoLec commented 1 month ago

Actually, the warning messages below

Warning messages:
1: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’
2: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

also appeared when I ran:

base::system.time(plyr::l_ply(OISST_files$file_name, .fun = OISST_url_daily_dl, .parallel = T))

These warnings disappear with '.parallel = F', so they are caused by the parallel run but don't break the code.

It seems that the only thing that is holding me back is the function "%>%" even though the packages are loaded... Probably a rookie error, aha! Any thoughts?

Many thanks, Enora
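For reference, plyr hands `.parallel = TRUE` work to foreach, and on socket-cluster backends each worker is a fresh R session that has not attached the packages loaded in the main session. That is one possible explanation for "could not find function "%>%"" appearing only in the parallel run; it is not confirmed in this thread. A minimal sketch, assuming plyr, doParallel, and magrittr are installed; `pipe_total` is a hypothetical stand-in for `OISST_load`:

```r
library(plyr)
library(doParallel)
library(magrittr)

# Socket cluster: workers start as fresh R sessions without magrittr attached.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

# Hypothetical stand-in for OISST_load(): any function that uses %>% internally.
pipe_total <- function(x) data.frame(total = x %>% sum())

# .paropts is forwarded to foreach(); .packages asks each worker to load magrittr.
totals <- plyr::ldply(
  .data = list(a = 1:3, b = 4:6),
  .fun = pipe_total,
  .parallel = TRUE,
  .paropts = list(.packages = "magrittr")
)

parallel::stopCluster(cl)
print(totals)
```

On fork-based backends the workers inherit the attached packages, so the error may not reproduce there. The "... may be used in an incorrect context" warnings can still appear with this sketch.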

robwschlegel commented 1 month ago

Hello,

This sounds likely to be a version issue. Have you ensured that R, RStudio, and all of your packages are fully up-to-date?

Another possible issue is that the pipe operator %>% was replaced a while back with a new native pipe operator |>. If you go to 'Tools -> Global Options -> Code', you will find a tick box to enable the new native pipe, if it isn't already enabled. Then try replacing all of the old pipe operators in the code with the new one.

All the best,
-Robert
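As a small illustration of that swap: `%>%` is provided by the magrittr package (and re-exported by dplyr), while `|>` is part of base R syntax from R 4.1.0 onwards and needs no package to be attached. A minimal sketch with made-up values:

```r
# Made-up temperatures, for illustration only.
temps <- c(18.2, 19.5, 21.1)

# magrittr style, shown as a comment because it needs library(magrittr):
# temps %>% mean() %>% round(1)

# Native pipe equivalent (R >= 4.1.0); the right-hand side must be a call
# written with parentheses.
mean_temp <- temps |> mean() |> round(1)

print(mean_temp)
```

The native pipe has no `.` placeholder, so magrittr code that relies on `.` needs rewriting rather than a plain search-and-replace.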

EnoLec commented 1 month ago

Hello,

Thank you so much for your help so far. Updating R, RStudio and the packages didn't make a difference but enabling the new native pipe did seem to work (I would have never thought about that so thank you!). However, I'm also encountering another type of error, which is very surprising.

OISST_dat <- plyr::ldply(.data = OISST_files, .fun = OISST_load, .parallel = F, lon1 = 270, lon2 = 320, lat1 = 30, lat2 = 50)
Error in RNetCDF::open.nc(x) : NetCDF: Unknown file format

I checked, and I have the .nc files in the data/OISST folder, e.g. "oisst-avhrr-v02r01.20191201.nc". However, when I tried opening this file in GIS software, it couldn't be recognised (I'm used to handling .nc files in both R and QGIS, where they are always recognised and open straight away).

I then downloaded the same file, "oisst-avhrr-v02r01.20191201.nc", directly from https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/201912/ and that copy was recognised by QGIS. So something might have happened to the .nc file during the download in the code, but I can't figure out what.

This is the Traceback from the error if it can be of use:

  1. stop(meta$error)
  2. tidync.character(file_name)
  3. tidync::tidync(file_name)
  4. tidync::hyper_filter(tidync::tidync(file_name), lon = dplyr::between(lon,lon1, lon2), lat = dplyr::between(lat, lat1, lat2))
  5. tidync::hyper_tibble(tidync::hyper_filter(tidync::tidync(file_name), lon = dplyr::between(lon, lon1, lon2), lat = dplyr::between(lat, lat1, lat2)))
  6. dplyr::select(tidync::hyper_tibble(tidync::hyper_filter(tidync::tidync(file_name), lon = dplyr::between(lon, lon1, lon2), lat = dplyr::between(lat, lat1, lat2))), lon, lat, time, sst)
  7. dplyr::rename(dplyr::select(tidync::hyper_tibble(tidync::hyper_filter(tidync::tidync(file_name), lon = dplyr::between(lon, lon1, lon2), lat = dplyr::between(lat, lat1, lat2))), lon, lat, time, sst), t = time, temp = sst)
  8. dplyr::mutate(dplyr::rename(dplyr::select(tidync::hyper_tibble(tidync::hyper_filter(tidync::tidync(file_name), lon = dplyr::between(lon, lon1, lon2), lat = dplyr::between(lat, lat1, lat2))), lon, lat, time, sst), t = time, temp = sst), t = as.Date(t, origin = "1978-01-01"))
  9. FUN(X[[i]], ...)
  10. lapply(pieces, .fun, ...)
  11. structure(lapply(pieces, .fun, ...), dim = dim(pieces))
  12. llply(.data = .data, .fun = .fun, ..., .progress = .progress, .inform = .inform, .parallel = .parallel, .paropts = .paropts)
  13. plyr::ldply(.data = OISST_files, .fun = OISST_load, .parallel = F, lon1 = 270, lon2 = 320, lat1 = 30, lat2 = 50)

Any help on that would be very much appreciated.

Many thanks, Enora

robwschlegel commented 1 month ago

Hello,

It appears that the default functionality of the tidync package has been changed enough to break backward compatibility. That's always a bummer to have to deal with.

I've updated the code in the vignette and pushed the new example to the website: https://robwschlegel.github.io/heatwaveR/articles/OISST_preparation.html#load-data

This should now run on your computer.

All the best,
-Robert

EnoLec commented 1 month ago

Hello,

Thanks for that, I'm still experiencing the same issue though. I think it comes from this section: https://robwschlegel.github.io/heatwaveR/articles/OISST_preparation.html#download-data

The downloaded .nc files are not recognised by R (or any other software that handles .nc files) when I try to load them. It's as if they are corrupted. The files are also slightly larger when downloaded with your code (i.e. 1,633 KB) than when downloaded from https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/201912/ (i.e. 1,627 KB for the same file). It's not a big difference, but I was wondering whether it could give a clue about what is happening.

Best, Enora
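One way to narrow this down is to look at the first bytes of a suspect file. NetCDF classic files begin with the ASCII bytes "CDF", and NetCDF-4 files are HDF5 containers that begin with an 8-byte HDF5 signature. The size difference is also worth noting: on Windows, a text-mode download rewrites every 0x0A byte as 0x0D 0x0A, and in compressed data roughly 1 byte in 256 is 0x0A, which for a file of about 1,627 KB would add about 6 KB. That matches the numbers above, but it is only a plausible cause, not one confirmed in this thread. A minimal sketch using base R only; `nc_signature` and `simulate_text_mode` are hypothetical helpers and the test files are synthetic:

```r
# 8-byte signature that every HDF5 (and therefore NetCDF-4) file starts with.
hdf5_magic <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))

# Classify a file by its leading bytes.
nc_signature <- function(path) {
  header <- readBin(path, what = "raw", n = 8)
  if (length(header) >= 8 && identical(header, hdf5_magic)) {
    "netcdf4/hdf5"
  } else if (length(header) >= 3 && identical(header[1:3], charToRaw("CDF"))) {
    "netcdf classic"
  } else {
    "unrecognised"
  }
}

# Mimic a Windows text-mode write: put 0x0D in front of every 0x0A.
simulate_text_mode <- function(bytes) {
  ints <- as.integer(bytes)
  as.raw(unlist(lapply(ints, function(b) if (b == 10L) c(13L, 10L) else b)))
}

# Synthetic "file": the HDF5 signature followed by a few arbitrary bytes.
payload <- c(hdf5_magic, as.raw(c(1, 10, 2, 10, 3)))

good <- tempfile(fileext = ".nc")
bad  <- tempfile(fileext = ".nc")
writeBin(payload, good)
writeBin(simulate_text_mode(payload), bad)

print(nc_signature(good))
print(nc_signature(bad))
print(file.size(bad) - file.size(good))  # extra bytes added by the simulated damage
```

If a downloaded file comes back as unrecognised, passing `mode = "wb"` to `download.file()` forces a binary transfer. Whether the vignette's download function already does this is not shown in this thread.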

robwschlegel commented 1 month ago

Hello Enora,

The NOAA servers are often undergoing some sort of maintenance or temporary outages. I just checked the OISST URL now and it is currently down. Whether or not the NOAA netCDF files become corrupted is far outside of the scope of the code provided in the vignette we've been referring to.

My advice is to wait another day or two and then try again. Delete some of the OISST files you've downloaded and re-download them. Or download them manually and see if that makes a difference.

All the best,
-Robert