ropensci / tidync

NetCDF exploration and data extraction
https://docs.ropensci.org/tidync

tidync not opening .nc file (yet ncdf4 is) #98

Closed everydayduffy closed 2 years ago

everydayduffy commented 5 years ago
Session Info

```r
- Session info -------------------------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.5.2 (2018-12-20)
 os       Windows 10 x64
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United Kingdom.1252
 ctype    English_United Kingdom.1252
 tz       Europe/London
 date     2019-10-21

- Packages -----------------------------------------------------------------------------------------------------------------------------
 package     * version date       lib source
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.5.3)
 backports     1.1.5   2019-10-02 [1] CRAN (R 3.5.3)
 callr         3.3.2   2019-09-22 [1] CRAN (R 3.5.3)
 cli           1.1.0   2019-03-19 [1] CRAN (R 3.5.3)
 crayon        1.3.4   2017-09-16 [1] CRAN (R 3.5.2)
 desc          1.2.0   2018-05-01 [1] CRAN (R 3.5.2)
 devtools      2.2.1   2019-09-24 [1] CRAN (R 3.5.3)
 digest        0.6.21  2019-09-20 [1] CRAN (R 3.5.3)
 dplyr         0.8.3   2019-07-04 [1] CRAN (R 3.5.3)
 ellipsis      0.3.0   2019-09-20 [1] CRAN (R 3.5.3)
 forcats       0.4.0   2019-02-17 [1] CRAN (R 3.5.2)
 fs            1.3.1   2019-05-06 [1] CRAN (R 3.5.3)
 glue          1.3.1   2019-03-12 [1] CRAN (R 3.5.3)
 magrittr      1.5     2014-11-22 [1] CRAN (R 3.5.2)
 memoise       1.1.0   2017-04-21 [1] CRAN (R 3.5.2)
 ncdf4         1.16.1  2019-03-11 [1] CRAN (R 3.5.3)
 ncmeta        0.1.0   2019-08-28 [1] CRAN (R 3.5.3)
 pillar        1.4.2   2019-06-29 [1] CRAN (R 3.5.3)
 pkgbuild      1.0.5   2019-08-26 [1] CRAN (R 3.5.3)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 3.5.3)
 pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.5.2)
 prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.5.2)
 processx      3.4.1   2019-07-18 [1] CRAN (R 3.5.2)
 ps            1.3.0   2018-12-21 [1] CRAN (R 3.5.2)
 purrr         0.3.2   2019-03-15 [1] CRAN (R 3.5.3)
 R6            2.4.0   2019-02-14 [1] CRAN (R 3.5.2)
 Rcpp          1.0.2   2019-07-25 [1] CRAN (R 3.5.3)
 remotes       2.1.0   2019-06-24 [1] CRAN (R 3.5.3)
 rlang         0.4.0   2019-06-25 [1] CRAN (R 3.5.3)
 RNetCDF       2.1-1   2019-10-20 [1] CRAN (R 3.5.3)
 rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.5.2)
 rstudioapi    0.10    2019-03-19 [1] CRAN (R 3.5.3)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.5.2)
 testthat      2.2.1   2019-07-25 [1] CRAN (R 3.5.3)
 tibble        2.1.3   2019-06-06 [1] CRAN (R 3.5.3)
 tidync      * 0.2.1   2019-05-23 [1] CRAN (R 3.5.3)
 tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.5.2)
 usethis       1.5.1   2019-07-04 [1] CRAN (R 3.5.3)
 withr         2.1.2   2018-03-15 [1] CRAN (R 3.5.2)

[1] C:/R/R352/library
```

Hello,

It's hard to produce a reprex with this issue - as I think it might be a filesize limit... I have a ~16 GB .nc file, that opens fine with ncdf4::nc_open:

ncdf4::nc_open("C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc")

File C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc (NC_FORMAT_64BIT):

     11 variables (excluding dimension variables):
        short t2m[longitude,latitude,time]   
            scale_factor: 0.0018459870330567
            add_offset: 259.699722453261
            _FillValue: -32767
            missing_value: -32767
            units: K
            long_name: 2 metre temperature
        short d2m[longitude,latitude,time]   
            scale_factor: 0.00165447156930955
            add_offset: 249.775875645075
            _FillValue: -32767
            missing_value: -32767
            units: K
            long_name: 2 metre dewpoint temperature
        short sp[longitude,latitude,time]   
            scale_factor: 0.858999175510811
            add_offset: 76591.6232347872
            _FillValue: -32767
            missing_value: -32767
            units: Pa
            long_name: Surface pressure
            standard_name: surface_air_pressure
        short u10[longitude,latitude,time]   
            scale_factor: 0.00137045268015017
            add_offset: 11.5694746857693
            _FillValue: -32767
            missing_value: -32767
            units: m s**-1
            long_name: 10 metre U wind component
        short v10[longitude,latitude,time]   
            scale_factor: 0.000908306702124907
            add_offset: -0.164554800323719
            _FillValue: -32767
            missing_value: -32767
            units: m s**-1
            long_name: 10 metre V wind component
        short tp[longitude,latitude,time]   
            scale_factor: 1.03291312208937e-06
            add_offset: 0.0338444313583804
            _FillValue: -32767
            missing_value: -32767
            units: m
            long_name: Total precipitation
        short tcc[longitude,latitude,time]   
            scale_factor: 1.52594875864068e-05
            add_offset: 0.499992370256207
            _FillValue: -32767
            missing_value: -32767
            units: (0 - 1)
            long_name: Total cloud cover
            standard_name: cloud_area_fraction
        short msnlwrf[longitude,latitude,time]   
            scale_factor: 0.00657968167635962
            add_offset: -152.479241989276
            _FillValue: -32767
            missing_value: -32767
            units: W m**-2
            long_name: Mean surface net long-wave radiation flux
        short msdwlwrf[longitude,latitude,time]   
            scale_factor: 0.00681008047338149
            add_offset: 276.094037586716
            _FillValue: -32767
            missing_value: -32767
            units: W m**-2
            long_name: Mean surface downward long-wave radiation flux
        short fdir[longitude,latitude,time]   
            scale_factor: 58.4499412509728
            add_offset: 1915170.77502937
            _FillValue: -32767
            missing_value: -32767
            units: J m**-2
            long_name: Total sky direct solar radiation at surface
        short ssrd[longitude,latitude,time]   
            scale_factor: 65.309646541308
            add_offset: 2139934.81178096
            _FillValue: -32767
            missing_value: -32767
            units: J m**-2
            long_name: Surface solar radiation downwards
            standard_name: surface_downwelling_shortwave_flux_in_air

     3 dimensions:
        longitude  Size:1440
            units: degrees_east
            long_name: longitude
        latitude  Size:721
            units: degrees_north
            long_name: latitude
        time  Size:720
            units: hours since 1900-01-01 00:00:00.0
            long_name: time
            calendar: gregorian

    2 global attributes:
        Conventions: CF-1.6
        history: 2019-10-19 22:19:09 GMT by grib_to_netcdf-2.14.0: /opt/ecmwf/eccodes/bin/grib_to_netcdf -o /cache/data5/adaptor.mars.internal-1571522811.8826406-23089-23-b6cb5e5c-0526-4387-95b3-2c54be9721a0.nc /cache/tmp/b6cb5e5c-0526-4387-95b3-2c54be9721a0-adaptor.mars.internal-1571522811.8841808-23089-8-tmp.grib

But when I use tidync::tidync(), I get the following error:

tidync::tidync("C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc")

Error in nc_meta.character(...) : 
  failed to open 'x', value given was: "C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc"
In addition: Warning message:
In tidync.character("C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc") :
  Oops, connection to source failed.
 C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc

I have much smaller .nc files from the same source (it's ERA5 climate data from the Climate Data Store) that open successfully with both packages. Is there something about the rather large size of the .nc file that tidync isn't happy with?
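As a rough sanity check on the reported size, the dimensions and variable types printed by ncdf4::nc_open above can be multiplied out. This is only a back-of-envelope sketch: it assumes every data variable is stored as a packed 2-byte short (as the listing shows) and ignores the header and the coordinate variables.

```r
# Dimensions and variable count as printed by ncdf4::nc_open above.
dims   <- c(longitude = 1440, latitude = 721, time = 720)
n_vars <- 11
bytes_per_value <- 2  # a NetCDF "short" is a 2-byte integer

# Plain numeric (double) arithmetic is used on purpose: the total byte count
# is larger than R's 32-bit integer range.
values_per_var <- prod(dims)
bytes_per_var  <- values_per_var * bytes_per_value
total_bytes    <- bytes_per_var * n_vars

cat("values per variable:", format(values_per_var, big.mark = ","), "\n")
cat("GB per variable:    ", round(bytes_per_var / 1e9, 2), "\n")
cat("total GB (approx.): ", round(total_bytes / 1e9, 2), "\n")
```

That gives 747,532,800 values and about 1.5 GB per variable, and roughly 16.4 GB in total, which is consistent with the ~16 GB quoted above.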

mdsumner commented 5 years ago

There shouldn't be any file size limitations - I don't know. Can you try opening with RNetCDF? It's been updated recently on CRAN, and that might be enough. It's now at version 2.1-1

nc <-  RNetCDF::open.nc("C:/Dropbox/Sandbox/era5_globe_april_1990_1990.nc")
RNetCDF::print.nc(nc)

If that also doesn't work, can you report your version of RNetCDF, and try updating it?

If you can tell me where to download the file I'll explore myself. Thanks ;)

mdsumner commented 5 years ago

Oh sorry, I see your session info - you are at latest RNetCDF.

If there's any way I can get access to your file I'd pursue it.

everydayduffy commented 5 years ago

Thanks for offering to look at it. Have sent you a Dropbox download link.

mdsumner commented 5 years ago

Cool thanks for sharing the file, I don't find any problems (either CRAN or github tidync).

f <- "/mnt/mdsumner/tidync_98/era5_globe_april_1990_1990.nc"
x <- tidync::tidync(f) 
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8    LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2        tidyr_1.0.0       zeallot_0.1.0     packrat_0.5.0     crayon_1.3.4      dplyr_0.8.3      
 [7] assertthat_0.2.1  R6_2.4.0          lifecycle_0.1.0   backports_1.1.5   magrittr_1.5      ncdf4_1.16.1     
[13] pillar_1.4.2      rlang_0.4.0       ncmeta_0.1.0      rstudioapi_0.10   vctrs_0.2.0       forcats_0.4.0    
[19] tools_3.6.1       glue_1.3.1        purrr_0.3.3       RNetCDF_2.1-1     parallel_3.6.1    compiler_3.6.1   
[25] pkgconfig_2.0.3   tidyselect_0.2.5  tidync_0.2.1.9002 tibble_2.1.3     

But, as a long shot - maybe the Dropbox client is causing problems for you? Maybe try moving the file out of Dropbox itself and reading it then. Other than that I don't have any suggestions, but I will try on Windows when I can.

mdsumner commented 5 years ago

Another possibility, try with the Github version of ncmeta:

devtools::install_github("hypertidy/ncmeta")

The file connection is wrapped in a safely() call, and I realize that might be masking useful error messages so I'll try to unpack that bit in future.
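To illustrate how a safely()-style wrapper can hide the real message, here is a minimal base-R sketch. It is not the tidync or ncmeta code: `fake_open`, `safely_open`, `open_masked` and `open_verbose` are hypothetical names, and `safely_open` only imitates the general idea of purrr::safely() (capture the error and return it alongside the result instead of throwing).

```r
# Hypothetical stand-in for a low-level open call that fails with a useful message.
fake_open <- function(path) {
  stop("NetCDF: Numeric conversion not representable")
}

# Imitation of the safely() idea in base R: never throw, return both slots.
safely_open <- function(path) {
  tryCatch(
    list(result = fake_open(path), error = NULL),
    error = function(e) list(result = NULL, error = e)
  )
}

# Masking version: the captured error is dropped and a generic one is raised.
open_masked <- function(path) {
  res <- safely_open(path)
  if (is.null(res$result)) {
    stop(sprintf("failed to open 'x', value given was: \"%s\"", path), call. = FALSE)
  }
  res$result
}

# Transparent version: the original message is appended to the generic one.
open_verbose <- function(path) {
  res <- safely_open(path)
  if (is.null(res$result)) {
    stop(sprintf("failed to open \"%s\": %s", path, conditionMessage(res$error)),
         call. = FALSE)
  }
  res$result
}

msg_masked  <- tryCatch(open_masked("big.nc"),  error = conditionMessage)
msg_verbose <- tryCatch(open_verbose("big.nc"), error = conditionMessage)
print(msg_masked)
print(msg_verbose)
```

The only difference between the two wrappers is whether the captured condition's message is carried into the error that finally reaches the user.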

everydayduffy commented 5 years ago

Thanks for taking a look. Moving outside of Dropbox didn't work, and I have just tried with the Github version of ncmeta, but still no joy.

mdsumner commented 5 years ago

Thanks for the follow up, I'll try on Windows - I actually have to push tidync out to CRAN because of failing checks, so I need to fix this quickly if possible

everydayduffy commented 5 years ago

Ok - well I'm happy to help if I can. Would you like me to send you a file obtained from the same source (but with smaller dimensions) that works fine for me?

mdsumner commented 5 years ago

Confirmed, it's a problem in RNetCDF on Windows (x64):

> f
[1] "C:/mds/era5_globe_april_1990_1990.nc"
> library(RNetCDF)
> nc <- open.nc(f)
Error in open.nc(f) : NetCDF: Numeric conversion not representable
>

but, tidync is wrapping it in a way that means we don't see the real message.
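While the wrapper hides the message, one way to see what each backend actually reports is to call the openers directly and collect their errors. The sketch below is only an illustration: `probe_openers` and `default_openers` are hypothetical helpers, and the real calls (RNetCDF::open.nc and ncdf4::nc_open, both used elsewhere in this thread) are added only if those packages are installed.

```r
# Try each opener on `path` and collect either "ok" or the raw error message.
# `probe_openers` is a hypothetical helper, not part of tidync.
probe_openers <- function(path, openers) {
  vapply(openers, function(open_fun) {
    tryCatch({
      open_fun(path)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Build the list of real backends that happen to be installed.
default_openers <- function() {
  openers <- list()
  if (requireNamespace("RNetCDF", quietly = TRUE)) {
    openers$RNetCDF <- function(p) {
      nc <- RNetCDF::open.nc(p)
      RNetCDF::close.nc(nc)
    }
  }
  if (requireNamespace("ncdf4", quietly = TRUE)) {
    openers$ncdf4 <- function(p) {
      nc <- ncdf4::nc_open(p)
      ncdf4::nc_close(nc)
    }
  }
  openers
}

# Usage (the path is a placeholder for the file being debugged):
# probe_openers("C:/mds/era5_globe_april_1990_1990.nc", default_openers())
```

The result is a named character vector, one entry per backend, so differences between RNetCDF and ncdf4 on the same file are visible side by side.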

What I'm unclear on is if a new from-source build of RNetCDF would fix it. The rwinlib was recently updated, but I'm not sure if the CRAN version was also updated.

Also, I don't know what exactly causes the problem. @mjwoods have you seen this?

mjwoods commented 5 years ago

Hi @mdsumner , I'll have access to a Windows box later this week, so I could test the file myself then. Do you mind sending me a Dropbox link, @everydayduffy ?

Also, have either of you tried opening the file on a Linux machine? That would help me to determine if the problem is caused by RNetCDF itself or the Windows libraries it uses.

mdsumner commented 5 years ago

Definitely works fine on Linux, I can follow up with system details 👍

everydayduffy commented 5 years ago

Dropbox link sent to @mjwoods. Thank you both for looking into this.

everydayduffy commented 5 years ago

Hi @mjwoods ,

Did you get my .nc file ok? Wondered if you had time to check on a Windows machine?

mjwoods commented 5 years ago

Hi @everydayduffy , I did manage to download your file, thanks. I tested it on Windows with several different builds of netcdf. Some worked, but not the versions we need for R packages. I think that means the problem lies in the netcdf library, and not in RNetCDF itself. I'll continue to investigate until I run out of ideas. It's a spare-time project for me, so it may take a while to sort out.

everydayduffy commented 5 years ago

Hi @mjwoods. Thanks for the update and for spending some time investigating. Much appreciated!

everydayduffy commented 4 years ago

Hi @mjwoods. I had an idea today which worked... I read the troublesome .nc file into python with the xarray package, and wrote it out again. The new file can now be read by tidync in R. It's a workaround which allows me to work with the data on Windows, but doesn't solve the RNetCDF problem! Thought I would let you know in case it's of interest.

mjwoods commented 4 years ago

Hi @everydayduffy , which version of python are you using? Is it on Windows? If so, maybe we can learn something from the way they build their netcdf library.

everydayduffy commented 4 years ago

Hi @mjwoods, am using the following: Python 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]

On windows 10 64bit and installed with Anaconda.

I have been converting the .nc files in R with the reticulate package, using the following code:

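nc_file <- "old_nc.nc"    # file that fails to open on Windows
nc_file_c <- "new_nc.nc"  # rewritten copy

# conda environment with numpy and xarray installed.
reticulate::use_condaenv("C:/anaconda/envs/R_env")
xr <- reticulate::import("xarray")

# read with xarray, write straight back out, then release the file handle
DS <- xr$open_dataset(nc_file)
DS$to_netcdf(nc_file_c)
DS$close()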
mjwoods commented 4 years ago

I looked at the build recipes for netcdf4 in anaconda, and one thing I notice is that they use the Microsoft compilers. We have to use the build system used by R, which is based on gcc. This may have something to do with our problems. I don't think gcc itself is at fault, but maybe the netcdf source code relies on special features of the Microsoft compilers on Windows. I'm just guessing for now.

mjwoods commented 4 years ago

I tested the latest Windows builds of RNetCDF (2.1-1) and ncdf4 (1.17), and both have the same problem with the file from @everydayduffy .

> RNetCDF::open.nc("era5_globe_april_1990_1990.nc")
Error in RNetCDF::open.nc("era5_globe_april_1990_1990.nc") : 
  NetCDF: Numeric conversion not representable
> ncdf4::nc_open("era5_globe_april_1990_1990.nc")
Error in R_nc4_open: NetCDF: Numeric conversion not representable
Error in ncdf4::nc_open("era5_globe_april_1990_1990.nc") : 
  Error in nc_open trying to open file era5_globe_april_1990_1990.nc

ncdf4-1.17 uses prebuilt libraries from https://github.com/rwinlib/netcdf … and so does RNetCDF-2.1-1. Interestingly, the previous ncdf4-1.16 used older netcdf libraries from http://win-builder.r-project.org, so I tried building RNetCDF the same way. The file "era5_globe_april_1990_1990.nc" was opened successfully!

Unfortunately, the older libraries from winbuilder do not support OpenDAP, so switching back to them would reduce the functionality available for other users.

@everydayduffy - until we can solve this problem properly, I could build a special version of RNetCDF for you using the winbuilder libraries. Please let me know if you would like this, because winbuilder only keeps builds for 3 days.

everydayduffy commented 4 years ago

Hi @mjwoods,

Thanks again for further investigating this. Building a bespoke version would be very much appreciated. Myself and a few colleagues would be very grateful. Would installation of this version be relatively straightforward?

mjwoods commented 4 years ago

Hi @everydayduffy , the new package is ready at https://win-builder.r-project.org/Q9lZgg3e7K0z/ . The zip file contains the binary files for Windows. The package will be removed automatically within 72 hours, so please download ASAP.

Installation should be possible from within R (on Windows) using the command install.packages("RNetCDF_2.2-1.zip"), adding your download directory if necessary.

Good luck!

everydayduffy commented 4 years ago

Great - thank you so much @mjwoods. I have it installed, and now RNetCDF::open.nc isn't giving the Numeric conversion not representable error (which is a good sign). However, functions from tidync are still throwing this error. Is tidync somehow using the "broken" version of RNetCDF? Can I get it to call upon the new one you have made me?

mjwoods commented 4 years ago

That's strange - tidync worked for me. I am using R-3.6.2 (64 bit) on Windows 10 with the latest tidync (and dependencies), plus RNetCDF_2.2-1 from win-builder. I tried opening the 8.4GB file you sent me earlier on Dropbox:

tidync::tidync("D:/milto/Documents/era5_globe_april_1990_1990.nc")

Data Source (1): era5_globe_april_1990_1990.nc ...

Grids (4) <dimension family> : <associated variables> 

[1]   D0,D1,D2 : t2m, d2m, sp, u10, v10, tp, tcc, msnlwrf, msdwlwrf, fdir, ssrd    **ACTIVE GRID** ( 747532800  values per variable)
[2]   D0       : longitude
[3]   D1       : latitude
[4]   D2       : time

Dimensions 3 (all active): 

  dim   name     length    min     max start count   dmin   dmax unlim coord_dim 
  <chr> <chr>     <dbl>  <dbl>   <dbl> <int> <int>  <dbl>  <dbl> <lgl> <lgl>     
1 D0    longitu~   1440   -180    180.     1  1440   -180 1.80e2 FALSE TRUE      
2 D1    latitude    721    -90     90      1   721    -90 9.00e1 FALSE TRUE      
3 D2    time        720 791088 791807      1   720 791088 7.92e5 FALSE TRUE    

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       tidyr_1.0.0      fansi_0.4.0      utf8_1.1.4      
 [5] zeallot_0.1.0    crayon_1.3.4     dplyr_0.8.3      assertthat_0.2.1
 [9] R6_2.4.1         lifecycle_0.1.0  backports_1.1.5  magrittr_1.5    
[13] ncdf4_1.17       pillar_1.4.3     cli_2.0.0        rlang_0.4.2     
[17] ncmeta_0.2.0     vctrs_0.2.1      tools_3.6.2      forcats_0.4.0   
[21] glue_1.3.1       purrr_0.3.3      RNetCDF_2.2-1    compiler_3.6.2  
[25] pkgconfig_2.0.3  tidyselect_0.2.5 tidync_0.2.3     tibble_2.1.3    

I am not familiar with tidync, so please let me know if there are any other tests I should try.

Perhaps it would be a good idea to use update.packages to ensure that the latest packages are being used. You may need to reinstall RNetCDF_2.2-1.zip as well.
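To confirm which copies of the packages R will actually load, it can help to print the library search path together with the installed version and location of each package involved. This is a small base-R sketch; `pkg_locations` is a hypothetical helper name, and the package list is just the set mentioned in this thread.

```r
# Report version and install location for each package, or NA if not installed.
# `pkg_locations` is a hypothetical helper name.
pkg_locations <- function(pkgs) {
  rows <- lapply(pkgs, function(p) {
    path <- suppressWarnings(find.package(p, quiet = TRUE))
    if (length(path) == 0) {
      data.frame(package = p, version = NA_character_, path = NA_character_,
                 stringsAsFactors = FALSE)
    } else {
      data.frame(package = p,
                 version = as.character(utils::packageVersion(p)),
                 path = path[1],
                 stringsAsFactors = FALSE)
    }
  })
  do.call(rbind, rows)
}

print(.libPaths())
print(pkg_locations(c("tidync", "ncmeta", "RNetCDF", "ncdf4")))
```

If the version shown for RNetCDF is not the custom build, or the path points at a different library than expected, reinstalling into the first entry of `.libPaths()` and restarting R is the usual next step.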

everydayduffy commented 4 years ago

Ah ok, now I've dissected the workflow a bit more, I can see the problem is with hyper_tibble. tidync::tidync() on its own works fine now... (the file I'm using is of similar size and contents to the one I provided you).

library(tidync)

f <- "C:/PROCESSING/raw_data/era5_surface_global_2008_02.nc"

# works
tidync(f)

# breaks on hyper_tibble
b <- tidync(f) %>%
  tidync::hyper_filter(longitude = longitude == 5,
                       latitude = latitude == 5) %>%
  hyper_tibble()

Here's my session info FYI:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tidync_0.2.3  RNetCDF_2.2-1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       rstudioapi_0.10  magrittr_1.5     tidyselect_0.2.5 R6_2.4.1         rlang_0.4.2      fansi_0.4.0      dplyr_0.8.3      tools_3.6.1     
[10] utf8_1.1.4       ncmeta_0.2.0     cli_2.0.0        assertthat_0.2.1 tibble_2.1.3     lifecycle_0.1.0  crayon_1.3.4     purrr_0.3.3      tidyr_1.0.0     
[19] vctrs_0.2.1      ncdf4_1.17       zeallot_0.1.0    glue_1.3.1       compiler_3.6.1   pillar_1.4.3     forcats_0.4.0    backports_1.1.5  pkgconfig_2.0.3 
mdsumner commented 4 years ago

What's the error?

everydayduffy commented 4 years ago
Error in R_nc4_open: NetCDF: Numeric conversion not representable
Error in ncdf4::nc_open(x$source$source[1]) : 
  Error in nc_open trying to open file C:/PROCESSING/raw_data/era5_surface_global_2008_02.nc
mjwoods commented 4 years ago

The latest version of ncdf4 uses the same netcdf libraries as RNetCDF on Windows. Unfortunately I can’t find any older ncdf4 versions on CRAN that are compatible with R-3.6. As a workaround, I have rebuilt ncdf4 with the old CRAN netcdf library. The package is available at https://win-builder.r-project.org/E4P6xR8nZFu6/ . This may solve your problem, but I haven’t been able to test it yet.

mjwoods commented 4 years ago

Hi @everydayduffy , I just tested the custom ncdf4 build with hyper_tibble, and it seems to work:

> library(tidync)
> f <- "era5_globe_april_1990_1990.nc"
> b <- tidync(f) %>%
+   tidync::hyper_filter(longitude = longitude == 5,
+                        latitude = latitude == 5) %>%
+   hyper_tibble()
> b
# A tibble: 720 x 14
     t2m   d2m     sp   u10   v10      tp   tcc msnlwrf msdwlwrf   fdir   ssrd
   <dbl> <dbl>  <dbl> <dbl> <dbl>   <dbl> <dbl>   <dbl>    <dbl>  <dbl>  <dbl>
 1  301.  297. 1.01e5  2.49  3.53 0.      1       -152.     276. 1.92e6 2.14e6
 2  301.  297. 1.01e5  2.75  4.05 1.03e-6 1       -152.     276. 1.92e6 2.14e6
 3  301.  297. 1.01e5  2.95  4.60 2.07e-6 1       -152.     276. 1.92e6 2.14e6
 4  301.  297. 1.01e5  3.25  4.57 8.26e-6 1       -152.     276. 1.92e6 2.14e6
 5  301.  297. 1.01e5  3.34  3.84 2.48e-5 1       -152.     276. 1.92e6 2.14e6
 6  301.  297. 1.01e5  3.35  3.60 2.58e-5 1       -152.     276. 1.92e6 2.14e6
 7  301.  297. 1.01e5  3.62  4.18 1.96e-5 0.995   -152.     276. 1.92e6 2.14e6
 8  301.  297. 1.01e5  3.98  4.68 1.03e-5 0.917   -152.     276. 1.92e6 2.14e6
 9  301.  297. 1.01e5  3.63  4.63 1.03e-5 0.835   -152.     276. 1.92e6 2.14e6
10  302.  298. 1.01e5  2.80  4.36 8.26e-6 0.882   -152.     276. 1.92e6 2.14e6
# ... with 710 more rows, and 3 more variables: longitude <dbl>,
#   latitude <dbl>, time <dbl>

The ncdf4_1.17-2.zip package will be automatically removed within a few days, but I can have it rebuilt if needed.

everydayduffy commented 4 years ago

Thanks @mjwoods. I have downloaded the modified ncdf4 package. Have got a lot of scripts running at the moment, so I'll have to wait a while before I can test, but pleased to see that hyper_tibble() was working for you. I really appreciate the time you've spent helping me - thank you. I don't want to take much more of your time, so looking forward, what do you think the longer term solution (if any) to this issue is?

mjwoods commented 4 years ago

Hi @everydayduffy , I'm pleased to see RNetCDF being used for serious research, and it's the least I can do to make sure it works!

In the longer term, the new msys2-based toolchain being used for R should provide up-to-date versions of the netcdf library, which may eventually fix the bugs that we are seeing. In fact, I recently built RNetCDF and ncdf4 with the new toolchain and the included netcdf library, and the above tests with hyper_tibble actually worked! The problem is that opendap did not work at all (even using the version of the netcdf library with opendap enabled). I'll try contacting the netcdf developers to ask for advice.

am2222 commented 4 years ago

@mjwoods Hi, I have checked this file https://win-builder.r-project.org/E4P6xR8nZFu6/ but it is not available anymore. I have the same issue with my .nc file. Can you please reupload it ?

mjwoods commented 4 years ago

Hi @am2222 , are you using the latest R and RNetCDF for Windows? I was expecting these versions to fix the problem. If your file is not too large, please attach it to this issue so that I can have a closer look.

RichardBean commented 4 years ago

Hi again, I still have this issue with ERA5 data where the file size is >4 GB. I suppose I will have to just download the data again into smaller files ...

open.nc(flist[h])
Error in open.nc(flist[h]) : NetCDF: Numeric conversion not representable

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lubridate_1.7.9 RNetCDF_2.4-2

loaded via a namespace (and not attached):
[1] compiler_4.0.3 generics_0.0.2 tools_4.0.3    Rcpp_1.0.5
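If the plan is to re-download only the files that are likely to fail, a quick size check over the file list can pick them out. This is a sketch under an assumption: the 4 GiB default threshold comes from the ">4 GB" observation in this comment, not from any documented limit, and `large_files` is a hypothetical helper.

```r
# Flag files larger than a threshold (default 4 GiB; an assumption taken from
# the ">4 GB" observation above, not a documented limit).
large_files <- function(paths, threshold_bytes = 4 * 1024^3) {
  sizes <- file.size(paths)  # NA for files that do not exist
  data.frame(path = paths,
             size_gb = round(sizes / 1e9, 2),
             over_threshold = !is.na(sizes) & sizes > threshold_bytes,
             stringsAsFactors = FALSE)
}

# Example on a small temporary file with a deliberately tiny threshold.
tmp <- tempfile(fileext = ".nc")
writeBin(raw(2048), tmp)
print(large_files(tmp, threshold_bytes = 1024))
```

With a real list such as `flist`, calling `large_files(flist)` and filtering on `over_threshold` would give the candidates to split or rewrite.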

everydayduffy commented 4 years ago

@RichardBean - see below: one workaround I found... can be done with reticulate if you want to keep it all in R.

Hi @mjwoods. I had an idea today which worked... I read the troublesome .nc file into python with the xarray package, and wrote it out again. The new file can now be read by tidync in R. It's a workaround which allows me to work with the data on Windows, but doesn't solve the RNetCDF problem! Thought I would let you know in case it's of interest.

acrunyon commented 2 years ago

I am having this same problem. I have been using the same script to open the NOAA nClimGrid nc files (https://www.ncei.noaa.gov/thredds/catalog/data-in-development/nclimgrid/catalog.html) for months using tidync().

As of this morning I get the following error (yet the file opens just fine with ncdf4). I have updated R and all packages and have tried using different netcdf files of different datasets, all with same result. Did something change in the package or its dependencies?

Error: Tibble columns must have compatible sizes.
• Size 2: Columns filter_id and filter_params.
• Size 3: Column chunksizes.
ℹ Only values of size one are recycled.

acrunyon commented 2 years ago

The problem appeared to be with ncmeta. Installing the dev version of the package resolves this issue. https://github.com/hypertidy/ncmeta/issues/42

mdsumner commented 2 years ago

I'm prepping a release for ncmeta which should fix this, sorry for the confusion and delay.

@acrunyon can I please ask for an actual url to a netcdf that you use with 'tidync()' - that catalog page has several possible links ending in '.nc' and I can never remember which one is supposed to work, ty

acrunyon commented 2 years ago

https://www.ncei.noaa.gov/thredds/catalog/data-in-development/nclimgrid/catalog.html?dataset=data-in-development/nclimgrid/nclimgrid_prcp.nc

mdsumner commented 2 years ago

that's the catalog page I mentioned, I can't reproduce with that - please reprex when reporting on issues

ncdump -h https://www.ncei.noaa.gov/thredds/catalog/data-in-development/nclimgrid/catalog.html?dataset=data-in-development/nclimgrid/nclimgrid_prcp.nc 

 NetCDF: Malformed or unexpected Constraint
acrunyon commented 2 years ago

Sorry wrong link. Try this one: https://www.ncei.noaa.gov/thredds/fileServer/data-in-development/nclimgrid/nclimgrid_prcp.nc

mdsumner commented 2 years ago

that still doesn't work for me, what does work is the link from the OpenDAP page i.e.

ncdump -h https://www.ncei.noaa.gov/thredds/dodsC/data-in-development/nclimgrid/nclimgrid_prcp.nc

that's the bit in "Data URL:" here https://www.ncei.noaa.gov/thredds/dodsC/data-in-development/nclimgrid/nclimgrid_prcp.nc.html found by following OpenDAP (the top link in the catalog page).
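Comparing the two URLs in this exchange, the OPeNDAP form differs from the direct-download form only in the service segment of the path (`dodsC` instead of `fileServer`). A small sketch of that rewrite follows; `to_opendap_url` is a hypothetical helper, and the assumption that other THREDDS servers follow the same layout should be checked against the server's own catalog page.

```r
# Rewrite a THREDDS "fileServer" (plain HTTP download) URL to the "dodsC"
# (OPeNDAP) form of the same dataset path. Hypothetical helper.
to_opendap_url <- function(url) {
  sub("/thredds/fileServer/", "/thredds/dodsC/", url, fixed = TRUE)
}

file_url <- "https://www.ncei.noaa.gov/thredds/fileServer/data-in-development/nclimgrid/nclimgrid_prcp.nc"
dap_url  <- to_opendap_url(file_url)
print(dap_url)

# The OPeNDAP URL can then be passed to tidync (needs network access and a
# netcdf build with OPeNDAP support):
# tidync::tidync(dap_url)
```

URLs that do not contain the `fileServer` segment are returned unchanged, so the helper is safe to apply to a mixed list.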

mdsumner commented 2 years ago

there's a few subtopics here, if anything remains please open a new issue.

ncmeta is now updated on CRAN at 0.3.5, fixing the problem with tibble compatible sizes.