eddelbuettel / rcppcnpy

Rcpp bindings for NumPy files
GNU General Public License v2.0

three dimensional array request #18

Closed kwythers closed 6 years ago

kwythers commented 6 years ago

What would it take to expand this package to read three-dimensional .npy arrays into R? I am working with global climate data from NetCDF files that I processed into a stack of 100 yearly global means with dimensions 720 x 360 x 100. I would love to read these 3d .npy files directly into R without having to convert them back into NetCDF first, i.e. something that would take 100 global maps of 720 x 360 as a [720, 360, 100] array. Is this possible?

eddelbuettel commented 6 years ago

"Probably." Would you want to take a stab and test? I would probably start by trying a ten-line program to just read the file and print it, using only CNPy, and then use that result to return a 3d vector. Rcpp does not have a native 'array' type; you would stick everything into a vector and set a dim attribute of, e.g., c(720, 360, 100).
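To illustrate the "vector plus dim attribute" idea from the R side, here is a minimal base-R sketch. The toy sizes (4 x 3 x 2) and the variable name `v` are placeholders chosen for illustration, not anything from RcppCNPy; the real data would be 720 * 360 * 100 values long.

```r
# Toy stand-in for a flat vector of values returned from C++.
v <- seq_len(4 * 3 * 2)

# Setting the dim attribute turns the plain vector into a 3d array.
dim(v) <- c(4, 3, 2)

is.array(v)   # TRUE
v[2, 1, 1]    # 2: the first index varies fastest
v[1, 2, 1]    # 5
v[1, 1, 2]    # 13
```

No data is copied or rearranged by this step; only the interpretation of the vector changes, which is why the element order of the flat vector matters.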

Then again, did you see the recent (2nd) vignette? Chances are we may not need RcppCNPy.

In either case, there may be a devil lurking here with transposes so we would have to test carefully first.
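The transpose concern can be sketched in base R. R fills arrays with the first index varying fastest (column-major), whereas NumPy's default layout is row-major (C order). The sketch below assumes a flat vector that arrives in C order; whether that is the case for a given file depends on how it was written and loaded, which this thread does not settle. The names `flat_c_order`, `naive` and `fixed` are illustrative only.

```r
# Hypothetical 2 x 3 matrix, flattened row by row (C order):
#   1 2 3
#   4 5 6
flat_c_order <- c(1, 2, 3, 4, 5, 6)

# Naive reshape: R fills column by column, so values land in the wrong cells.
naive <- array(flat_c_order, dim = c(2, 3))
naive[1, 2]   # 3, although the intended value is 2

# Fix: reshape with the dimensions reversed, then reverse the axes with aperm().
fixed <- aperm(array(flat_c_order, dim = rev(c(2, 3))))
fixed[1, 2]   # 2
fixed[2, 1]   # 4

# The same recipe works in 3d: element [i, j, k] of a C-order 2 x 3 x 4 array
# (0-based) sits at flat position i * 12 + j * 4 + k.
flat3 <- 0:23
fixed3 <- aperm(array(flat3, dim = rev(c(2, 3, 4))))
fixed3[2, 3, 4]   # 23, i.e. 1 * 12 + 2 * 4 + 3
```

A quick check of a few known cells against the original data, as done here, is a cheap way to confirm the orientation before trusting the full array.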

kwythers commented 6 years ago

Thanks for the suggestion, Dirk. That worked great. I used np$load to get a long vector, then used array() with your suggested 'dim' argument to shape it. Like this:

library(reticulate)  # provides import()
np <- import('numpy')

# data reading
(tbot_vector <- np$load('/path_to_file/temperature.npy'))
(prectmms_vector <- np$load('/path_to_file//precipitation.npy'))
(lat_vector <- np$load('/path_to_file/gswp3_climate/latitude.npy'))
(lon_vector <- np$load('/path_to_file/gswp3_climate/longitude.npy'))

# create a year vector for final array dimension
year_vector <- c(1901:2010)

# get lons and lats into correct form (shift lon from 0:360 to -180:180)
lon_vector <- lon_vector - 180

# make [360, 720, 110] array and rename rows and columns with lats and lons
tbot_array <- array(tbot_vector, dim = c(360, 720, 110), dimnames = list(lat_vector, lon_vector, year_vector))

# test to return temp for twin cities, coordinates need to be in quotes
# for example (44.75, -93.25, ) to return all 110 slices of the 2d matrix

# twin cities estimate
tbot_array['44.75', '-93.25', ]
plot(ts(tbot_array['44.75', '-93.25', ]))

eddelbuettel commented 6 years ago

Just copying your nice example once more and setting it 'marked up' as code:

library(reticulate)  # provides import()
np <- import('numpy')

## data reading
(tbot_vector <- np$load('/path_to_file/temperature.npy'))
(prectmms_vector <- np$load('/path_to_file//precipitation.npy'))
(lat_vector <- np$load('/path_to_file/gswp3_climate/latitude.npy'))
(lon_vector <- np$load('/path_to_file/gswp3_climate/longitude.npy'))

## create a year vector for final array dimension
year_vector <- c(1901:2010)

## get lons and lats into correct form (shift lon from 0:360 to -180:180)
lon_vector <- lon_vector - 180

## make [360, 720, 110] array and rename rows and columns with lats and lons
tbot_array <- array(tbot_vector,
                    dim = c(360, 720, 110),
                    dimnames = list(lat_vector,
                                    lon_vector,
                                    year_vector))

## test to return temp for twin cities, coordinates need to be in quotes
## for example (44.75, -93.25, ) to return all 110 slices of the 2d matrix

## twin cities estimate
tbot_array['44.75', '-93.25', ]
plot(ts(tbot_array['44.75', '-93.25', ]))

With that, I think it is ok to close the feature request.
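On the "coordinates need to be in quotes" remark in the example: a character index is matched against the dimnames, while a numeric index is treated as a position. The base-R sketch below uses a made-up 2 x 3 x 2 array with hypothetical coordinate labels to show the difference. It also shows one possible way to look up the nearest grid cell with `which.min()`, for a point that is not exactly on the grid; that is an addition for illustration, not part of the original example.

```r
# Toy array with hypothetical latitude, longitude and year labels.
lat  <- c(44.25, 44.75)
lon  <- c(-93.75, -93.25, -92.75)
year <- 1901:1902
a <- array(seq_len(12), dim = c(2, 3, 2), dimnames = list(lat, lon, year))

# Character indices are matched against the dimnames.
by_name <- a["44.75", "-93.25", ]

# Numeric indices are positions, so the same cell is row 2, column 2.
by_position <- a[2, 2, ]

# Nearest grid cell for a coordinate that is not exactly on the grid.
i <- which.min(abs(lat - 44.98))
j <- which.min(abs(lon - (-93.27)))
nearest <- a[i, j, ]
```

Name-based lookup requires the string to match the label exactly, so the position-based variant may be more robust when coordinates carry floating-point noise.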