xarray-contrib / xarray-simlab

Xarray extension and framework for computer model simulations
http://xarray-simlab.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Handling values provided by xarray.Dataset for scalar variables inside processes #15

Closed benbovy closed 6 years ago

benbovy commented 7 years ago

In xarray objects, all data variables are arrays, even scalars, e.g.:

>>> import xarray as xr
>>> xr.DataArray(1).values
array(1)
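
To make the distinction concrete, here is a minimal NumPy-only sketch of what such a 0-d array looks like. It assumes that `np.array(1)` is a fair stand-in for the object returned by `xr.DataArray(1).values`; the variable name `zero_dim` is only illustrative.

```python
import numpy as np

# Stand-in for xr.DataArray(1).values (assumption: same kind of object)
zero_dim = np.array(1)

# It is a full ndarray with no dimensions, not a Python or NumPy scalar
print(type(zero_dim))
print(zero_dim.ndim, zero_dim.shape)

# np.isscalar returns False for 0-d arrays
print(np.isscalar(zero_dim))
```

Code that checks for scalars with `np.isscalar` or `isinstance(x, (int, float))` will therefore not recognize these values as scalars.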

As shown in the small example below, if a value is provided by an input xarray Dataset for a scalar variable declared in a process, the process receives it as a 0-d array rather than as a Python scalar.

import xsimlab

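class Proc(xsimlab.Process):
    # Empty dims: the variable is declared as a scalar
    var = xsimlab.Variable(())

    def run_step(self, dt):
        # Show the type of the value the process actually receives
        print(type(self.var.value))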

m = xsimlab.Model({'proc': Proc})

in_ds = xsimlab.create_setup(
    model=m, 
    clocks={'time': {'data': [0, 1]}},
    input_vars={'proc': {'var': 1}}
)
>>> _ = in_ds.xsimlab.run(model=m)
<class 'numpy.ndarray'>

In some cases this might be an issue. For example, I noticed a significant slowdown (roughly 8x in the timings below) when passing 0-d numpy arrays instead of scalars to numba (0.34.0) compiled functions:

import numba 
import numpy as np

@numba.njit
def add(a, b):
    return a + b

In [3]: %timeit add(np.array(2), np.array(3))
1.43 µs ± 20.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: %timeit add(2, 3)
174 ns ± 7.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
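
As a stopgap on the user side, the 0-d arrays could be unwrapped before they reach performance-sensitive code. The sketch below is only an illustration: `unwrap_scalar` is a hypothetical helper, not part of xsimlab, xarray, or numba, and it relies only on `ndarray.item()`.

```python
import numpy as np


def unwrap_scalar(value):
    """Return a Python scalar for 0-d arrays; pass anything else through."""
    # Hypothetical helper for illustration, not an xsimlab API
    if isinstance(value, np.ndarray) and value.ndim == 0:
        return value.item()
    return value


# 0-d arrays become plain Python numbers
a = unwrap_scalar(np.array(2))
b = unwrap_scalar(np.array(3))
print(type(a), a + b)

# Arrays with one or more dimensions are returned unchanged
vec = np.array([1, 2])
print(unwrap_scalar(vec) is vec)
```

Calling such a helper inside every process would be repetitive, so it does not replace a fix in the framework itself.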

Two options: