team-ocean / veros

The versatile ocean simulator, in pure Python, powered by JAX.
https://veros.readthedocs.io
MIT License
333 stars 55 forks

xarray integration #25

Closed dionhaefner closed 3 years ago

benbovy commented 5 years ago

Hi @dionhaefner,

@jeanbraun just pointed me to Veros and more specifically told me about its (still planned?) xarray integration. I think he had seen your presentation at EGU last week (I missed EGU this year unfortunately). Veros looks really nice!

FWIW, I'd like to mention here xarray-simlab.

It seems that the goals of the two projects are quite orthogonal. Xarray-simlab's goal is to provide a lightweight, domain-agnostic framework for writing computational models in a modular fashion. It also provides handy features for model introspection, I/O (based on xarray), seamless integration with the broader scientific Python ecosystem... and more to come.

I think it should be possible to wrap parts of a Veros model setup as xarray-simlab model components (i.e., processes). This way you would get a highly xarray-compliant interface for free. That said, I haven't had a deep look at Veros yet, so I don't know whether the two projects would integrate that easily. As the main author of xarray-simlab, I'm also probably a bit biased!

Anyway, I'd be happy to discuss this further with you if you'd like. I'm very interested in how xarray-simlab could be leveraged by tools like Veros, and in what could be improved on the xarray-simlab side for better integration with those tools.

dionhaefner commented 5 years ago

👋 @benbovy, thanks for stopping by!

I just had a brief look at xarray-simlab, and the interface definitely looks like something we could use. However, I'm a bit concerned whether it is flexible enough to support what we are doing. How would you see MPI parallelization happening? We need to be able to operate on chunks and communicate overlaps.
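To make "operate on chunks and communicate overlaps" concrete, here is a minimal single-process sketch of a halo (ghost cell) exchange. It is not Veros code: the function names, the one-cell halo, and the diffusion stencil are all assumptions chosen for illustration, and a real MPI run would replace the array copies with send/receive pairs between ranks.

```python
import numpy as np

HALO = 1  # assumed overlap width: one ghost cell on each side of a chunk


def split_with_halo(field, n_chunks):
    """Split a 1D field into chunks, each padded with zero-filled ghost cells."""
    return [np.pad(part, HALO) for part in np.array_split(field, n_chunks)]


def exchange_halos(chunks):
    """Fill ghost cells from the neighbouring chunk's edge values.

    With MPI, each of these copies would be a message between two ranks.
    """
    for left, right in zip(chunks[:-1], chunks[1:]):
        right[:HALO] = left[-2 * HALO:-HALO]  # left's last interior cell
        left[-HALO:] = right[HALO:2 * HALO]   # right's first interior cell


def diffusion_step(chunk, kappa=0.1):
    """One explicit diffusion step on the interior; ghost cells must be valid."""
    new = chunk.copy()
    new[1:-1] += kappa * (chunk[2:] - 2.0 * chunk[1:-1] + chunk[:-2])
    return new


field = np.arange(10, dtype=float) ** 2

# Chunked update: split, exchange overlaps, step each chunk, strip ghost cells.
chunks = split_with_halo(field, 3)
exchange_halos(chunks)
result = np.concatenate([diffusion_step(c)[HALO:-HALO] for c in chunks])

# Reference: the same step applied to the undivided field.
reference = diffusion_step(np.pad(field, HALO))[HALO:-HALO]
```

The chunked result matches the undivided one only because the overlaps are exchanged before each step, which is the part any wrapping framework has to leave room for.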

Even if xarray-simlab supports everything we need, it would still be a lot of work to port Veros over (more than I alone can put in), so I wouldn't expect this to happen anytime soon. But I really like the abstraction you have developed, so it might actually pay off to do it when we have more capacity.

benbovy commented 5 years ago

Thanks for your reply @dionhaefner!

> How would you see MPI parallelization happening? We need to be able to operate on chunks and communicate overlaps.

The modelling framework built in xarray-simlab can be viewed as a lightweight object-oriented layer on top of a key-value simulation data store. There is no restriction on the types of those values, except for values used as model inputs or outputs: when the framework is used with the xarray interface, these must be something that can be wrapped by an xarray.Variable (numpy arrays are supported of course, and I guess that Bohrium arrays could be easily coerced into numpy arrays?). So in summary, I don't think that xarray-simlab would interfere with the MPI-based parallelization capabilities in Veros.
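As a rough illustration of "an object-oriented layer on top of a key-value store", the sketch below uses plain Python only. It is not xarray-simlab's actual API: `Process`, `Heating`, `run`, and the `reads`/`writes` attributes are hypothetical names made up for this example.

```python
class Process:
    """Hypothetical base class: a process declares the store keys it uses."""

    reads = ()
    writes = ()

    def run_step(self, store, dt):
        raise NotImplementedError


class Heating(Process):
    """Toy process that warms every value stored under 'temperature'."""

    reads = ("temperature",)
    writes = ("temperature",)

    def __init__(self, rate):
        self.rate = rate

    def run_step(self, store, dt):
        # The store does not care about the value type; a list is used here,
        # but it could equally be a numpy array or a chunk of a distributed one.
        store["temperature"] = [t + self.rate * dt for t in store["temperature"]]


def run(processes, store, n_steps, dt):
    """Check declared inputs, then run every process once per time step."""
    for process in processes:
        missing = [key for key in process.reads if key not in store]
        if missing:
            raise KeyError(f"{type(process).__name__} is missing inputs: {missing}")
    for _ in range(n_steps):
        for process in processes:
            process.run_step(store, dt)
    return store


store = {"temperature": [1.0, 2.0]}
run([Heating(rate=0.5)], store, n_steps=4, dt=0.5)
```

Because the processes only touch the store through keys, how the stored values are laid out in memory or split across ranks stays the model's own business.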

The purpose of the object-oriented layer mentioned above is twofold:

> it would still be a lot of work to port Veros over (more than I alone can put in)

Yes indeed! If I find the time, I'd like to try hacking something together, such as writing a Veros model setup using xarray-simlab, and see how it works. I'll keep you updated.

dionhaefner commented 5 years ago

Most of the current abstraction in Veros is either inherited from PyOM2 (Fortran) or some hackish Python to provide some sugar (veros_method, Variable, the Veros class). I have been wanting to build a real "smart" interface for a while, so I'm grateful for the input. I think my wishlist would be

It seems like xarray-simlab ticks all of those boxes, but I'm a bit worried about the user experience. Veros has hundreds of settings and variables, so I think a class-based approach where you only override those you care about has its merits.
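The class-based pattern described here could look roughly like the sketch below. It only illustrates the idea and is not the Veros interface: the `Settings` fields, their default values, and the `Setup`/`MySetup` classes are invented for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class Settings:
    """Hypothetical settings container; every field has a default value."""

    nx: int = 4
    ny: int = 4
    nz: int = 2
    dt_tracer: float = 3600.0
    enable_cyclic_x: bool = False


class Setup:
    """Base setup: subclasses override only the hook they care about."""

    def set_parameter(self, settings):
        pass  # keep all defaults

    def build(self):
        settings = Settings()
        self.set_parameter(settings)
        return settings


class MySetup(Setup):
    """User setup that changes two settings and inherits the rest."""

    def set_parameter(self, settings):
        settings.nx = 30
        settings.enable_cyclic_x = True


def overridden(settings):
    """Return the names of the settings that differ from their defaults."""
    defaults = Settings()
    return [
        f.name
        for f in fields(settings)
        if getattr(settings, f.name) != getattr(defaults, f.name)
    ]


settings = MySetup().build()
changed = overridden(settings)
```

With hundreds of settings, the user-facing class stays as short as the list of values that actually deviate from the defaults.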

> I guess that Bohrium arrays could be easily coerced into numpy arrays

The last time I checked, I could put Bohrium's bh.ndarray into an xarray.DataArray just fine, so this should actually work out of the box.