xarray-contrib / xbatcher

Batch generation from xarray datasets
https://xbatcher.readthedocs.io
Apache License 2.0

Additional examples with variable numbers of input/output dimensions #157

Open maxrjones opened 1 year ago

maxrjones commented 1 year ago

What is your issue?

We should add a gallery of examples demonstrating batch generator usage with different numbers of input and output dimensions, for example a 3-dimensional input Dataset that yields 2-dimensional batches, and similar combinations.

cmdupuis3 commented 1 year ago

I'm putting together some combined xbatcher/ML examples, although I think what you want is probably simpler. Mine are going to be end-to-end tutorials.

maxrjones commented 1 year ago

> I'm putting together some combined xbatcher/ML examples, although I think what you want is probably simpler. Mine are going to be end-to-end tutorials.

That's fantastic! These end-to-end tutorials will be a really important contribution. I think they could be a great fit for the next generation of EarthML, which is being led by @jbednar's team and will be hosted on Project Pythia. Could you take a look at https://projectpythia.org/landsat-ml-cookbook and let me know your thoughts about formatting the tutorials into a similar Python cookbook? From my perspective, the main benefits of that structure rather than including the larger tutorials in the xbatcher docs are that you'll have more flexibility in dependencies (because there's no concern about bloating the environment used to build the xbatcher docs) and could use the JupyterBook configuration tools built by the Project Pythia team for more computationally expensive tutorials.

@rabernat and I spoke this AM and he seemed interested in this idea of linking the tutorials with the EarthML revamp, but still looping him into this conversation for any follow-up thoughts.

Are your end-to-end tutorials going to be centered around any particular topic areas (e.g., ML for oceanography)?

Thanks again for working on this, really excited about the end-to-end tutorials!

cmdupuis3 commented 1 year ago

Sounds reasonable, their formatting is pretty clean too. I'll have to look into it more, but for now I'm just trying to get the notebooks working lol

They're oceanography-based, but I think the focus will be more on using different methods. We have two that have been partially done for a while. One is a more-or-less vanilla CNN implementation, and another is a graph CNN. My goal with these is to demonstrate how xbatcher can be used with diverse ML workflows, so maybe we'll see PCA, clustering, or other stuff in the longer term.

jbednar commented 1 year ago

Sounds really cool; looking forward to seeing them!

rabernat commented 1 year ago

👍 to leveraging EarthML / Pythia for Chris's tutorials. That's definitely a better home for this sort of long-form content.

Chris, if you want to just get a PR started with your drafts, I think folks would be glad to provide some feedback along the way.

maxrjones commented 1 year ago

@cmdupuis3, would you be willing to share the general topic for your ML-focused cookbook on the Pythia channel of the Pangeo discourse? It would be helpful to have the Project Pythia team members aware of the idea early in the process. The Pangeo discourse is their preferred way to discuss potential cookbooks.

cmdupuis3 commented 1 year ago

@maxrjones Yeah, sounds great, I'll start a thread over there.

jsetty commented 10 months ago

@cmdupuis3 Could you please share this tutorial? I am looking for some examples of xbatcher with PyTorch.