pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

add backend intro and how-to diagram #9175

Open JessicaS11 opened 1 week ago

JessicaS11 commented 1 week ago
keewis commented 1 week ago

be aware that currently this also fails on main – a dependency stopped pinning to numpy<2.0. If you don't want to spend your time fixing the numpy 2 issues, feel free to pin numpy<2.0 in the docs environment (but otherwise I'd appreciate the help!)
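For anyone who does pin, here is a minimal sketch of how the pin could be sanity-checked from inside the docs environment. The helper name `satisfies_numpy_pin` is hypothetical (it is not part of xarray or numpy), and it only inspects the major version, which is enough for a `numpy<2.0` pin:

```python
def satisfies_numpy_pin(version: str) -> bool:
    """Return True if a version string such as '1.26.4' satisfies numpy<2.0.

    Hypothetical helper for illustration: only the major component is
    inspected, so any 2.x version (including pre-releases) is rejected.
    """
    major = int(version.split(".", 1)[0])
    return major < 2


# Possible usage inside the docs environment (assumes numpy is installed):
#   from importlib.metadata import version
#   assert satisfies_numpy_pin(version("numpy")), "docs env picked up numpy>=2"
```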

max-sixty commented 6 days ago

Are we OK to merge this? The content looks good! The errors are unrelated.

@JessicaS11 does the mermaid diagram render OK? Let's merge away if so...

JessicaS11 commented 6 days ago

> be aware that currently this also fails on main – a dependency stopped pinning to numpy<2.0. If you don't want to spend your time fixing the numpy 2 issues, feel free to pin numpy<2.0 in the docs environment (but otherwise I'd appreciate the help!)

I was hoping this would allow me to see how it rendered (it's got some funny spacing in the mermaid.live version, screenshot below) with an added bonus of addressing an easy numpy 2.0 failure, but alas... happy to pin the environment if that's the current plan for handling numpy>=2.0.

> Are we OK to merge this? The content looks good! The errors are unrelated.

As noted above, the spacing is a bit funny, so I'm going to try and make it more explicit so it will hopefully render more cleanly.

[image: screenshot of the diagram as rendered in mermaid.live]

keewis commented 6 days ago

> if that's the current plan for handling numpy>=2.0.

after writing this I've opened #9177 to deal with most of these issues. I didn't notice that RTD is still failing, though.

max-sixty commented 6 days ago

OK!

> it's got some funny spacing in the mermaid.live version, screenshot below

Possibly not a blocker to merge, but... is this only in the preview rather than the version that would deploy? I do think it's somewhat difficult to read — e.g. empty lines at the top of each block, line-breaks carrying through from the arbitrary line-breaks in the .rst file, the code isn't monospaced, lists are centered.

Do we need the warranty on the list not being inclusive ("exhaustive"?) — the "No" condition seems to cover that it's not exhaustive by suggesting to ask around?

I really liked the diagram in the previous PR, I thought it was a great use of a diagram. Maaaaaybe for a much simpler decision tree of if / else: if / else text is sufficient... (still keen to merge though)

JessicaS11 commented 5 days ago

> Possibly not a blocker to merge, but... is this only in the preview rather than the version that would deploy? I do think it's somewhat difficult to read — e.g. empty lines at the top of each block, line-breaks carrying through from the arbitrary line-breaks in the .rst file, the code isn't monospaced, lists are centered.

I did some work to try and improve the rendering (there's no preview for how it would deploy because the RTD build is failing, so all I can go on is how it renders in the live tool, which is where the screenshot is from). The empty lines at the top appear when you wrap the node text in the "`...`" notation to make the nodes render as markdown, which is required for the italics and bold. I cannot find any record or setting that makes the extra space at the top go away.

> Do we need the warranty on the list not being inclusive ("exhaustive"?) — the "No" condition seems to cover that it's not exhaustive by suggesting to ask around?

Good point - removed to streamline.

> I really liked the diagram in the previous PR, I thought it was a great use of a diagram. Maaaaaybe for a much simpler decision tree of if / else: if / else text is sufficient... (still keen to merge though)

This came out of some conversation with @scottyhq @TomNicholas @negin513 @betolink during planning for the upcoming SciPy tutorial. We also considered putting it into the tutorial book and decided it might be a good "intro" to this section of the docs. Given how text-rich much of the Xarray docs are, I personally am a fan of anything that conveys info in a more visually interesting way (I like the idea of some type of if/else callout that finds a happy medium between more complex graphics and wall of text).

Updated screenshot: [image: updated rendering of the diagram]

max-sixty commented 5 days ago

Nice, that does look better, thanks

How about this as a few small changes? Feel free to take anything (but no obligation), and then let's merge

flowchart LR
    built-in-eng["""Is your data stored in one of these formats?
        - netCDF4 (<code>netcdf4</code>)
        - netCDF3 (<code>scipy</code>)
        - Zarr (<code>zarr</code>)
        - DODS/OPeNDAP (<code>pydap</code>)
        - HDF5 (<code>h5netcdf</code>)
        """]

    built-in["""You're in luck! Xarray bundles a backend for this format.
        Open data using <code>xr.open_dataset()</code>. We recommend
        always setting the engine you want to use."""]

    installed-eng["""One of these formats?
        - GRIB (<code>cfgrib</code>)
        - TileDB (<code>tiledb</code>)
        - GeoTIFF, JPEG-2000, ESRI-hdf (<code>rioxarray</code>, via GDAL)
        - Sentinel-1 SAFE (<code>xarray-sentinel</code>)
        """]

    installed["""Install the package indicated in parentheses
        to your Python environment. Restart the kernel
        and use <code>xr.open_dataset(files, engine='rioxarray')</code>"""]

    other["""Ask around to see if someone in your data community
        has created an Xarray backend for your data type.
        If not, you may need to create your own or consider
        exporting your data to a more common format."""]

    built-in-eng -->|Yes| built-in
    built-in-eng -->|No| installed-eng

    installed-eng -->|Yes| installed
    installed-eng -->|No| other

    click built-in-eng "https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"
    click installed-eng "https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray"
    click other "https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html"

    classDef quesNodefmt fill:#9DEEF4,stroke:#206C89,text-align:left
    class built-in-eng,installed-eng quesNodefmt

    classDef ansNodefmt fill:#FFAA05,stroke:#E37F17,text-align:left,white-space:nowrap
    class built-in,installed,other ansNodefmt

    linkStyle default font-size:20pt,color:#206C89

(edit: I notice GH renders this if we supply mermaid as a code tag! So also pasting as plain text, and a screenshot, since GH doesn't render the monospace)

flowchart LR
    built-in-eng["""Is your data stored in one of these formats?
        - netCDF4 (<code>netcdf4</code>)
        - netCDF3 (<code>scipy</code>)
        - Zarr (<code>zarr</code>)
        - DODS/OPeNDAP (<code>pydap</code>)
        - HDF5 (<code>h5netcdf</code>)
        """]

    built-in["""You're in luck! Xarray bundles a backend for this format.
        Open data using <code>xr.open_dataset()</code>. We recommend
        always setting the engine you want to use."""]

    installed-eng["""One of these formats?
        - GRIB (<code>cfgrib</code>)
        - TileDB (<code>tiledb</code>)
        - GeoTIFF, JPEG-2000, ESRI-hdf (<code>rioxarray</code>, via GDAL)
        - Sentinel-1 SAFE (<code>xarray-sentinel</code>)
        """]

    installed["""Install the package indicated in parentheses
        to your Python environment. Restart the kernel
        and use <code>xr.open_dataset(files, engine='rioxarray')</code>"""]

    other["""Ask around to see if someone in your data community
        has created an Xarray backend for your data type.
        If not, you may need to create your own or consider
        exporting your data to a more common format."""]

    built-in-eng -->|Yes| built-in
    built-in-eng -->|No| installed-eng

    installed-eng -->|Yes| installed
    installed-eng -->|No| other

    click built-in-eng "https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"
    click installed-eng "https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray"
    click other "https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html"

    classDef quesNodefmt fill:#9DEEF4,stroke:#206C89,text-align:left
    class built-in-eng,installed-eng quesNodefmt

    classDef ansNodefmt fill:#FFAA05,stroke:#E37F17,text-align:left,white-space:nowrap
    class built-in,installed,other ansNodefmt

    linkStyle default font-size:20pt,color:#206C89

[image: screenshot of the rendered flowchart]

JessicaS11 commented 1 day ago

> How about this as a few small changes? Feel free to take anything (but no obligation), and then let's merge

Thanks! Using monospace was a great idea.

It's exciting to know that GitHub [sort-of] renders mermaid!
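
For comparison with the if / else idea raised earlier in the thread, the same decision tree can be sketched in plain Python. This is only an illustration of the flowchart's logic, not code from the PR: `backend_advice`, `Advice`, `BUILT_IN`, and `INSTALLABLE` are hypothetical names, and the format/engine pairs are copied verbatim from the diagram. The names in the second group are package names as listed in the diagram; the string to pass as `engine=` may differ from the package name, so check that package's documentation.

```python
from typing import NamedTuple, Optional

# Formats and engine names copied from the diagram's first question node.
BUILT_IN = {
    "netCDF4": "netcdf4",
    "netCDF3": "scipy",
    "Zarr": "zarr",
    "DODS/OPeNDAP": "pydap",
    "HDF5": "h5netcdf",
}

# Formats and package names copied from the diagram's second question node.
INSTALLABLE = {
    "GRIB": "cfgrib",
    "TileDB": "tiledb",
    "GeoTIFF": "rioxarray",
    "JPEG-2000": "rioxarray",
    "ESRI-hdf": "rioxarray",
    "Sentinel-1 SAFE": "xarray-sentinel",
}


class Advice(NamedTuple):
    branch: str          # "built-in", "installed", or "other"
    name: Optional[str]  # engine or package name from the diagram, if any
    message: str


def backend_advice(data_format: str) -> Advice:
    """Walk the flowchart: built-in engine, installable package, or neither."""
    if data_format in BUILT_IN:
        engine = BUILT_IN[data_format]
        return Advice(
            "built-in",
            engine,
            f"Open with xr.open_dataset(path, engine={engine!r}).",
        )
    if data_format in INSTALLABLE:
        package = INSTALLABLE[data_format]
        return Advice(
            "installed",
            package,
            f"Install {package}, restart the kernel, then use xr.open_dataset().",
        )
    return Advice(
        "other",
        None,
        "Ask your data community about an existing backend, write your own, "
        "or export the data to a more common format.",
    )


print(backend_advice("Zarr").message)
print(backend_advice("GRIB").message)
print(backend_advice("CSV").message)
```

The lookup is case-sensitive and only knows the formats listed in the diagram, so anything else falls through to the "other" branch, matching the diagram's final "No" arrow.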