data-apis / array-api-compat

Compatibility layer for common array libraries to support the Array API
https://data-apis.org/array-api-compat/
MIT License
69 stars 22 forks source link

Add ndonnx #150

Closed adityagoel4512 closed 2 months ago

adityagoel4512 commented 2 months ago

ndonnx is an ONNX-backed array library. It implements the Array API standard and support is actually directly in the main namespace (we are big fans of the standard). ndonnx enables users to export Array API compliant code to ONNX automatically, a process which traditionally required quite a lot of code duplication.

Since we don't need an array-api-compat wrapper for ndonnx, adding it here would be somewhat similar to the JAX approach - we also run array-api-tests over at ndonnx itself.

Would it be acceptable to add ndonnx here? Happy to contribute it myself.

asmeurer commented 2 months ago

If a package is full array API compatible, it doesn't need to add anything here. It will just work out of the box. As long as you define __array_namespace__ correctly, the array_namespace() helper will use it when passed an ndonnx array.

The only code in array-api-compat for libraries like JAX or NumPy 2.0 that are fully array API compatible is helper functions like is_numpy_array, which are needed for downstream code that needs to write NumPy-specific paths for things like performance or to workaround some limitations of the array API. See https://data-apis.org/array-api-compat/helper-functions.html.

If you anticipate packages needing an is_ndonnx_array helper, we can add that.

Otherwise, there's nothing really to add here.

We could perhaps add a page to the documentation listing other array API compatible packages like ndonnx that aren't explicitly part of array-api-compat.

adityagoel4512 commented 2 months ago

If you anticipate packages needing an is_ndonnx_array helper, we can add that.

I think this is main utility that would be beneficial. In particular, for ONNX export you cannot use any implicitly eager constructs part of the Array API standard (like __float__) since there is no available data to collect and instead we raise an exception. In such a place you would branch and provide an ONNX specific code path. Having is_ndonnx_array would be very valuable for this.

asmeurer commented 2 months ago

You can make a PR for this if you like.

Although it sounds like this is a special case of the behavior that was specified at https://github.com/data-apis/array-api/pull/652. I wonder if we shouldn't try to make something more general. I'm also wondering if the proper way for libraries to handle this should be to catch ValueError and branch based on that, given that's now what the standard specifically says should happen.

Aside from that specific change, there haven't been any other specific lazy/eager things added to the standard, but there have been other discussions about potentially doing so (see https://github.com/data-apis/array-api/issues/748 and https://github.com/data-apis/array-api/discussions/777). I wouldn't want to jump the gun and try to add any helper functions here for things that might later be standardized.

adityagoel4512 commented 2 months ago

I'm also wondering if the proper way for libraries to handle this should be to catch ValueError and branch based on that

I think something like this is reasonable. The problem with branching based on ValueError in particular is that you can't pin down precisely where an exception originated from (it could just be an unhandled exception a few stack frames down).

I wouldn't want to jump the gun and try to add any helper functions here for things that might later be standardized.

Agreed! There are other valid reasons to branch such as where an ONNX operator exists that would more performantly/concisely express some application logic, but which is not directly in the Array API standard.

rgommers commented 2 months ago

Very nice, thanks for sharing @adityagoel4512!

I think adding is_ndonnx_array makes sense. And it would be good to comment on https://github.com/data-apis/array-api/issues/748 to ensure that ndonnx is captured there - there is significant overlap between ndonnx, jax and dask there, even if the non-eager-eval requirement of ndonnx is even stronger than that for jax and dask.