aschampion / rust-n5

Rust implementation of the N5 "Not HDF5" tensor file system format
Apache License 2.0

Python bindings #8

Closed aschampion closed 5 years ago

aschampion commented 6 years ago

To provide Python access to N5 that has a lower installation barrier and is more N5-specific than the excellent z5py, it would be nice to have Python bindings via rust-n5 that support:

This would be implemented in a separate, dependent python binding crate (possibly in this repo with a cargo workspace), but is tracked here.

Two main Python binding libraries exist for Rust (comparison): rust-cpython and PyO3.

On first examination I tend to prefer PyO3 (see this rather nice example). The Rust numpy binding library currently uses rust-cpython, but plans to migrate to PyO3 (and has an in-progress PR) once it supports stable Rust. I will revisit this issue once that has happened.

cc @clbarnes

clbarnes commented 6 years ago

Do you envision a numpy-like interface to datasets, or would you be more inclined to go a separate way, like the reference implementation of zarr? Catching enough slicing and broadcasting edge cases to be usefully numpy-compatible is awkward (although in some cases, already implemented in z5).

aschampion commented 6 years ago

@clbarnes and I discussed this offline. The reasonable minimum for now is read/write access to dimension-complete, contiguous, non-strided slices, with broadcasting supported only for scalar values. The numpy access will work via the existing ndarray access, so it will not be zero-copy but should be reasonably fast.
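As a rough illustration of that minimum, here is a sketch in plain Python of how an indexing key might be checked before it is handed to the Rust side. The names normalize_key and check_value_shape are hypothetical and are not part of rust-n5 or any existing binding. The sketch assumes the binding would want an (offset, size) pair per request.

```python
from typing import Sequence, Tuple


def normalize_key(key, shape: Sequence[int]) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Turn a tuple of slices into (offset, size), rejecting unsupported keys.

    Hypothetical helper: accepts only dimension-complete, non-strided slices.
    """
    if not isinstance(key, tuple):
        key = (key,)
    # Dimension-complete: exactly one slice per dataset dimension.
    if len(key) != len(shape):
        raise IndexError(f"expected {len(shape)} slices, got {len(key)}")
    offset, size = [], []
    for k, dim in zip(key, shape):
        if not isinstance(k, slice):
            raise TypeError("only slice objects are supported")
        # Non-strided: reject any explicit step other than 1.
        if k.step not in (None, 1):
            raise ValueError("strided slices are not supported")
        # slice.indices clamps to the dimension and resolves negative bounds.
        start, stop, _ = k.indices(dim)
        offset.append(start)
        size.append(max(stop - start, 0))
    return tuple(offset), tuple(size)


def check_value_shape(value_shape: Sequence[int], size: Sequence[int]) -> None:
    """Allow an exact shape match or a scalar (shape ()); no other broadcasting."""
    if tuple(value_shape) not in ((), tuple(size)):
        raise ValueError(f"cannot write shape {tuple(value_shape)} into block of size {tuple(size)}")
```

A dataset wrapper's __getitem__ and __setitem__ could call normalize_key first, then check_value_shape on writes, so that anything outside the agreed minimum fails early with a clear error rather than being silently misinterpreted.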

aschampion commented 6 years ago

rust-numpy has migrated to PyO3; their release is blocked on the next release (0.3) of PyO3, but that's nearing. So that blocker has been removed.

clbarnes commented 6 years ago

@pattonw expressed some interest in this.

For the use of anyone who might attempt it, some of the sticking points I came across working on z5py:

In the unlikely event that we wanted anything more than an MVP, https://github.com/Quansight-Labs/uarray seems to be an attempt to formalise the numpy ndarray interface for non-numpy projects, including explicit handling of ufuncs and the like.

aschampion commented 5 years ago

This is fulfilled by https://github.com/pattonw/rust-pyn5