hdmf-dev / hdmf

The Hierarchical Data Modeling Framework
http://hdmf.readthedocs.io
Other
46 stars 24 forks source link

[Feature]: Remove zarr from core dependencies #1138

Open rly opened 1 week ago

rly commented 1 week ago

What would you like to see added to HDMF?

1136 added Zarr as a core dependency in order to import it for use in src/hdmf/data_utils.py and testing. This increases the chance of dependency conflicts, increases the initial import time, and is not consistent with the idea of keeping hdmf-zarr as a separate backend and package. I would rather have Zarr be an optional package, where if it is installed, we take different action. This adds some complexity to the code though.

If we decide to keep Zarr as a core dependency, we need to limit the version to zarr < 3 because of the upcoming breaking changes.

What solution would you like?

Refactor dependency on zarr

Do you have any interest in helping implement the feature?

Yes.

rly commented 1 week ago

@mavaylon1 what do you think?

mavaylon1 commented 1 week ago

@mavaylon1 what do you think?

@mavaylon1 what do you think?

I added zarr as a core dependency since we should always run the test for appending to zarr. We have hdf5 as a core dependency, so it makes sense to support zarr the same way.

I have no issues with limiting it to a version to allow us to adapt to changes easier.

As for the separation, my thought is that the test needs to exist where the code is, and this test is for a core functionality (unlike TermSet and LinkML).

rly commented 1 week ago

I agree that the test should live where the code is. Sorry I wasn't clear - I meant to suggest something like https://github.com/hdmf-dev/hdmf/blob/dev/src/hdmf/utils.py#L18-L23

Here, we could have a flag saying whether Zarr is installed and then check whether Zarr is installed before doing the isinstance(data, zarr.ZarrArray): check

mavaylon1 commented 1 week ago

Yeah and just have a unit test skip like with the TermSet stuff if zarr isn't installed. I suppose I see the zarr differently from LinkML in that it's a main feature we are supporting. I still think treating it equally to hdf5 makes sense, but the optional test isn't bad either.

rly commented 1 week ago

If we want to support Zarr as a main feature, then I think we should be ready to merge hdmf-zarr into hdmf. There would be no point to having it as a separate package, just extra hassle. I do not think we are quite at that point yet, since we are still working out some of the kinks, but hopefully soon. What do you think?

mavaylon1 commented 1 week ago

Yeah it makes sense to merge when ready. That could be the big finish on both having HDMF-Zarr ready and when we feel happy with a cutdown HDMF. I'm fine with optional test. I'll change it.

rly commented 1 week ago

Great, thank you!