manzt / numcodecs.js

Buffer compression and transformation codecs
MIT License
29 stars 6 forks source link

Destructure compressorConfig in constructor? #5

Closed manzt closed 4 years ago

manzt commented 4 years ago

In the python lib all codecs have a static from_config and instance get_config methods. I've added the fromConfig method for the current codecs, but I'm thinking of removing this in favor of de-structuring the compressor config in the constructor in stead. From what I can tell, the decision for the two methods seems to be related to the registry set up in numcodecs, and since this is something we won't be providing directly in this library, I'd find it more intuitive to have this API:

Change

const config = { id: 'gzip', level: 1};

// current
const codec = GZip.fromConfig(config);
const codec = new GZip(1); 

// proposed
const codec = new GZip(config); // just destructures { level }

From what I've seen, the args in the config obj are usually just **kwgs'd in the python constructors anyways. Maybe I'm missing something. At minimum, I should probably make the codec constructor use an interface.

gzuidhof commented 4 years ago

The reason I had to shy away from constructors (in Zarr itself) and instead having a static "factory" functions is because they can be async. There is no way to do anything async in a constructor (unless you add some sort of "ready" field the user has to check, and it's inevitable there will be a codec that needs to do something async to setup. Also if there is an operation that can fail, what do you return from the constructor? In JS you can return anything from it, but I think typescript may be a bit more strict (for good reason).

I made peace with that it looks less elegant ^^, in TypeScript you can make the constructor private to let users know they should not be calling it manually.

This also allows you to do some additional setup (start some webworker maybe?), this allows the user to kick that off, and if you await this have some guarantees that it's all ready on future function calls (that can then even be synchronous as they don't have to await a ready field).

Hope this makes sense, not sure it's clear reading what I wrote

manzt commented 4 years ago

That's a good point. I think my solution to allowing for using a constructor is to make the encode and decode calls async. Keeping the fromConfig static method is useful as that's what zarr-python uses under the hood, but I think it is preferable to allow codecs to be instantiated directly.

import { Blosc } from 'numcodecs';
const codec = new Blosc({ clevel: 4 });
const decoded = await codec.decode(arr); // won't resolve until codec is initialized.