store short numpy arrays as strings.

mpgriff commented 1 month ago

Hi @Crimson-Crow,

This is a great little package, thank you for sharing!

I have a suggestion for an enhancement, which is providing an option to store small numpy arrays as directly as strings.

In my fork I've made this functionality with a threshold based on the number of elements, e.g. less than 100 elements is stored as string, more than 100 is stored as base64 encoding: https://github.com/mpgriff/json-numpy/blob/main/json_numpy.py

I think it would be cool if this binary_threshold flag could be added to the patch() call.

Perhaps there are consequences/implications I'm not aware of. Anyways, thanks again for sharing.

Crimson-Crow commented 1 month ago

Hi,

Glad it has proved to be useful to you!

Your approach raises an interesting point. Do your benchmarks show that repr serialization is faster for smaller arrays? I'm not opposed to the implementation of string serialization if it proves to be beneficial (and better than just using .tolist() and serializing the result).

However, running the tests with python -m unittest discover on your fork fails some tests for loss of precision or failure to deal with some types (appears to be numpy scalars). Also, the security implications of using eval() on untrusted input is not something I'd like to get into.

I'd be happy to consider and potentially integrate the feature if you're able to address these concerns.

Thanks for your input and feel free to reach out for any further concerns/enhancement ideas!

mpgriff commented 1 month ago

Thanks for your consideration!

Perhaps the .tolist() function is more or less what I had in mind. I'll take a run against the package unit-tests, should I find a potential implementation.

Crimson-Crow / json-numpy

store short numpy arrays as strings. #4