TorchDSP / torchsig

TorchSig is an open-source signal processing machine learning toolkit based on the PyTorch data handling pipeline.
MIT License

Accelerate saving data to disk using different compression #180

Closed: gvanhoy closed this issue 1 year ago

gvanhoy commented 1 year ago

Swap pickle for np.savez when saving raw data.
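
As a rough illustration of the proposed swap, here is a minimal sketch that stores the same array with pickle and with np.savez and reads the .npz archive back without unpickling. The array `iq` and the in-memory buffer are hypothetical stand-ins, not TorchSig's actual record layout or storage backend.

```python
import io
import pickle

import numpy as np

# Hypothetical stand-in for one raw IQ example (not TorchSig's real record layout).
rng = np.random.default_rng(0)
iq = (rng.standard_normal(4096) + 1j * rng.standard_normal(4096)).astype(np.complex64)

# Approach described as the status quo in this issue: pickle the array.
pickled = pickle.dumps(iq, protocol=pickle.HIGHEST_PROTOCOL)

# Proposed approach: np.savez writes an uncompressed .npz archive.
buf = io.BytesIO()
np.savez(buf, iq=iq)
npz_bytes = buf.getvalue()

# Read the array back; allow_pickle=False confirms no pickle is involved.
with np.load(io.BytesIO(npz_bytes), allow_pickle=False) as archive:
    restored = archive["iq"]

roundtrip_ok = bool(np.array_equal(restored, iq))
print(len(pickled), len(npz_bytes), roundtrip_ok)
```

np.savez does not compress; np.savez_compressed is the variant that does, at the cost of extra CPU time.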

gvanhoy commented 1 year ago

Although some results suggest that faster ways to compress the data should exist, local tests that swapped pickle for NumPy's tobytes() showed no significant effect on the time it takes to generate and store Sig53 or WidebandSig53.
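
A comparison of this kind can be sketched as a micro-benchmark. The batch shape, dtype, and the `time_it` helper below are assumptions made for illustration; this is not the test that was actually run, and it times only the serialization call, not signal generation or disk writes.

```python
import pickle
import time

import numpy as np


def time_it(fn, repeats=20):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical batch of examples; the shape and dtype are illustrative only.
rng = np.random.default_rng(0)
batch = rng.standard_normal((64, 4096)).astype(np.float32)

t_pickle = time_it(lambda: pickle.dumps(batch, protocol=pickle.HIGHEST_PROTOCOL))
t_tobytes = time_it(lambda: batch.tobytes())

print(f"pickle.dumps: {t_pickle * 1e6:.1f} us")
print(f"tobytes:      {t_tobytes * 1e6:.1f} us")
```

Because the sketch isolates serialization, it says nothing about end-to-end dataset creation time, which also includes generating the signals and writing to storage.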

NumPy's tobytes() may still be the better choice if we want the datasets to be more portable. Portability also requires handling the labels, though, and those can be difficult to convert to a binary form that other environments can read.
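
One possible way to keep labels portable is to store them as a JSON header next to the raw sample bytes. The record layout, the `pack_record`/`unpack_record` helpers, and the label fields below are all hypothetical; TorchSig's real label objects may not map onto JSON this directly.

```python
import json
import struct

import numpy as np


def pack_record(iq, label):
    """Pack one example as: 4-byte header length, JSON header, raw samples."""
    data = np.ascontiguousarray(iq, dtype="<c8")  # force little-endian complex64
    header = json.dumps({
        "dtype": "<c8",
        "shape": list(data.shape),
        "label": label,
    }).encode("utf-8")
    return struct.pack("<I", len(header)) + header + data.tobytes()


def unpack_record(blob):
    """Inverse of pack_record: return (samples, label)."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    samples = np.frombuffer(blob, dtype=header["dtype"], offset=4 + header_len)
    return samples.reshape(header["shape"]), header["label"]


# Hypothetical example and label, for illustration only.
rng = np.random.default_rng(0)
iq = (rng.standard_normal(1024) + 1j * rng.standard_normal(1024)).astype(np.complex64)
label = {"class_name": "qpsk", "snr_db": 10.0}

blob = pack_record(iq, label)
samples, restored_label = unpack_record(blob)
print(samples.shape, restored_label)
```

Fixing the byte order and recording the dtype and shape in the header means a reader does not need pickle or Python-specific type information to interpret the bytes.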

For portability, we can consider providing scripts for unpacking the data in other languages/environments.
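
An unpacking script of that kind could be quite small. As a sketch, the following decodes samples using only the Python standard library, standing in for an environment without NumPy. It assumes the samples were stored as interleaved little-endian float32 I/Q pairs, which is the layout complex64 tobytes() produces on a little-endian machine; the `unpack_iq` helper is hypothetical.

```python
import struct


def unpack_iq(raw):
    """Decode interleaved little-endian float32 I/Q pairs into complex numbers."""
    if len(raw) % 8 != 0:
        raise ValueError("expected a whole number of 8-byte I/Q pairs")
    return [complex(i, q) for i, q in struct.iter_unpack("<ff", raw)]


# Two hand-built samples standing in for bytes read from a dataset file.
raw = struct.pack("<4f", 1.0, -1.0, 0.5, 0.25)
samples = unpack_iq(raw)
print(samples)
```

The same fixed-width read translates directly to other languages, since it relies only on IEEE 754 float32 values in a known byte order.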