openworm / open-worm-analysis-toolbox

A testing pipeline that allows us to run a behavioural phenotyping of our virtual worm running the same test statistics the Schafer Lab used on their worm data.

JSON vs HDF5 #129

Closed MichaelCurrie closed 9 years ago

MichaelCurrie commented 9 years ago

Right now, as demonstrated by JSON demo.py, we can serialize NormalizedWorm and some other classes as JSON flat files. JSON is a format familiar to web developers, since it is often exchanged over HTTP between the client and server sides of web apps. However, it is a text format rather than a binary one, so it might be very slow given the 50+ megabyte size of the arrays of floats we are looking to serialize.
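To make the size concern concrete, here is a rough sketch that compares the JSON text encoding of a list of floats against a raw 8-bytes-per-value binary encoding. The raw encoding only stands in for a binary format; this is not HDF5 itself, and the `encoded_sizes` helper and the random sample data are made up for illustration.

```python
import json
import random
from array import array


def encoded_sizes(values):
    """Return (json_bytes, binary_bytes) for a list of floats."""
    # JSON writes every float as decimal text plus separators.
    json_bytes = len(json.dumps(values).encode("utf-8"))
    # array("d") packs each value as one C double.
    binary_bytes = len(array("d", values).tobytes())
    return json_bytes, binary_bytes


random.seed(0)
samples = [random.random() for _ in range(100_000)]  # stand-in for worm data

json_size, binary_size = encoded_sizes(samples)
print("JSON bytes:  ", json_size)
print("binary bytes:", binary_size)

# Python's json module round-trips floats exactly, so JSON costs size, not precision.
restored = json.loads(json.dumps(samples))
```

For full-precision floats like these, the JSON text should come out noticeably larger than the packed doubles, and it also has to be parsed from text on every load.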

HDF5 is the obvious alternative, and it's the one used natively by Matlab 12+ and some of the original Schafer Lab code.

This decision doesn't have to be made right away: the serialization code is all encapsulated in a single class, JSONSerializer, so we can easily refactor at a later stage. Still, @JimHokanson and I were wondering if anyone had any strong thoughts on the matter.
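As a sketch of why keeping serialization behind one class makes a later switch cheap: if callers only ever use a `dumps`/`loads` pair, the storage format can be swapped without touching them. The class and method names below are hypothetical and do not reflect the actual JSONSerializer interface; the binary backend uses raw packed doubles as a placeholder for a real binary format.

```python
import json
from array import array


class JSONBackend:
    """Hypothetical text backend: stores floats as a JSON list."""

    def dumps(self, values):
        return json.dumps(list(values)).encode("utf-8")

    def loads(self, data):
        return json.loads(data.decode("utf-8"))


class BinaryBackend:
    """Hypothetical binary backend: stores floats as packed C doubles."""

    def dumps(self, values):
        return array("d", values).tobytes()

    def loads(self, data):
        out = array("d")
        out.frombytes(data)
        return out.tolist()


def round_trip(backend, values):
    """Calling code depends only on dumps/loads, not on the format."""
    return backend.loads(backend.dumps(values))


speeds = [0.0, 1.5, -2.25]  # made-up feature values
for backend in (JSONBackend(), BinaryBackend()):
    print(type(backend).__name__, round_trip(backend, speeds))
```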

MichaelCurrie commented 9 years ago

@joebowen @clinzy since you both have some web development experience, maybe you have thoughts on this? Thanks!

joebowen commented 9 years ago

Honestly, I think it's probably six of one, half a dozen of the other. My first thought was that HDF5 wouldn't be supported outside of Matlab, but I was wrong. It's also possible to use a binary JSON-like format, UBJSON, which is fully compatible with JSON, so I'm not really sure it matters. I'm personally more familiar with JSON, but looking at both, I don't have a strong opinion one way or the other.

MichaelCurrie commented 9 years ago

Thanks Joe. I'll leave JSONSerializer as is, then. Anyone who feels we should change it can feel free to re-open this issue. Thanks.