mverleg / pyjson_tricks

Extra features for Python's JSON: comments, order, numpy, pandas, datetimes, and many more! Simple but customizable.

dump returns a byte value, is this normal? #66

Closed alessio-greco closed 4 years ago

alessio-greco commented 4 years ago

Using json.dump(shap_values, "shap_values-json.gz", compression=9) returns

b'\x1f\x8b\x08\x00\x02\xd4\xf3]\x02\xff\xec\x9d\xcb\x8e-\xbd\x95\x9c_E\xe8\xb1\xb4\x91\xbc\x93~\x95\x86\xd1h\xc0\x9ez\xd03\xc3/\xef\xf8\x82\x99$\xb3<\xf38\xd5jJ:\x7f\x9d]\xb9\x99\xe4\xbaF\xc4\xfa\xf7\xff\xf3o\xff\xf1\x1f\xff\xeb\x7f\xfc\xe7\x7f\xfd\xd7\x7f\xfe\xef\xff\xf8\x8f\x7f\xfbo\xff\xf8\xf7\x7f\xbf~\xd7?\xff\xf1-\xdf\xf2-\xdf\xf2-\xdf\xf2-\xdf\xf2-\xdf\xf2-\xdf\xf2-\xd

It still saves the data, but is this the normal behaviour?

mverleg commented 4 years ago

Indeed, dump also returns the value, in addition to saving it to the file. That's a bit redundant in many cases, but I think it's easy to just ignore the return value.

The value was available anyway, so it seemed it might be useful to return it in some cases. Probably not so much in compressed cases, but in the interest of consistency...

mverleg commented 4 years ago

I didn't really consider interactive use, mostly scripts... I guess in interactive use it would show up as output, which might be annoying?

alessio-greco commented 4 years ago

A "return_string" flag that makes dump return the string might be useful when one wants both "dump" and "dumps" at the same time, but generally I guess one would use dumps for the string.

At times, the string value may be pretty heavy (in general) and long (for interactive use).

mverleg commented 4 years ago

In the current implementation, the string is being constructed anyway (dump calls dumps) so it doesn't cost much extra to return it.
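
To illustrate the point, here is a minimal sketch of the "dump calls dumps" pattern using only the standard library. It is not the library's actual code: dumps_sketch and dump_sketch are hypothetical stand-ins, and return_string is the flag proposed earlier in this thread, not an existing option.

```python
import gzip
import json
import os
import tempfile


def dumps_sketch(obj, compression=None):
    """Serialize obj to JSON text, or to gzip bytes if compression is set."""
    txt = json.dumps(obj)
    if compression:
        # Assumption: compression is used as the gzip level (1-9).
        return gzip.compress(txt.encode("utf-8"), compresslevel=compression)
    return txt


def dump_sketch(obj, path, compression=None, return_string=True):
    """Write obj to path and optionally return what was written.

    return_string is hypothetical; it mirrors the flag suggested above.
    """
    # The full value is built first, so returning it costs nothing extra.
    data = dumps_sketch(obj, compression=compression)
    mode = "wb" if isinstance(data, bytes) else "w"
    with open(path, mode) as fh:
        fh.write(data)
    return data if return_string else None


path = os.path.join(tempfile.mkdtemp(), "values.json.gz")

# With compression, the returned value is the gzip byte string.
result = dump_sketch({"a": [1, 2, 3]}, path, compression=9)

# With the hypothetical flag turned off, nothing is returned.
silent = dump_sketch({"a": [1, 2, 3]}, path, compression=9, return_string=False)
```

In an interactive session, assigning the result to a throwaway name (for example _ = dump_sketch(...)) also keeps the byte string from being echoed, without needing any new flag.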

(A streaming version would be interesting, but this library is a wrapper around the standard json library, so that seems a long shot...)
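
For plain JSON types, a streaming write is possible with the standard library alone. The sketch below is one way it might look, not something this library provides: stream_dump is a hypothetical helper, and it does not cover the extra encoders (numpy, pandas, datetimes) that this library adds.

```python
import gzip
import json
import os
import tempfile


def stream_dump(obj, path, compresslevel=9):
    """Write obj as gzip-compressed JSON without building the full string."""
    encoder = json.JSONEncoder()
    with gzip.open(path, "wt", compresslevel=compresslevel, encoding="utf-8") as fh:
        # iterencode yields the document in pieces, which are compressed
        # and written as they are produced.
        for chunk in encoder.iterencode(obj):
            fh.write(chunk)
    # Nothing is returned, so nothing is echoed in an interactive session.


path = os.path.join(tempfile.mkdtemp(), "values.json.gz")
stream_dump({"values": list(range(5))}, path)

# Read the file back to check the round trip.
with gzip.open(path, "rt", encoding="utf-8") as fh:
    restored = json.load(fh)
```

The trade-off is that a streaming dump has no complete string to return, so it could not keep the "dump also returns the value" behaviour discussed above.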