linnarsson-lab / loom-viewer

Tool for sharing, browsing and visualizing single-cell data stored in the Loom file format
BSD 2-Clause "Simplified" License

Better number rounding in JSON array conversion #144

Closed JobLeonard closed 6 years ago

JobLeonard commented 6 years ago

Currently, I use np.around(arr, 8) to remove unneeded precision from the expanded JSON files. This saves disk space, as well as bandwidth when serving files.

(The reasoning for 8 digits of precision being enough for the viewer is that 4k * 4k = 16,000,000, which is an 8-digit number. So if we start with a zoomed-out view on a 4k screen, this should be enough to zoom into a single pixel and still be precise down to the individual pixel.)

However, while np.around() works exactly the way it should, I just realised that this is not what I actually want:

>>> import numpy as np
>>> t = np.array([1.2312415124124315, 4.1230123151231241325, 
            0.0000023412413512312412, 1231351231.12315942342130])
>>> t
array([  1.23124151e+00,   4.12301232e+00,
            2.34124135e-06,   1.23135123e+09])
>>> t.dtype
dtype('float64')
>>> t8 = np.around(t, 8)
>>> t8
array([  1.23124151e+00,   4.12301232e+00,
            2.34000000e-06,   1.23135123e+09])
>>> t8.tolist()
[1.23124151, 4.12301232, 2.34e-06, 1231351231.1231594]
>>> import json
>>> json.dumps(t.tolist())
'[1.2312415124124314, 4.123012315123124, 2.341241351231241e-06, 1231351231.1231594]'
>>> json.dumps(t8.tolist())
'[1.23124151, 4.12301232, 2.34e-06, 1231351231.1231594]'

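The problem here is highlighted by the last two values: 2.34e-06 would ideally be 2.34124135e-06, and 1231351231.1231594 would ideally be reduced to 1.23135123e09 or 1231351230 (which would be fewer characters in this case). In other words, np.around() rounds to a fixed number of decimal places, whereas what I actually want is a fixed number of significant digits.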

It's not a big issue at the moment - the dynamic range and order of magnitude of our data does not seem to produce large artefacts from this rounding error (mainly because we can't zoom yet and because few people use the viewer on 4k resolution).

Maybe there is a built-in numpy method for achieving this; otherwise it has to be done manually.
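
For reference, a minimal sketch of what "doing it manually" could look like. It rounds through Python's scientific-notation string formatting, so each value keeps a fixed number of digits after the leading digit regardless of magnitude. The helper name round_significant and its decimals parameter are made up for illustration and are not part of numpy or the viewer code; the sketch also flattens its input, so it only suits 1-D arrays as written.

```python
import json
import numpy as np

def round_significant(values, decimals=8):
    """Hypothetical helper: keep `decimals` digits after the leading digit.

    Formatting to scientific notation and parsing back gives a float whose
    shortest repr has at most decimals + 1 significant digits.
    """
    fmt = "{:." + str(decimals) + "e}"
    flat = np.asarray(values, dtype=np.float64).ravel()
    return [float(fmt.format(v)) for v in flat]

t = np.array([1.2312415124124315, 4.1230123151231241325,
              0.0000023412413512312412, 1231351231.12315942342130])

rounded = round_significant(t)
print(json.dumps(rounded))
```

With the array from the example above, the last two values should come out as 2.34124135e-06 and 1231351230.0, which is the behaviour asked for. The list comprehension is slower than a vectorised numpy call, which may matter for large arrays.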

JobLeonard commented 6 years ago

... or I just use astype(np.float32), because that already has six to nine significant decimal digits of precision and we're converting to float32 on the client side anyway.
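
One caveat worth checking with the float32 route: tolist() widens each float32 back to a Python float (a double), so json.dumps() prints the full double-precision digits of the float32 value and the JSON does not get shorter by itself. A rough sketch of one way around that, assuming str() on a numpy float32 scalar gives its shortest round-trip representation (the helper name float32_to_short_floats is made up for illustration):

```python
import json
import numpy as np

def float32_to_short_floats(values):
    """Hypothetical helper: cast to float32, keep only the digits float32 needs."""
    t32 = np.asarray(values).astype(np.float32).ravel()
    # str() of a float32 scalar uses float32 precision, unlike tolist(),
    # which converts to double first and exposes the extra digits.
    return [float(str(v)) for v in t32]

t = np.array([1.2312415124124315, 4.1230123151231241325,
              0.0000023412413512312412, 1231351231.12315942342130])

naive = json.dumps(t.astype(np.float32).tolist())
short = json.dumps(float32_to_short_floats(t))
print(naive)
print(short)
```

Reading the shorter JSON back and casting to float32 on the client should give the same float32 values as the naive version for this sample array, just with fewer characters on disk and on the wire.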