foretold-app / widedomain

An experiment to create mathematical models with clean interfaces
MIT License

Optional rounding of XYShape #47

Open OAGr opened 4 years ago

OAGr commented 4 years ago

Right now distributions in the dist builder result in a lot of unnecessary precision. For instance,

"xs":[12.919944318760631,12.968993194817127,13.018042070873623,13.067090946930119,13.116139822986614,13.16518869904311,13.214237575099606,13.263286451156102,13.312335327212597,13.361384203269093,13.410433079325589,13.459481955382085,13.50853083143858,13.557579707495076,13.606628583551572,13.655677459608068,13.704726335664564,13.75377521172106,13.802824087777555,13.851872963834051,13.900921839890547,13.949970715947043,13.999019592003538,14.048068468060034,14.09711734411653,14.146166220173026,14.195215096229521,14.244263972286017,14.293312848342513,14.342361724399009,14.391410600455504,14.440459476512,14.489508352568496,14.538557228624992,14.587606104681488,14.636654980737983,14.685703856794479,14.734752732850975,14.78380160890747,14.832850484963966,14.88189936102

This takes up a lot of extra space and doesn't add anything really. There is a way on the server to request rounding, but it probably doesn't currently work because it's not supported by the DistPlus ReasonML library.

A crude method would be something like "use n digits of precision for each number". One downside is that the entire distribution may lie between two very close numbers, like 3.48382 to 3.48398, in which case a fixed number of digits could wipe out the whole shape.
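To make the downside concrete, here is a minimal sketch of the crude approach. `roundFixed` is a hypothetical helper, not part of the DistPlus library, and "n digits" is assumed here to mean n decimal places.

```typescript
// Hypothetical helper: round every value to `decimals` decimal places.
function roundFixed(xs: number[], decimals: number): number[] {
  const factor = Math.pow(10, decimals);
  return xs.map((x) => Math.round(x * factor) / factor);
}

// A distribution spread over a wide range survives two decimal places.
const wideXs = [12.919944318760631, 12.968993194817127, 13.018042070873623];
const wideRounded = roundFixed(wideXs, 2); // [12.92, 12.97, 13.02]

// A distribution squeezed between 3.48382 and 3.48398 does not:
// every point collapses onto the same value.
const narrowXs = [3.48382, 3.48387, 3.48392, 3.48398];
const narrowRounded = roundFixed(narrowXs, 2); // [3.48, 3.48, 3.48, 3.48]
```

With two decimal places the first shape keeps distinct xs, while the second degenerates to a single point.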

Ideally there would be some way of approximating how many digits of precision is reasonable to use, or having different floats use different levels of precision.
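One possible heuristic, sketched below, is to derive the number of decimal places from the smallest gap between neighbouring xs, so that tightly packed distributions automatically keep more digits. This is only an illustration: `decimalsForSpacing`, `roundAdaptive`, and the `extraDigits` margin are hypothetical names and choices, not an existing API.

```typescript
// Hypothetical helper: choose decimal places from the smallest positive gap
// between adjacent xs, plus a safety margin of `extraDigits`.
function decimalsForSpacing(xs: number[], extraDigits: number = 1): number | null {
  let minGap = Infinity;
  for (let i = 1; i < xs.length; i++) {
    const gap = Math.abs(xs[i] - xs[i - 1]);
    if (gap > 0 && gap < minGap) minGap = gap;
  }
  if (!isFinite(minGap)) return null; // no two distinct neighbours to measure
  const needed = Math.ceil(-Math.log(minGap) / Math.LN10) + extraDigits;
  return Math.min(15, Math.max(0, needed)); // a double has no more than ~15-17 digits
}

// Round with the precision chosen above; leave the input alone if no gap exists.
function roundAdaptive(xs: number[], extraDigits: number = 1): number[] {
  const decimals = decimalsForSpacing(xs, extraDigits);
  if (decimals === null) return xs.slice();
  const factor = Math.pow(10, decimals);
  return xs.map((x) => Math.round(x * factor) / factor);
}

// Gaps of about 0.05 need 3 decimal places (2 for the gap, 1 for the margin).
const coarseXs = roundAdaptive([12.919944318760631, 12.968993194817127, 13.018042070873623]);

// Gaps of about 0.00005 need 6, so the narrow shape is preserved.
const fineXs = roundAdaptive([3.48382, 3.48387, 3.48392, 3.48398]);
```

The same idea could be applied per float rather than per array, at the cost of a less uniform output.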

Note that these values are sent via the GraphQL API and saved to the database, so shortness would likely help a fair bit. It also takes time to render them; I imagine improvements here could speed things up a fair bit.

skosch commented 4 years ago

Honestly, I wouldn't worry about this. AFAIK JS engines store numbers as 64-bit doubles across the board, and PostgreSQL's real and double precision types are fixed-width too (4 and 8 bytes), so there's really nothing we can do on the storage side. Also, these numbers only look big when written out in decimal. In exponent/mantissa format, a fixed-width float takes the same number of bytes regardless of whether you chop off the decimals or not.

OAGr commented 4 years ago

I think it would make a difference for graphql/api. Sending distributions between client and server currently takes a fair amount of time, and I imagine that shortening the numbers could cut that by up to half. My impression is that when a distribution is sent as a GraphQL param, it is encoded as JSON and sent as basically a string, though I'm not sure.

It's also something we could test experimentally, maybe with a simpler system (like, something that just truncates after 2 decimal places).
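A rough version of that experiment could look like the sketch below, which only compares the length of the JSON text before and after truncation. `truncateDecimals` is a hypothetical helper, the sample values are the first few xs from the example above, and JSON length is assumed to be a reasonable proxy for payload size.

```typescript
// Hypothetical helper: drop everything after `decimals` decimal places.
function truncateDecimals(x: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  const scaled = x * factor;
  // Truncate toward zero so negative values behave the same way.
  return (scaled < 0 ? Math.ceil(scaled) : Math.floor(scaled)) / factor;
}

const sampleXs = [12.919944318760631, 12.968993194817127, 13.018042070873623, 13.067090946930119];

// JSON text as it would be sent today versus after truncation.
const fullJson = JSON.stringify(sampleXs);
const shortJson = JSON.stringify(sampleXs.map((x) => truncateDecimals(x, 2)));

// Fraction of the original size that remains after truncation.
const sizeRatio = shortJson.length / fullJson.length;
console.log(shortJson, sizeRatio);
```

Timing an actual client-server round trip with both payloads would still be needed to see how much of the delay comes from payload size.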

skosch commented 4 years ago

Okay, in that case I think an easier solution is to pack the numbers into a Float32Array, and send that over the wire as UTF-8 (via TextEncoder/TextDecoder browser API). That should be quite fast.
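A minimal sketch of the packing step, under some assumptions: the values tolerate 32-bit precision (roughly 7 significant digits), and the bytes travel either over a binary transport or as base64 inside JSON. Raw float bytes are generally not valid UTF-8, so the sketch does not pass them through TextEncoder/TextDecoder; the base64 size is only estimated here, not produced.

```typescript
const originalXs = [12.919944318760631, 12.968993194817127, 13.018042070873623, 13.067090946930119];

// Pack into 32-bit floats: 4 bytes per value.
const packed = new Float32Array(originalXs);
const bytes = new Uint8Array(packed.buffer);

// Size comparison against the JSON text sent today.
const jsonLength = JSON.stringify(originalXs).length;
const binaryLength = bytes.byteLength;                // 4 values * 4 bytes = 16
const base64Length = 4 * Math.ceil(binaryLength / 3); // base64 expands by about 4/3

// Receiving side: reinterpret the same bytes as 32-bit floats again.
const received = new Float32Array(bytes.buffer);
const unpackedXs: number[] = [];
for (let i = 0; i < received.length; i++) {
  unpackedXs.push(received[i]);
}
```

The unpacked values differ from the originals only beyond the precision a 32-bit float can hold, which is the implicit rounding this approach relies on.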

OAGr commented 4 years ago

Huh, that seems neat, it could be a really good fit.