coala / coala-bears

Bears for coala
https://coala.io/
GNU Affero General Public License v3.0
295 stars 580 forks source link

JSONFormatBear complains about number representation #2062

Open Psirus opened 7 years ago

Psirus commented 7 years ago

Let's say I have a file test.json for a simulation I'm running,

{
    "expansion-coefficient": 23.4e-6
}

Now, when I run coala -b JSONFormatBear --files=test.json, it wants to change the number to 2.34e-05, which is of course mathematically equivalent. But I often have an intuition about numbers in a certain "exponent", changing that is often not helpful. Consider "length": 3.2e-3 as 3.2mm, when coala applies the patch it wants to, that changes to "length": 0.0032, as in 0.0032m, which is not as intuitive.

Makman2 commented 7 years ago

Ah so you want to keep those "natural" 10^3-step units. We could write support for that :)

Psirus commented 7 years ago

Yes, either engineering-notation ("10^3-steps") or just "leave my numbers alone"-mode ;)

aashraybhandar1 commented 7 years ago

Can someone guide how to proceed with this one?

jayvdb commented 7 years ago

I think a "leave numbers alone" mode is better for this bear at this stage, as it is a fairly generic bear. In the future, for better format control, we should create a JSON-Schema bear.

danishprakash commented 7 years ago

Anybody assigned to this? I would like to take it up

akshkr commented 7 years ago

what should solution look like? The exponent part should be retained in every number or separate option should be kept?

Psirus commented 6 years ago

I've briefly looked into this and it seems that the bear is basically just decoding and then reencoding with the JSON module from the standard library. This SO question discussed a couple of options, mainly monkey-patching or using the external simplejson library. If either option would be okay, or you have a preferred third way, I could probably implement this myself and send you a PR.

Makman2 commented 6 years ago

I think the built-in library provides everything we need. We have to extend the JSONDecoder and JSONEncoder classes. JSONDecoder has to parse numbers in a more advanced way. Instead of just converting them to an int/float which misses representational data, we could implement a simple yet more complex type storing representational data (maybe as a string). The JSONEncoder has to be made smarter to detect this new type and convert it back properly into JSON.

Makman2 commented 6 years ago

Though just seeing, it seems that simplejson is actually doing exactly this, it might take over the work for us and simplifies this task^^

damian1996 commented 6 years ago

I explored what I could do with simplejson or JSONDecoder & JSONEncoder. In my opinion work which is needed to make is very similar in both options. However, I had a problem. I tried the following approach. In the current solution it is created the corrected version, which is compared with input json file through diff. In my attempt I tried to change the corrected version by keeping input floats and saving them to correct json file. And then diff these two files. I realized this (I enclosed code below) by processing input json file to python objects with simplejson.loads(), where I used function pretty_floats() where I created object class PrettyFloat (code below), which wraps float as a string. And I created new json file with simplejson.dumps(), where I also had function which saved to file this float as a string. However, in this attempt the problem was that 23.4e-4 became saved as "23.4e-4" so as a string. Could you provide me any hints about this? Whether is it possible to make these corrections at all?

import simplejson as json

class PrettyFloat():
    def __init__(self, s):
        self.s = s
    def __repr__(self):
        return self.s
    def get_s(self):
        return self.s

def pretty_floats(obj):
    return PrettyFloat(obj)

def default_floats(obj):
    if isinstance(obj, PrettyFloat):
        return obj.get_s()

json_content = json.loads(file, object_pairs_hook=OrderedDict, parse_float=pretty_floats)
corrected = json.dumps(json_content, default=default_floats)
Makman2 commented 6 years ago

I would recommend to submit a PR for such detailed questions instead of discussing implementation on the issue, then we get the whole picture^^

jayvdb commented 6 years ago

Cool analysis.

One thing I would like to see is whether other attempts at round-tripping handle this. http://augeas.net/ is the best one I know of. (https://github.com/hercules-team/augeas) Try and put this value into a .json file and see if augeas will let you keep it in your preferred style.

Ideally add it as a test case for https://github.com/hercules-team/augeas/blob/master/lenses/tests/test_json.aug

jayvdb commented 6 years ago

Maybe we could store the floats using Decimal or to keep their precise number representation ? Or https://pypi.org/project/ieee754bin/