Open Psirus opened 7 years ago
Ah so you want to keep those "natural" 10^3-step units. We could write support for that :)
Yes, either engineering-notation ("10^3-steps") or just "leave my numbers alone"-mode ;)
Can someone guide how to proceed with this one?
I think a "leave numbers alone" mode is better for this bear at this stage, as it is a fairly generic bear. In the future, for better format control, we should create a JSON-Schema bear.
Anybody assigned to this? I would like to take it up
what should solution look like? The exponent part should be retained in every number or separate option should be kept?
I've briefly looked into this and it seems that the bear is basically just decoding and then reencoding with the JSON module from the standard library. This SO question discussed a couple of options, mainly monkey-patching or using the external simplejson
library. If either option would be okay, or you have a preferred third way, I could probably implement this myself and send you a PR.
I think the built-in library provides everything we need. We have to extend the JSONDecoder
and JSONEncoder
classes. JSONDecoder
has to parse numbers in a more advanced way. Instead of just converting them to an int/float which misses representational data, we could implement a simple yet more complex type storing representational data (maybe as a string). The JSONEncoder
has to be made smarter to detect this new type and convert it back properly into JSON.
Though just seeing, it seems that simplejson
is actually doing exactly this, it might take over the work for us and simplifies this task^^
I explored what I could do with simplejson or JSONDecoder & JSONEncoder. In my opinion work which is needed to make is very similar in both options. However, I had a problem. I tried the following approach. In the current solution it is created the corrected version, which is compared with input json file through diff. In my attempt I tried to change the corrected version by keeping input floats and saving them to correct json file. And then diff these two files. I realized this (I enclosed code below) by processing input json file to python objects with simplejson.loads(), where I used function pretty_floats() where I created object class PrettyFloat (code below), which wraps float as a string. And I created new json file with simplejson.dumps(), where I also had function which saved to file this float as a string. However, in this attempt the problem was that 23.4e-4 became saved as "23.4e-4" so as a string. Could you provide me any hints about this? Whether is it possible to make these corrections at all?
import simplejson as json
class PrettyFloat():
def __init__(self, s):
self.s = s
def __repr__(self):
return self.s
def get_s(self):
return self.s
def pretty_floats(obj):
return PrettyFloat(obj)
def default_floats(obj):
if isinstance(obj, PrettyFloat):
return obj.get_s()
json_content = json.loads(file, object_pairs_hook=OrderedDict, parse_float=pretty_floats)
corrected = json.dumps(json_content, default=default_floats)
I would recommend to submit a PR for such detailed questions instead of discussing implementation on the issue, then we get the whole picture^^
Cool analysis.
One thing I would like to see is whether other attempts at round-tripping handle this. http://augeas.net/ is the best one I know of. (https://github.com/hercules-team/augeas) Try and put this value into a .json file and see if augeas will let you keep it in your preferred style.
Ideally add it as a test case for https://github.com/hercules-team/augeas/blob/master/lenses/tests/test_json.aug
Maybe we could store the floats using Decimal
or to keep their precise number representation ?
Or https://pypi.org/project/ieee754bin/
Let's say I have a file
test.json
for a simulation I'm running,Now, when I run
coala -b JSONFormatBear --files=test.json
, it wants to change the number to2.34e-05
, which is of course mathematically equivalent. But I often have an intuition about numbers in a certain "exponent", changing that is often not helpful. Consider"length": 3.2e-3
as 3.2mm, when coala applies the patch it wants to, that changes to"length": 0.0032
, as in 0.0032m, which is not as intuitive.