Open tommyhe6 opened 9 months ago
I think we should allow a keyword argument that orders the fields alphabetically, since for our structs order does matter.
Other tools can also change the key order, as the OP saw with Postgres.
The fundamental issue is that two JSON documents being equal does not mean that their DataFrame representations are equal. This happens whenever one format carries more information than the other: Polars DataFrames record the order of their columns, while JSON objects have no defined key order. See this on the main page of json.org:
An object is an unordered set of name/value pairs.
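To make the mismatch concrete, here is a minimal sketch using only Python's standard `json` module rather than polars itself. The two payloads are made up for illustration. The parsed objects compare equal, but the key order a reader would turn into column order differs.

```python
import json

# Two serializations of the same JSON object, differing only in key order.
# (Illustrative payloads, not taken from the original report.)
a = json.loads('{"x": 1, "y": {"b": 2, "a": 3}}')
b = json.loads('{"y": {"a": 3, "b": 2}, "x": 1}')

# As JSON objects they are equal: dict equality ignores key order.
same_object = a == b

# But a parser that preserves document order sees different field orders,
# which is what ends up as column (and struct field) order in a DataFrame.
order_a = list(a)
order_b = list(b)
nested_order_a = list(a["y"])
nested_order_b = list(b["y"])
```

So any reader that derives its schema from document order will produce two different schemas from inputs that JSON itself considers identical.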
One way to solve this problem is to always ensure the columns from a JSON file are represented by polars in alphabetical order, as @ritchie46 suggested. This may confuse people with simple JSON files when they find their schema has been reordered (if we default it to True, as I think perhaps we should), but a quick sentence in the API docs would explain that.
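A rough sketch of what alphabetical ordering would mean, again in plain Python and not as a statement of how polars does or would implement it. `sort_keys_recursive` is a hypothetical helper, not a polars API; it just shows that sorting keys at every nesting level gives key-order-independent output.

```python
import json


def sort_keys_recursive(value):
    """Return a copy of a parsed JSON value with object keys sorted alphabetically.

    Hypothetical helper for illustration only; not part of polars.
    """
    if isinstance(value, dict):
        # Rebuild the dict in sorted key order, recursing into nested values.
        return {key: sort_keys_recursive(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Lists keep their element order; only objects inside them are sorted.
        return [sort_keys_recursive(item) for item in value]
    return value


# Same object written with two different key orders (illustrative payloads).
first = json.loads('{"x": 1, "y": {"b": 2, "a": 3}}')
second = json.loads('{"y": {"a": 3, "b": 2}, "x": 1}')

norm_first = sort_keys_recursive(first)
norm_second = sort_keys_recursive(second)

# After normalization both have the same field order at every level.
top_level_match = list(norm_first) == list(norm_second)
nested_match = list(norm_first["y"]) == list(norm_second["y"])
```

The trade-off is the one mentioned above: a file whose keys were written in a deliberate order would come back reordered, so the behavior would need to be documented.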
Checks
Reproducible example
Log output
Issue description
read_json behavior should be independent of the ordering of the attributes under the most common definition of JSON. Notably, this caused problems when serializing to JSON, writing the JSON to some database (Postgres in my case), then writing it back to a polars df.
Expected behavior
Both df would be the same, specifically the first.
Installed versions