pajachiet / pymongo-schema

A schema analyser for MongoDB, written in Python.
GNU Lesser General Public License v3.0

Compare module - detailed diff and summary diff views #17

Closed: JulieRossi closed this issue 7 years ago

JulieRossi commented 7 years ago

While implementing the option to display the full differences in the compare module, I need to make a choice regarding the output format.

In the detailed diff, we'll have something like (potentially with more information like description, sensitive, etc.):

| Hierarchy | Previous Schema | New Schema |
| --- | --- | --- |
| borough | {"type": "string", "prop_in_object": 1.0, "count": 25359, "types_count": {"string": 25359}} | None |
| _borough | None | {"type": "string", "prop_in_object": 1.0, "count": 25359, "types_count": {"string": 25359}} |
| cuisine | {"type": "string"} | {"type": "ARRAY"} |
| grades.grade | {"type": "string", "prop_in_object": 3.6856, "count": 93463, "types_count": {"string": 93463}} | None |
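
To make the detailed view concrete, here is a minimal sketch of how such rows could be computed. It is not the actual compare module code: the function name `detailed_diff` is hypothetical, and it assumes both schemas have already been flattened into dicts keyed by dotted path. Unlike the cuisine row above, it reports the whole description of a changed field rather than only the keys that differ.

```python
def detailed_diff(previous, new):
    """Return (hierarchy, previous_schema, new_schema) rows for differing paths.

    Hypothetical helper: `previous` and `new` are assumed to be flattened
    schemas, i.e. dicts mapping a dotted path to a field description.
    """
    rows = []
    # Visit every path present in either schema, in a stable order.
    for path in sorted(set(previous) | set(new)):
        prev_desc = previous.get(path)  # None when the field is new
        new_desc = new.get(path)        # None when the field was removed
        if prev_desc != new_desc:
            rows.append((path, prev_desc, new_desc))
    return rows


# Illustrative data only, loosely based on the rows shown above.
previous = {
    "borough": {"type": "string", "count": 25359},
    "cuisine": {"type": "string"},
    "name": {"type": "string"},
}
new = {
    "_borough": {"type": "string", "count": 25359},
    "cuisine": {"type": "ARRAY"},
    "name": {"type": "string"},
}

rows = detailed_diff(previous, new)
for row in rows:
    print(row)
```

Unchanged fields (here `name`) produce no row, and a missing or new field shows up with `None` on the side where it does not exist.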

Currently, the summary view looks like this:

| Hierarchy | Previous Schema | New Schema |
| --- | --- | --- |
| | borough | None |
| | None | _borough |
| cuisine | {"type": "string"} | {"type": "ARRAY"} |
| grades | grade | None |

The two views do not use the same hierarchy when a field is missing or new.

One solution would be to use the detailed view's hierarchy in the summary view as well, repeating the field name in the schema column:

| Hierarchy | Previous Schema | New Schema |
| --- | --- | --- |
| borough | borough | None |
| _borough | None | _borough |
| cuisine | {"type": "string"} | {"type": "ARRAY"} |
| grades.grade | grade | None |
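
As a rough sketch of this option (not the actual implementation; `summarize_row` is a hypothetical name, and I assume each detailed row is a `(hierarchy, previous, new)` tuple with `None` on the side where the field is absent), a summary row could be derived from a detailed row like this:

```python
def summarize_row(hierarchy, prev_desc, new_desc):
    """Build a summary row that keeps the detailed view's dotted hierarchy.

    Hypothetical helper: for a missing or new field, the field name is
    repeated in the schema column where the field exists.
    """
    field_name = hierarchy.split(".")[-1]  # last component of the dotted path
    if new_desc is None:
        # Field removed: show its name under "Previous Schema".
        return (hierarchy, field_name, None)
    if prev_desc is None:
        # Field added: show its name under "New Schema".
        return (hierarchy, None, field_name)
    # Field present on both sides: keep both descriptions as they are.
    return (hierarchy, prev_desc, new_desc)


removed = summarize_row("grades.grade", {"type": "string"}, None)
added = summarize_row("_borough", None, {"type": "string"})
changed = summarize_row("cuisine", {"type": "string"}, {"type": "ARRAY"})
```

With this approach, both views share the same first column, so a row in the summary can be matched to its detailed counterpart by hierarchy alone.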

What do you think about this solution? Does it seem clear to you?

Another possibility would be to give more details about the missing / new field in the schema columns.

But this might be confusing.

@pajachiet, @aureliengervasi, which of these three solutions (or maybe another one) would you choose?

aureliengervasi commented 7 years ago

Hi Julie,

I just want to understand: is having two different hierarchies between the detailed and summary views a technical issue for you (i.e. difficult to implement), or is it a layout issue (i.e. difficult to understand)?

In the first case, I would indeed prefer the first solution you suggest, limiting the amount of information given in the summary view.

JulieRossi commented 7 years ago

It was more of a layout issue (it also needs a little more code, but I don't see that as a problem if the result is better).

aureliengervasi commented 7 years ago

Let's go, for now, with repeating the field name in both the hierarchy column and the schema column (your first solution). We can make the code evolve later if need be.