Sysmagine / SemanticDiff

Community support for SemanticDiff, the programming language aware diff for Visual Studio Code and GitHub.
https://semanticdiff.com

Moved Code for Json Files should not show as a difference #62

Open MiguelElGallo opened 1 month ago

MiguelElGallo commented 1 month ago

Is your feature request related to a problem? Please describe. I review changes to Infrastructure as Code (JSON/ARM templates). These are JSON files (2 to 8 MB) whose sections are always generated in a different order. Ask the Azure guys why that happens. I need to compare those JSON files and validate the changes, but because the sections are in a different order, they are shown as differences, which for these files they are not.

Describe the solution you'd like A flag for JSON files that tells SemanticDiff to show only real differences and ignore the order.

Describe alternatives you've considered Writing my own code, but the visualization part would be a challenge.

Additional context None

mmueller2012 commented 1 month ago

Thanks for your feature request.

Can you give me an example of an old and new JSON document that contains such a move? A dummy example or a trimmed down version where you replace sensitive data with placeholder values would be fine.

Maybe we can add some logic specific to ARM templates, but I need to understand the use case a bit better for this.

MiguelElGallo commented 1 month ago

@mmueller2012 Thanks for the quick reply.

This is a simple example: these two files are identical from a JSON/ARM perspective.

But SemanticDiff shows the content as moved without any changes. That is accurate, but in large files it becomes impossible to spot the "real" differences.

Let me know if this is good enough as an example.

ver1.json ver2.json

mmueller2012 commented 1 month ago

Thanks for the example.

SemanticDiff already ignores reordering of keys, but that doesn't help in this case because the elements within an array change their order. We obviously cannot add a generic rule to ignore such changes for all JSON files, but we should be able to detect these templates based on the provided $schema value and enable more specific rules.

Based on the schema mentioned in the JSON file, this seems to mostly affect the top-level keys functions and resources, but the schema references 333 subschemas that we may also need to consider. Does the reordering only affect these top-level keys or is it present throughout the document?
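To illustrate the idea of enabling array-order rules only when the $schema value identifies an ARM template, here is a minimal Python sketch. It is not how SemanticDiff works internally. The names `ARM_SCHEMA_MARKER`, `is_arm_template`, `canonicalize` and `equivalent` are hypothetical, and the sketch assumes that every array in a matching document is unordered, which is a deliberate oversimplification.

```python
import json

# Assumption: ARM templates can be recognised by this substring of "$schema".
ARM_SCHEMA_MARKER = "schema.management.azure.com/schemas/"


def is_arm_template(doc):
    """Return True if the document's $schema value looks like an ARM schema URL."""
    return isinstance(doc, dict) and ARM_SCHEMA_MARKER in str(doc.get("$schema", ""))


def canonicalize(value, sort_arrays):
    """Rebuild the value recursively; optionally sort arrays by serialized form."""
    if isinstance(value, dict):
        return {k: canonicalize(v, sort_arrays) for k, v in value.items()}
    if isinstance(value, list):
        items = [canonicalize(v, sort_arrays) for v in value]
        if sort_arrays:
            # Sorting by a stable JSON string makes element order irrelevant.
            items.sort(key=lambda v: json.dumps(v, sort_keys=True))
        return items
    return value


def equivalent(old, new):
    """Compare two documents, ignoring array order only for ARM templates."""
    sort_arrays = is_arm_template(old) and is_arm_template(new)
    # Python dict equality already ignores key order.
    return canonicalize(old, sort_arrays) == canonicalize(new, sort_arrays)
```

With this sketch, two ARM templates whose `resources` entries are merely reordered compare as equal, while a plain JSON file with a reordered array still counts as changed.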

MiguelElGallo commented 1 month ago

It is present throughout the document. For example, the entries inside resources::properties could be in a different order.

mmueller2012 commented 1 month ago

Manually writing the rules for all 333 schemas is not feasible, but it should be possible to write a script that extracts the location of all arrays. This still requires manual annotation of which array is unordered, but it sounds much more doable. We will take a look at it.
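A script of that kind could look roughly like the following Python sketch. It walks a JSON Schema document and lists the location of every subschema declared as an array, so that each one can then be annotated by hand. The function name `find_array_locations` is hypothetical, the sketch does not follow `$ref` links into other subschemas, and it may report false positives if an example or default value happens to contain a `type` key.

```python
def find_array_locations(schema, path=""):
    """Yield a slash-separated path for every subschema declared as an array."""
    if isinstance(schema, dict):
        declared = schema.get("type")
        # "type" may be a single string or a list of allowed types.
        if declared == "array" or (isinstance(declared, list) and "array" in declared):
            yield path or "/"
        for key, child in schema.items():
            yield from find_array_locations(child, f"{path}/{key}")
    elif isinstance(schema, list):
        # Lists occur in keywords such as oneOf, anyOf and allOf.
        for index, child in enumerate(schema):
            yield from find_array_locations(child, f"{path}/{index}")
```

Running it over each of the referenced subschemas would produce a flat list of candidate locations. Deciding which of them are unordered remains a manual step.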

MiguelElGallo commented 1 month ago

@mmueller2012 sorry forgot to tag you...

MiguelElGallo commented 1 month ago

@mmueller2012 do you think this is doable ?

mmueller2012 commented 4 weeks ago

@MiguelElGallo We took a quick look and found that the schema contains a lot of oneOf entries. They complicate things because SemanticDiff needs to distinguish the object type in order to apply rules conditionally. We haven't decided on a way forward yet, as we are currently finalizing the next SemanticDiff release. We will look into this further after the new version is released.
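To show why oneOf makes conditional rules harder, here is a small Python sketch of one possible way to pick the matching branch for an object. It is only an illustration under stated assumptions: it assumes each branch constrains at least one property with an `enum` (for example the resource `type`), the helper name `select_branch` is hypothetical, and it does not perform full JSON Schema validation.

```python
def select_branch(instance, branches):
    """Return the first oneOf branch whose enum-constrained properties match."""
    for branch in branches:
        props = branch.get("properties", {})
        # Collect the properties that this branch restricts to fixed values.
        constrained = {
            name: sub["enum"]
            for name, sub in props.items()
            if isinstance(sub, dict) and "enum" in sub
        }
        if constrained and all(
            instance.get(name) in allowed for name, allowed in constrained.items()
        ):
            return branch
    return None  # No branch could be identified from enum values alone.
```

Only after a branch has been selected this way could rules such as "this array is unordered" be applied to the object, which is why the object type has to be identified first.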

MiguelElGallo commented 4 weeks ago

@mmueller2012 Thanks! Let me know if you decide anything! Good luck with the release!