gnieh / diffson

A Scala diff/patch library for JSON
Apache License 2.0

Consider adding a standard Diff format Pretty Printer #431

Open davesmith00047 opened 2 months ago

davesmith00047 commented 2 months ago

Hi,

Just a suggestion, and perhaps this exists already and I missed it, but it would be really nice if you could pretty print to a standard diff.

At the moment, you provide things like circe instances to allow the diff to be turned into JSON - which is great - but what I really want is a human readable diff, like:

-val foo="hello"
+val foo="world"

The kind of output that json-diff produces:

(image: json-diff output)

satabin commented 1 month ago

Hi,

Thanks for the suggestion, this is interesting, and we could think about something like that. However, diffson is a library and aims at staying a library, not a CLI tool, so it represents diffs as structured objects, not strings. It also supports several standardized patch formats with different capabilities, all of them being structured JSON objects. For instance, it is not obvious how a JSON Patch containing a test operation would be represented in this format.

A better alternative would probably be to support a new output format that is suitable for these use cases (I will need to dig into the json-diff format for instance).

davesmith00047 commented 1 month ago

Thank you for the thoughts, all seems reasonable. I won't get into a protracted debate over it, but just wanted to offer the view from my side of the fence for your consideration:

I guess to some extent it depends what you think the use-case of a diffing library is. What do you think it is? Maybe we just have different requirements / expectations. 😄

I have two use cases:

  1. Machine readable diff that I can manipulate in code in order to - say - filter using rules and then act on. What does 'act on' mean? Could be metrics, I suppose? I'm simply using it to count errors to look for hot spots. But for me it's more likely to be...
  2. Showing the differences to a human, in logs or tests or reports out of Scala-CLI scripts or something. (I'm having to re-diff in json-diff for this, currently.)

So for the latter, what I need (maybe this exists and I missed it?) is a printing mechanism. Perhaps a typeclass or something that helps me understand what I need to implement to get printing to work. With that in place, it would then be nice if there was a pre-bundled git flavoured printer, for example, since everyone is used to seeing that and making sense of it, which would solve the problem for 90% of people, probably.
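
To make the idea concrete, here is a minimal sketch of what such a typeclass and a git-flavoured instance could look like. Everything in it is hypothetical: `Change`, `Added`, `Removed`, `Replaced` and `DiffPrinter` are illustrative names, not part of diffson, and the sketch assumes the old value of a replaced field is available to the printer.

```scala
// Hypothetical sketch: none of these names exist in diffson.
// A simplified change model where values are already rendered as strings.
sealed trait Change
final case class Added(path: String, value: String) extends Change
final case class Removed(path: String, value: String) extends Change
// Assumption: the old value is known, which a plain "replace" operation may not carry.
final case class Replaced(path: String, old: String, value: String) extends Change

// Typeclass describing how to render a value of type A as human readable text.
trait DiffPrinter[A] {
  def print(a: A): String
}

object DiffPrinter {
  // Git-flavoured instance: '-' prefixes removed lines, '+' prefixes added lines.
  val gitStyle: DiffPrinter[List[Change]] = new DiffPrinter[List[Change]] {
    def print(changes: List[Change]): String =
      changes.flatMap {
        case Added(p, v)       => List(s"+$p: $v")
        case Removed(p, v)     => List(s"-$p: $v")
        case Replaced(p, o, v) => List(s"-$p: $o", s"+$p: $v")
      }.mkString("\n")
  }
}
```

With this, `DiffPrinter.gitStyle.print(List(Replaced("/foo", "\"hello\"", "\"world\"")))` would yield a `-/foo: "hello"` line followed by a `+/foo: "world"` line. Other output styles would just be other instances of the same typeclass.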

Also yes it's a library, but if I want to use the library in a script, say, then currently I can do that for structurally diffing two JSON blobs... but then what? If I want to show the results to anyone, I have to know how to generate a nicely formatted diff and manually code it up.

Circe is very much a library, and you can use it to nicely print JSON. I think you should be able to keep your library as a library, and provide a mechanism for printing pretty diffs.

Just my ten cents, hope it helps. Of course if you just feel it's beyond the scope of the library - no problem - feel free to close the issue. 🙂

satabin commented 1 month ago

Thanks for the answer. I think the main use that I intended is to have a diff format to communicate between services based on standard formats. So clearly it was to be used by machines.

The only printing mechanism that exists so far in the library is the JSON serialization of the generated diffs, which means it shows a JSON structure of the diff.

I see your use case, and it's not something I had in mind. Of course this can be added. I just think it would require a different diff format than the one supported right now, since some operations are hardly printable in this format (test is one such example).

If the json-diff format is interesting, I can see how to add it, and this would have a Show instance that represents something similar.

One thing though: the json-diff tool shows a diff in the context of the input. A diff alone does not have all the information needed to be printed, since it won't have the entire content of the source to print as unchanged. So printing a diff, with no input provided together with it, won't help much here.
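
A rough sketch of why the source is needed, under simplifying assumptions: the document is flattened to ordered (path, value) pairs, and the patch is reduced to replacements and removals keyed by path. `ContextPrinter` and its parameters are hypothetical names for illustration, not diffson API, and additions are left out to keep it short.

```scala
// Hypothetical sketch: not diffson API.
object ContextPrinter {
  // source:  the original document, flattened to (path, rendered value) pairs in order
  // changes: path -> Some(newValue) for a replacement, path -> None for a removal
  def render(source: List[(String, String)],
             changes: Map[String, Option[String]]): String =
    source.flatMap { case (path, value) =>
      changes.get(path) match {
        // Unchanged lines can only come from the source, the patch never mentions them.
        case None                 => List(s" $path: $value")
        case Some(None)           => List(s"-$path: $value")
        case Some(Some(newValue)) => List(s"-$path: $value", s"+$path: $newValue")
      }
    }.mkString("\n")
}
```

Called with an empty `source`, `render` returns an empty string whatever `changes` contains, which is the point: without the input there is nothing to show the changes against.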

I think the big difference is that here you want to see the result of the diff, as a human, in a human readable format. This is already too late to happen if you only have the patch at hand. So the way to go would be a new patch format, that is a human readable string in this case.