molsonkiko / JsonToolsNppPlugin

A Notepad++ plugin providing tools for JSON like linting, querying, a tree view, and CSV conversion.
Apache License 2.0
85 stars, 9 forks

Duplicate Name Severity Overstated and Overreacted #77

Closed: JediSQL closed this issue 2 months ago

JediSQL commented 2 months ago

We have a process that generates JSON documents. When I tried to "Pretty Print" one with JsonTools, we got this dialog box:

———————————————————————————
View syntax errors in document?
———————————————————————————
There were 1 syntax errors in the document. Would you like to see them?

(You can turn off these prompts in the settings (offertoshowlint setting))
———————————————————————————
Yes   No   
———————————————————————————

I chose "Yes" to see the error. This was reported:

Severity    Description
————————————————————————————————
BAD         Object has multiple of key "data"

We were getting alarmed that we would need to revise our process. But then I checked the standards.

ECMA-404, The JSON Data Interchange Syntax, 2nd Edition, December 2017:

Section 6 Objects

... does not require that name strings be unique...

IETF RFC 8259, December 2017:

Section 4. Objects

The names within an object SHOULD be unique.

Based on the wording of the two relevant standards, I think it would be more appropriate to rate the severity of "Object has multiple of key..." as WARNING. The partner organization we send these JSON documents to has not complained that their JSON parser does not accept our documents.

In preparing sample code for this posting, I also found that the Pretty Print parser keeps only the last occurrence of a duplicated name and drops the earlier ones. Again, this does not faithfully support the flexibility of the standards.

Here is a sample duplicate-name JSON document that complies with the wording of both standards yet produces the "Object has multiple of key..." message with severity BAD:

{
  "entity": "KLJDOFJDLKFJ",
  "date_of_extraction": "06/30/2024",
  "version": "1.0.3.0",
  "index_date_type": "date_of_occurance",
  "forms": [
    {
      "form_name": "THINGAMBOB",
      "form_id": "THINGAMBOB",
      "data": [
        {
          "XYZ": 2006627,
          "SASLabel": "Repeat Point",
          "is_interval_calc": null,
          "value": "Timepoint 1 (1 year)"
        }
      ],
      "data": [
        {
          "XYZ": 2006627,
          "SASLabel": "Repeat Point",
          "is_interval_calc": null,
          "value": "Timepoint 2 (2 years)"
        }
      ]
    }
  ]
}

molsonkiko commented 2 months ago

@JediSQL

Your suggestions are not actionable, and will not be considered. You are welcome to ask for more explanation, but don't waste your time trying to convince me to change my mind. It is literally impossible for JsonTools to remember multiple instances of a duplicate key. If you want a technical explanation, I can provide one.

Many JSON parsers (including those for Python, JavaScript, and C#) silently ignore all but the last instance of a duplicate key. Based on my own painful experience, I believe it is more responsible to warn people about an issue that is guaranteed to cause data loss, rather than ignoring it silently.
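
To illustrate the last-key-wins behavior for one of the parsers named above, here is a minimal sketch using Python's standard `json` module. The `SAMPLE` string is an abbreviated, hypothetical cut of the document from this issue, not the full original.

```python
import json

# Abbreviated version of the sample from this issue (assumption: trimmed
# for brevity). One object contains the key "data" twice.
SAMPLE = """
{
  "form_name": "THINGAMBOB",
  "data": [{"value": "Timepoint 1 (1 year)"}],
  "data": [{"value": "Timepoint 2 (2 years)"}]
}
"""

# json.loads raises no exception and emits no warning for the repeated key.
form = json.loads(SAMPLE)

# Only the last "data" array survives; the Timepoint 1 entry is gone.
print(len(form))
print(form["data"])
```

The parse succeeds, so nothing downstream signals that the first `"data"` array was dropped.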

> I think it would be more appropriate to rate the severity of "Object has multiple of key..." as WARNING.

There is no WARNING level, so that suggestion is not actionable. The levels are OK, NAN_INF, JSONC, JSON5, BAD, and FATAL, as described in the documentation. I am not going to add more levels, because it's not worth the trouble.

> The partner organization we send these JSON documents to has not complained that their JSON parser does not accept our documents.

I'm sure the partner organization would be less than thrilled to discover that their parser was silently discarding some of the data you were sending them.
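
For anyone who wants to check whether a document would lose data this way, the following is a rough sketch, again assuming Python's standard `json` module. The helper name `find_duplicate_keys` is hypothetical and is not part of JsonTools; it relies on the `object_pairs_hook` parameter of `json.loads`, which receives every key/value pair of an object before duplicates are collapsed.

```python
import json
from collections import Counter


def find_duplicate_keys(text):
    """Return (key, count) tuples for every key repeated within a single object.

    Hypothetical helper for illustration; not taken from JsonTools.
    """
    duplicates = []

    def hook(pairs):
        # pairs is a list of (key, value) tuples in document order,
        # including every repeat of a duplicated key.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend((key, n) for key, n in counts.items() if n > 1)
        # Build the object the usual way, so the last value still wins.
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


doc = '{"form_id": "THINGAMBOB", "data": [1], "data": [2]}'
print(find_duplicate_keys(doc))
```

An empty list means no object in the document repeats a key. This only reports the repeats; it does not change which value the parser keeps.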