clingen-data-model / clinvar-streams

57: Fix false ingest differences. #59

Closed tbl3rd closed 1 year ago

tbl3rd commented 2 years ago

Messages are encoded as JSON, which lacks sets. Data that are semantically unordered are therefore encoded as JSON arrays, which imposes an ordering and creates spurious differences between messages.
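To make the problem concrete, here is a minimal Python sketch (the project itself is not written in Python). The message shape and the "genes" field are hypothetical, invented only for illustration. It shows two payloads that carry the same unordered data but compare unequal once decoded, because JSON arrays become ordered lists.

```python
import json

# Two hypothetical messages; "genes" is semantically a set, but JSON
# can only express it as an array, so each producer picks some order.
a = json.loads('{"id": 1, "genes": ["BRCA1", "TP53"]}')
b = json.loads('{"id": 1, "genes": ["TP53", "BRCA1"]}')

# A plain structural comparison sees a difference: list order matters.
naive_equal = a == b

# Comparing the same field as a set sees no difference.
set_equal = set(a["genes"]) == set(b["genes"])

print(naive_equal, set_equal)
```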

That mangled data is now confined to the content field of the message. The content field is itself encoded as a string, which must be parsed as JSON before being decoded into EDN.

Consequently, pass the contents of each JSON message file through decode, which lifts them into EDN and handles the stringified content field, then pass the resulting EDN to differ? to detect differences between messages.
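The decode-then-compare flow described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the code in this PR: the names decode and differ merely mirror decode and differ? from the description, the canonical helper is hypothetical, and treating every array as unordered is a simplification that the real differ? may not make.

```python
import json
from collections import Counter

CONTENT_FIELD = "content"  # the stringified field named in the description


def decode(text):
    """Parse a message, then parse its stringified content field as JSON."""
    message = json.loads(text)  # assumes the top level is a JSON object
    content = message.get(CONTENT_FIELD)
    if isinstance(content, str):
        message = {**message, CONTENT_FIELD: json.loads(content)}
    return message


def canonical(value):
    """Hypothetical helper: hashable, order-insensitive form of a value."""
    if isinstance(value, dict):
        return ("map", frozenset((k, canonical(v)) for k, v in value.items()))
    if isinstance(value, list):
        # Assumption: every array is an unordered bag (duplicates counted).
        counts = Counter(canonical(v) for v in value)
        return ("bag", frozenset(counts.items()))
    # Keep the type name so that True and 1 are not conflated.
    return ("scalar", type(value).__name__, value)


def differ(left_text, right_text):
    """True when two messages differ after ignoring array order."""
    return canonical(decode(left_text)) != canonical(decode(right_text))


# Hypothetical messages: same data, different array order inside content.
left = json.dumps({"id": "x", "content": json.dumps({"genes": ["A", "B"]})})
right = json.dumps({"id": "x", "content": json.dumps({"genes": ["B", "A"]})})
other = json.dumps({"id": "x", "content": json.dumps({"genes": ["A", "C"]})})

same_ignoring_order = not differ(left, right)
really_different = differ(left, other)
```

Comparing the raw strings left and right would report a difference, while differ(left, right) does not. A genuine change such as the one in other is still detected.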

tbl3rd commented 2 years ago

As usual, I don't know how to build this or run tests. And this still has to be inserted into the input stream somehow. @theferrit32, @tnavatar, and I discussed some approaches to that.