alephdata / followthemoney

Data model and processing tools for investigative entity data
https://followthemoney.tech
MIT License

Diffing streams #1234

Open jbothma opened 1 year ago

jbothma commented 1 year ago

It would be nice to be able to diff two streams and see

This needs to ignore differences in ordering, both of entities within the stream and of property values within an entity.
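
To make the ordering requirement concrete, here is a minimal sketch of an order-insensitive comparison. It assumes each stream is line-delimited JSON where every line is an object with an `id` and a `properties` mapping of property names to lists of string values. The function names `load_stream` and `diff_streams` are hypothetical and are not part of followthemoney. It also assumes each ID appears at most once per stream; if an ID repeats, the last occurrence wins.

```python
import json
from typing import Dict, Iterable, List

Entity = Dict[str, object]


def load_stream(lines: Iterable[str]) -> Dict[str, Entity]:
    """Index a line-delimited JSON stream by entity ID (hypothetical helper)."""
    entities: Dict[str, Entity] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        data = json.loads(line)
        props = data.get("properties", {})
        # Sort the values so that value order within a property is ignored.
        data["properties"] = {name: sorted(vals) for name, vals in props.items()}
        # Keying by ID makes the order of entities in the stream irrelevant.
        entities[data["id"]] = data
    return entities


def diff_streams(old_lines: Iterable[str], new_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Return the IDs that were added, removed or changed between two streams."""
    old = load_stream(old_lines)
    new = load_stream(new_lines)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(i for i in old.keys() & new.keys() if old[i] != new[i]),
    }


old_stream = [
    '{"id": "a", "schema": "Person", "properties": {"name": ["Jane Doe", "J. Doe"]}}',
    '{"id": "b", "schema": "Company", "properties": {"name": ["Acme"]}}',
]
new_stream = [
    '{"id": "b", "schema": "Company", "properties": {"name": ["Acme Ltd"]}}',
    '{"id": "a", "schema": "Person", "properties": {"name": ["J. Doe", "Jane Doe"]}}',
    '{"id": "c", "schema": "Person", "properties": {"name": ["New Person"]}}',
]
result = diff_streams(old_stream, new_stream)
print(result)
```

In this sample, entity `a` only differs in the order of its `name` values and in its position in the stream, so it is not reported as changed. Loading both streams into memory keeps the sketch short, but it would not scale to very large streams.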

It would also be nice if it could

This, along with the ability to extract one or more entities using a whitelist of IDs, would make it much easier to pull out a sample of specific entities and inspect the differences.
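
The ID-based extraction could look roughly like the sketch below. It assumes the same line-delimited JSON layout with a top-level `id` on every line. `extract_entities` is a hypothetical name, not an existing followthemoney function or CLI command.

```python
import json
from typing import Iterable, Iterator, Set


def extract_entities(lines: Iterable[str], wanted_ids: Set[str]) -> Iterator[str]:
    """Yield only the stream lines whose entity ID is in wanted_ids (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Lines are passed through unchanged so the output is still a valid stream.
        if json.loads(line).get("id") in wanted_ids:
            yield line


sample_stream = [
    '{"id": "a", "schema": "Person", "properties": {"name": ["Jane Doe"]}}',
    '{"id": "b", "schema": "Company", "properties": {"name": ["Acme"]}}',
    '{"id": "c", "schema": "Person", "properties": {"name": ["New Person"]}}',
]
sample = list(extract_entities(sample_stream, {"a", "c"}))
for line in sample:
    print(line)
```

Because the filter is a generator that reads one line at a time, it could be applied to an open file or to standard input without loading the whole stream.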

My use case is when working on code producing a stream, and wanting to see the differences between a known version and a new version.

pudo commented 1 year ago

Some fraction of this is implemented here, but it's so bad we never really advertised it :/

https://github.com/opensanctions/opensanctions/tree/main/contrib/delta