Closed: titaniumbones closed this issue 7 years ago
@WestleyArgentum have you been thinking about this stuff at all? Have you seen some of the stuff that's happening in https://github.com/edgi-govdata-archiving/web-monitoring ?
Hey, sorry I missed the call last Friday. I really tried to hop on, but it's a long story.
I don't 100% have a handle on the new setup, but tomorrow I'll try to figure it out and submit a PR to move this over.
Hey @titaniumbones, I've been looking at the new setup and trying to figure out where this fits.
It looks like both monitoring-ui and monitoring-processing are querying PageFreezer for diffs right now, and I think this would be the service they would query instead.
Will there be regular Friday meetings? If so, I'd love to talk about this and the overall vision tomorrow.
@WestleyArgentum -- my understanding is that there isn't a standing Friday meeting; I pinged you on that conversation in Slack.
RE: how the components interact -- as I understand it, monitoring-ui would eventually pull from a service rather than querying PageFreezer directly, but @danielballan, @Mr0grog or @lightandluck would be better placed to confirm.
That's right. Because the monitoring-ui repo used to be pagefreezer-cli, there is still some legacy code in there that needs to be cleaned out -- hence the querying code you found. But the plan is for monitoring-ui to interact with monitoring-db (a Rails app), which in turn interacts with monitoring-processing. For the short term, monitoring-db is making calls to Versionista, but in the long term (as we move to PageFreezer) monitoring-processing will make the calls and feed monitoring-db's SQL databases with diffs for human analysts to evaluate. See https://github.com/edgi-govdata-archiving/web-monitoring/issues/29 for reasonably current discussion on this architecture. And, as I said on Slack, I'd be happy to chat briefly today if you want a quick rundown.
I'll post our conclusion from the call for @titaniumbones and @dcwalk: Since this is implemented as a web service, I think it actually makes sense to keep it as its own repo and integrate it with web-monitoring-processing via REST calls. Any future work on other diffing approaches should be implemented in the web-monitoring-processing repo if possible, but I think it's fine to maintain this one here. Also, by incorporating it, we can test that our schema is sufficiently flexible (i.e., not tightly coupled to PageFreezer).
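To make the REST integration idea concrete, here is a rough sketch of how web-monitoring-processing might call a standalone differ and store the result in a record that is not tied to PageFreezer. Everything specific here is an assumption for illustration, not something agreed in this thread: the endpoint URL, the `a`/`b` query parameters, the `diff` key in the JSON response, and the `DiffRecord` fields are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; the thread does not specify a URL or route.
DIFFER_URL = "http://localhost:8000/diff"


@dataclass
class DiffRecord:
    """Source-agnostic diff record: no field is specific to PageFreezer."""
    source: str  # which diffing service produced the result
    url_a: str   # first version being compared
    url_b: str   # second version being compared
    diff: str    # opaque payload; its format depends on the service


def build_request_url(url_a: str, url_b: str, base: str = DIFFER_URL) -> str:
    """Build the GET URL for the (assumed) diff endpoint."""
    return base + "?" + urlencode({"a": url_a, "b": url_b})


def http_get(url: str) -> str:
    """Default fetcher: a plain HTTP GET using the standard library."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def fetch_diff(
    url_a: str,
    url_b: str,
    source: str = "web-monitoring-differ",
    get: Callable[[str], str] = http_get,
) -> DiffRecord:
    """Request a diff over REST and wrap it in a DiffRecord.

    Assumes the service answers with JSON containing a "diff" key.
    """
    payload = json.loads(get(build_request_url(url_a, url_b)))
    return DiffRecord(source=source, url_a=url_a, url_b=url_b, diff=payload["diff"])
```

Passing the fetcher in as `get` means a second diffing service (or a test stub) can be swapped in without touching the record type, which is one cheap way to check that the schema is not coupled to a single provider.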
Thanks for the update -- it seems like it might still be beneficial to rename the repo to make clear it is in the web monitoring family?
i.e., "web-monitoring-differ"?
Makes sense, I'll do that
@WestleyArgentum I took the liberty of doing the name change myself, just now, so I could include a link to this repo edgi-govdata-archiving/web-monitoring#15
We'd like to bring all of the web monitoring tools into a single repo, which we will rename (edgi-govdata-archiving/pagefreezer-cli#23). How would you feel about moving this work into that repo, maybe also co-ordinating with other people doing similar work?