edgi-govdata-archiving / web-monitoring-differ

🔍 Node.js diffing service for the website monitoring project
MIT License

Move into pagefreezer-cli? #3

Closed: titaniumbones closed 7 years ago

titaniumbones commented 7 years ago

We'd like to bring all of the web monitoring tools into a single repo, which we will rename (edgi-govdata-archiving/pagefreezer-cli#23). How would you feel about moving this work into that repo, and maybe also coordinating with other people doing similar work?

titaniumbones commented 7 years ago

@WestleyArgentum have you been thinking about this stuff at all? Have you seen some of the stuff that's happening in https://github.com/edgi-govdata-archiving/web-monitoring ?

WestleyArgentum commented 7 years ago

Hey, sorry I missed the call last Friday. I really tried to hop on, but it's a long story.

I don't 100% have a handle on the new setup, but tomorrow I'll try to figure it out and submit a PR to move this over.

WestleyArgentum commented 7 years ago

Hey @titaniumbones, I've been looking at the new setup and trying to figure out where this fits.

It looks like both monitoring-ui and monitoring-processing are querying PageFreezer for diffs right now, and I think this would be the service they would query instead.

Will there be regular Friday meetings? If so, I'd love to talk about this and the overall vision tomorrow.

dcwalk commented 7 years ago

@WestleyArgentum -- my understanding is that there isn't a standing Friday meeting. I pinged you on that conversation in Slack.

RE: how the components interact -- as I understand it, monitoring-ui would eventually pull diffs from a service rather than querying PageFreezer directly, but @danielballan, @Mr0grog or @lightandluck would be better placed to confirm.

danielballan commented 7 years ago

That's right. Because the monitoring-ui repo used to be pagefreezer-cli, there is still some legacy code in there that needs to be cleaned out -- hence the querying code you found. But the plan is for monitoring-ui to interact with monitoring-db (a Rails app), which in turn interacts with monitoring-processing. In the short term, monitoring-db is making calls to Versionista, but in the long term (as we move to PageFreezer) monitoring-processing will make the calls and feed monitoring-db's SQL databases with diffs for human analysts to evaluate. See https://github.com/edgi-govdata-archiving/web-monitoring/issues/29 for reasonably current discussion on this architecture. And, as I said on Slack, I'd be happy to chat briefly today if you want a quick rundown.

danielballan commented 7 years ago

I'll post our conclusion from the call for @titaniumbones and @dcwalk: Since this is implemented as a web service, I think it actually makes sense to keep it as its own repo and integrate it with web-monitoring-processing via REST calls. Any future work on other diffing approaches should be implemented in the web-monitoring-processing repo if possible, but I think it's fine to maintain this one here. Also, by incorporating it we can test that our schema is sufficiently flexible (i.e., not tightly coupled to PageFreezer).
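
To make "integrate via REST calls" a bit more concrete, here is a rough sketch of what a client on the processing side might look like. Nothing about the differ's actual API is specified in this thread, so the `/diff` path, the `a` and `b` query parameters, and the JSON response are all assumptions made for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_diff_url(service_url, old_url, new_url):
    """Build the request URL for a hypothetical diffing endpoint.

    The "/diff" path and the "a"/"b" parameter names are assumptions,
    not the real interface of this service.
    """
    query = urlencode({"a": old_url, "b": new_url})
    return f"{service_url.rstrip('/')}/diff?{query}"


def fetch_diff(service_url, old_url, new_url, opener=urlopen):
    """Request a diff of two page versions and decode the JSON body.

    `opener` is injectable so the call can be exercised without a
    running service; by default it performs a real HTTP GET.
    """
    with opener(build_diff_url(service_url, old_url, new_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the caller only depends on a URL and a JSON payload, the same client shape could be pointed at PageFreezer, this service, or another differ, which is the kind of loose coupling the schema test above is after.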

dcwalk commented 7 years ago

Thanks for the update -- it seems like it might still be beneficial to rename the repo to make clear it is in the web monitoring family?

i.e., "web-monitoring-differ"?

WestleyArgentum commented 7 years ago

Makes sense, I'll do that

danielballan commented 7 years ago

@WestleyArgentum I took the liberty of doing the name change myself, just now, so I could include a link to this repo edgi-govdata-archiving/web-monitoring#15