stvnrlly closed this issue 5 years ago
I think we can just go on blaming OANC. If they're going to provide a calendar, they have to expect that people will use it.
I think we can do both. If there's a way to provide it, it would be nice to have the accurate information. Having a tool to edit the information also gives us a place where we can say "our data comes from DC, so let them know when it's inaccurate".
I'm working on at least hiding the wrong info by manually editing meetings.json...
Will manually editing the file cause the incorrect one to get added back in when the meeting scraper runs again?
Well that's why I didn't just say I was editing the file. That's the logic I was implementing!
Ah, I gotcha.
@JoshData did any of your work make it into the repo?
I was just thinking that some kind of unique identifier for meetings could help solve this. I'm making this up because I don't know how they work, but maybe a hash would be helpful here?
Hashing doesn't really change anything. We're effectively using the ANC and meeting date/time as the identifier.
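To illustrate the point about hashing, here is a minimal sketch. The helper names `meeting_key` and `hashed_key` are made up for this example and are not taken from the repo. A hash of the ANC and date/time is just a function of those same fields, so a rescheduled meeting gets a new identifier either way.

```python
import hashlib
from datetime import datetime


def meeting_key(anc, when):
    """Hypothetical helper: identify a meeting by its ANC and start time."""
    return (anc.upper(), when.isoformat(timespec="minutes"))


def hashed_key(anc, when):
    """Hypothetical helper: hash the same two fields.

    This adds no information. It changes exactly when meeting_key changes.
    """
    raw = "|".join(meeting_key(anc, when))
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


original = datetime(2014, 1, 8, 19, 0)
rescheduled = datetime(2014, 1, 9, 19, 0)

# Same ANC, different time: both identifiers differ, so neither one
# can tell us that this is "the same meeting, moved by a day".
print(meeting_key("1A", original) == meeting_key("1A", rescheduled))
print(hashed_key("1A", original) == hashed_key("1A", rescheduled))
```

An identifier that survives a reschedule would have to come from the source calendar itself, for example an event ID, if DC publishes one.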
But, yes, this is working. Except I guess I'm the only one who can modify the meetings because the data is stored in a local JSON file.
To delete a meeting, I add "status": "invalid", to the meeting record. Actually deleting it won't work because it would be added back next time the scraper is run. (https://github.com/codefordc/ancfinder/blob/master/ancfindersite/views.py#L79)
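A rough sketch of how the site could skip those records when listing meetings. The record layout here (a dict keyed by date/time string, with a `building` field) and the function name `visible_meetings` are assumptions for illustration, not the actual code in views.py.

```python
def visible_meetings(meetings):
    """Return only the meetings that should be shown.

    `meetings` is assumed to map a date/time string to a meeting record.
    Records flagged "status": "invalid" stay in the file, so the scraper
    still sees them as known, but they are hidden from display.
    """
    return {
        when: record
        for when, record in meetings.items()
        if record.get("status") != "invalid"
    }


# Hypothetical sample data.
meetings = {
    "2014-01-08T19:00:00": {"building": "Community Center"},
    "2014-02-12T19:00:00": {"building": "Wrong Place", "status": "invalid"},
}

shown = visible_meetings(meetings)
print(sorted(shown))
```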
To add a meeting, I add "status": "manual", to a new meeting record. Normally, meetings that aren't seen in DC's calendar are reaped from our file, on the assumption that they have been canceled or rescheduled. Marking them as manual prevents the meeting from getting reaped. (https://github.com/codefordc/ancfinder/blob/master/scripts/update_meeting_times.py#L137)
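Putting the two status values together, the scraper's merge step might look roughly like this. It is a simplified sketch under assumed data shapes (dicts keyed by date/time string), and `merge_scraped` is a hypothetical name, not the function in update_meeting_times.py.

```python
def merge_scraped(existing, scraped):
    """Merge freshly scraped meetings into the existing records.

    - Scraped meetings are kept, but an existing "invalid" flag is carried
      over so a manually hidden meeting is not silently restored.
    - Existing meetings missing from the scrape are reaped, unless they are
      flagged "manual".
    """
    merged = {}
    for when, record in scraped.items():
        new_record = dict(record)
        if existing.get(when, {}).get("status") == "invalid":
            new_record["status"] = "invalid"
        merged[when] = new_record
    for when, record in existing.items():
        if when not in scraped and record.get("status") == "manual":
            merged[when] = record
    return merged


# Hypothetical sample data.
existing = {
    "A": {"building": "X", "status": "invalid"},  # hidden by hand
    "B": {"building": "Y", "status": "manual"},   # added by hand
    "C": {"building": "Old"},                     # dropped from DC's calendar
}
scraped = {
    "A": {"building": "X"},
    "D": {"building": "New"},
}

merged = merge_scraped(existing, scraped)
print(sorted(merged))
```

In this sketch an "invalid" record is also reaped once DC drops the meeting from its calendar, since the flag is no longer needed at that point.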
Okay, cool. I'd like to make the information more widely editable, but maybe that requires thinking about user permissions?
Also, is there something besides the date that we can use as an identifier? That way, we could just edit the object with changes and still have a way to check that we already scraped it. Or, we could add an edited field to the object that includes the edits, and prefer that over the original if it exists.
Or, we could add an edited field to the object that includes the edits, and prefer that over the original if it exists.
Maybe. The whole record may still get reaped if the original disappears from dc.gov, e.g. if they update their calendar and drop the incorrect meeting then our edited information would disappear. That could be good or bad.
I'd be okay with that. I think that if DC makes an affirmative change, we should be able to assume that it is correct.
Should there be a way to update ANC meeting info the same way that commissioner info can be updated? This can help fix incorrect information on DC's site.
In order to avoid messing with the scraper, this could involve adding a corrected_info field to the meeting with the updated information, rather than overwriting it.
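As a sketch of that idea: the scraped fields stay untouched, and corrections live in a separate `corrected_info` field that wins at display time. The function name `effective_meeting` and the sample fields (`building`, `room`) are assumptions for illustration.

```python
def effective_meeting(record):
    """Return the meeting as it should be displayed.

    Scraped values are used by default. Any key present in the optional
    "corrected_info" dict overrides the scraped value. The input record
    is not modified, so the scraper can keep comparing against the
    original data.
    """
    corrections = record.get("corrected_info") or {}
    shown = {k: v for k, v in record.items() if k != "corrected_info"}
    shown.update(corrections)
    return shown


# Hypothetical sample record.
record = {
    "building": "Wrong Building",
    "room": "101",
    "corrected_info": {"building": "Right Building"},
}

print(effective_meeting(record))
```

If DC later removes the meeting and the record is reaped, the correction goes with it, which matches the behavior discussed earlier in the thread.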