Closed hancush closed 6 years ago
FWIW, we have the correct version on file in our `metro-pdf-merger` S3 bucket.
Also, to be clear, the correct version of the report comes back when you download the report from the board report and event pages. The only place we are showing the wrong report is the PDF pane on the board report page.
In the OCD and Councilmatic database, the URL for a Board Report points to the latest version of the PDF (generated here).
When Councilmatic needs to render a PDF, it visits https://pic.datamade.us/lametro/document/, where the property image cache (PIC) does some work: it builds a `key` from the document URL, and then looks for that key in our `councilmatic-document-cache` S3 bucket.

What's the issue? We use the Board Report's URL as the key, and this URL remains stable even when the document that Legistar serves changes. The PIC would not know about such changes, since it has already cached an earlier version.
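To make the failure mode concrete, here is a toy model of a URL-keyed cache. It is only a sketch: the dictionaries, the `fetch_document` function, and the placeholder URL are all hypothetical stand-ins, not the actual property-image-cache code.

```python
# Toy model of a URL-keyed cache. All names here are hypothetical.
cache = {}     # stands in for the S3 bucket
upstream = {}  # stands in for Legistar: URL -> current PDF bytes


def fetch_document(url):
    """Return the cached copy if the key exists, else fetch and cache it."""
    key = url  # the key depends only on the URL, never on the content
    if key not in cache:
        cache[key] = upstream[url]
    return cache[key]


url = "https://example.com/board-report.pdf"  # placeholder URL

upstream[url] = b"July version"
first = fetch_document(url)   # caches the July version

upstream[url] = b"revised version"  # the document changes, the URL does not
second = fetch_document(url)  # cache hit: still the July version

del cache[url]                # the immediate fix: delete the cached entry
third = fetch_document(url)   # cache miss: re-fetched, now the revised version
```

Because the key never changes, the only ways out are to delete the entry, change how the key is built, or force a refresh.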
[x] Immediate solution: delete the entry in the S3 bucket, and revisit the page in question: https://boardagendas.metro.net/board-report/2018-0140/
[x] Change how we generate the S3 keys, or change how we scrape the pdf URLs, or force refresh the cache.
@reginafcompton would it be too hamfisted / difficult to connect services, to update the cached image when the bill changes?
We need to ensure that the property-image-cache has the most up-to-date PDFs of board reports. An effective strategy for doing this: delete the old PDFs from the S3 bucket whenever a bill gets updated (then, the `document` route will create a new entry in AWS when someone visits a board report page on the Councilmatic site).
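The delete-on-update strategy could look roughly like the sketch below. The `refresh_document_cache` function, its `make_key` parameter, and the `FakeS3Client` stand-in are hypothetical and are not the script that was eventually merged. The one real API assumed is the boto3 S3 client's `delete_object(Bucket=..., Key=...)`, which also succeeds when the key is absent.

```python
class FakeS3Client:
    """In-memory stand-in for a boto3 S3 client (hypothetical, for the demo)."""

    def __init__(self, objects):
        self.objects = dict(objects)  # {(bucket, key): body}

    def delete_object(self, Bucket, Key):
        # Like S3, deleting a key that does not exist is not an error.
        self.objects.pop((Bucket, Key), None)
        return {}


def refresh_document_cache(s3_client, bucket, updated_urls, make_key=lambda url: url):
    """Delete the cached copy of every document whose bill or event changed.

    `make_key` must mirror however the cache turns a URL into an S3 key;
    the identity default is an assumption made for this sketch.
    """
    deleted = []
    for url in updated_urls:
        key = make_key(url)
        s3_client.delete_object(Bucket=bucket, Key=key)
        deleted.append(key)
    return deleted


# With real credentials, s3_client would be boto3.client("s3") instead.
client = FakeS3Client({("councilmatic-document-cache", "url-a"): b"old pdf"})
deleted = refresh_document_cache(
    client, "councilmatic-document-cache", ["url-a", "url-b"]
)
```

Here `url-b` was never cached, and the call still completes, so the command does not need to check whether a key exists before deleting it.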
After consulting with @evz, a good solution entails devising a new management command that does the following:

… `import_data` …

… `delete` function in S3 simply "does not remove any objects" if the bucket does not contain the specified key.)

The logic we discussed for consistent treatment of reports and PDF rendering is to:
@shrayshray - Councilmatic now has a script that will refresh the document cache, every time a bill or event changes. We'll review this script, merge it, and add it to the data import pipeline early next week.
For the PDF rendering, I'll add the logic you note – though that seems like it's more related to this issue: https://github.com/datamade/la-metro-councilmatic/issues/345. So, I'll keep track of any relevant updates there.
@shrayshray - I've added the script for refreshing the document cache to the Metro data pipeline! Closing this issue.
Board report 2018-0140 was slated for a meeting during the summer but was postponed. While the link we have on file in the bill documents table resolves to the correct file in Legistar, the URL generated by the `full_text_document_url` template tag still points to the July version on pic.datamade.us.