Closed conantp closed 1 year ago
Merging #216 (4888131) into main (e883508) will increase coverage by
0.05%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## main #216 +/- ##
==========================================
+ Coverage 72.51% 72.57% +0.05%
==========================================
Files 64 64
Lines 3493 3493
==========================================
+ Hits 2533 2535 +2
+ Misses 960 958 -2
Impacted Files | Coverage Δ | |
---|---|---|
..._backend/pipeline/generate_event_index_pipeline.py | 97.29% <ø> (ø) |
|
cdp_backend/file_store/functions.py | 80.85% <100.00%> (ø) |
|
cdp_backend/tests/file_store/test_functions.py | 100.00% <100.00%> (ø) |
|
cdp_backend/tests/test_utils.py | 100.00% <0.00%> (+25.00%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
Link to Relevant Issue
This PR relates to this discussion on Slack:
https://councildataproject.slack.com/archives/CMMNC8Y7P/p1664648853737539?thread_ts=1663622963.782599&cid=CMMNC8Y7P
Description of Changes
For the Asheville CDP instance, we encountered an issue where our event index / search was not being updated for new events. After investigating, I realized that the GCS index-chunks files were not being updates.
The generate_event_index_pipeline.py routine calls chunk_index, which in turn calls fs_functions.upload_file.
The upload_file function checks if the file exists on the remote resource - if so, it just returns the existing URI. As a result, there is no ability to overwrite existing index-chunk files.
https://github.com/CouncilDataProject/cdp-backend/blob/e8835080d2f563f850a346baba17f26da67fe822/cdp_backend/file_store/functions.py#L64
This PR adds an optional parameter, overwrite to the upload_file function. It modifies generate_event_index_pipeline.py to set overrwrite=true when upload_file is called.
I attempted to add tests for this modification in behavior, included cases to confirm the same URI is returned if the file does / does not exist and overwrite=True.
In test_functions.py, I modified the EXISTING_FILE_URI variable - I felt that it indicated a URI that was not in alignment with what the upload_file function should be expected to produce. If the given SAVE_NAME is included in the call to upload_file, the EXISTING_FILE_URI would include that as part of the URI.
I'm not super familiar with how the mock.patch calls work - but I believe they are changing the behavior of "cdp_backend.file_store.functions.get_file_uri" in a way that caused the tests to pass previously - though they should have failed. For example, previously, EXISTING_FILE_URI = "gs://bucket/test_file.json". However, the test parameters indicated that there was an existing file with URI = EXISTING_FILE_URI, but the requested upload was for SAVE_NAME ("fakeSaveName"). The resulting URI should be "gs://bucket/fakeSaveName"
I'm not completely familiar with writing tests in Python — any review or feedback is appreciated!