OasisLMF / OasisPlatform

Loss modelling platform.
BSD 3-Clause "New" or "Revised" License

Lot3 - File caching issue #1029

Open sambles opened 2 months ago

sambles commented 2 months ago


Issue Description

File caching in the OasisDataManager package is keyed on the filename rather than on a hash of the file data. This causes a problem with filenames such as analysis_1_inputs.tar.gz (the input tar containing the oasis files): the name is reused when the inputs are regenerated, so a stale cached copy can be returned.

Example

  1. Generate inputs with loc_file = A --> file stored as analysis_1_inputs.tar.gz
  2. Run Generate Losses with loc_file = A --> file cached as analysis_1_inputs.tar.gz
  3. Update portfolio with loc_file = B
  4. Generate inputs with loc_file = B --> stored file overwritten under the same name, analysis_1_inputs.tar.gz
  5. Generate Losses:
run_analysis[8fa67556-5a22-4c79-885c-d78f7c4c90d7]: {'oasislmf': '2.3.4', 'ktools': 'fmcalc : version: 3.12.2 - git update: ', 'platform': '2.3.4'}
[2024-05-02 09:12:49,079: INFO/ForkPoolWorker-1] Get from Cache: oasis/server/analysis_1_inputs.tar.gz

(This is the original analysis_1_inputs.tar.gz tar file from Step 1, not the one regenerated in Step 4.)
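The steps above can be reproduced outside the platform with a small sketch. `NameKeyedCache` and `reproduce` are hypothetical names made up for illustration; this is not the OasisDataManager code, only a stand-in that caches on the filename alone, which is the behaviour described in this issue.

```python
import os
import shutil
import tempfile


class NameKeyedCache:
    """Hypothetical stand-in for a cache keyed only on the filename."""

    def __init__(self, store_dir, cache_dir):
        self.store_dir = store_dir
        self.cache_dir = cache_dir

    def get(self, name):
        cached = os.path.join(self.cache_dir, name)
        # A hit is decided by the name alone; the stored data is never compared.
        if not os.path.exists(cached):
            shutil.copy(os.path.join(self.store_dir, name), cached)
        return cached


def write(path, data):
    with open(path, "wb") as f:
        f.write(data)


def read(path):
    with open(path, "rb") as f:
        return f.read()


def reproduce():
    """Return what the cache serves before and after the stored file is overwritten."""
    with tempfile.TemporaryDirectory() as store, tempfile.TemporaryDirectory() as cache_dir:
        cache = NameKeyedCache(store, cache_dir)
        name = "analysis_1_inputs.tar.gz"
        stored = os.path.join(store, name)

        write(stored, b"inputs built from loc_file A")  # step 1
        first = read(cache.get(name))                   # step 2: file gets cached

        write(stored, b"inputs built from loc_file B")  # steps 3-4: same name, new data
        second = read(cache.get(name))                  # step 5: served from the cache
        return first, second
```

Calling `reproduce()` returns the loc_file A payload both times, matching the "Get from Cache" log line above.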

sambles commented 2 months ago

Stephane: I don't know how fast it is to compute a good hash on a big file, but that would be my first try. The second would be the date of last change, if we can access it.
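A rough sketch of both options, assuming the file is readable from a local path. The helper names `file_digest`, `mtime_key`, and `cache_name` are hypothetical and not part of OasisDataManager. Reading in chunks keeps memory use flat on large tar files, but the whole file still has to be read once to hash it, so for remote backends a cheaper version marker (for example a last-modified timestamp, if the store exposes one) may be preferable.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of the file contents, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def mtime_key(path):
    """Cheaper fallback: last-modified time (ns) plus size, no content read."""
    st = os.stat(path)
    return f"{st.st_mtime_ns}-{st.st_size}"


def cache_name(name, path, use_hash=True):
    """Hypothetical cache key: a content/version marker prefixed to the filename."""
    version = file_digest(path) if use_hash else mtime_key(path)
    return f"{version}_{name}"


def demo():
    """Show that the hash-based key changes when the data changes under the same name."""
    name = "analysis_1_inputs.tar.gz"
    with tempfile.TemporaryDirectory() as store:
        path = os.path.join(store, name)
        with open(path, "wb") as f:
            f.write(b"inputs built from loc_file A")
        key_a = cache_name(name, path)
        with open(path, "wb") as f:
            f.write(b"inputs built from loc_file B")
        key_b = cache_name(name, path)
        return key_a, key_b
```

With a key like this, the overwritten file from step 4 would map to a different cache entry than the one cached in step 2, so the stale copy would not be returned. The mtime variant avoids reading the file but depends on the backend reporting modification times reliably.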

Try changing the caching logic in https://github.com/OasisLMF/OasisDataManager/blob/b0dccf5cbae3d29026a045ed54dde1f95f8d09bf/oasis_data_manager/filestore/backends/base.py#L175-L226