PermafrostDiscoveryGateway / viz-staging

PDG Visualization staging pipeline
Apache License 2.0

Handle appending vectors to a tile that is locked #5

Closed robyngit closed 2 years ago

robyngit commented 2 years ago

When the viz-staging step is run in parallel during the viz-workflow, errors arise when two processes try to save polygons to the same tile file at the same time. Some of the errors that came up during testing that I believe are related to this issue are:

The TileStager.save_tiles() method should check whether a tile file is locked for writing and, if so, wait for it to become available before trying to append to it. Alternatively, if a tile already exists, the TileStager could save the new polygons to a separate file, and the separate files could then be merged into one tile at the end of staging.
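For illustration, the second alternative (write to a separate file, merge at the end) might look roughly like the sketch below. This is not code from viz-staging: save_part and merge_parts are hypothetical helper names, the ".part" suffix is an assumption, and plain text rows stand in for the polygon records that the real stager writes to vector files.

```python
import os
import uuid
from glob import escape
from pathlib import Path


def save_part(tile_path: Path, rows: list[str]) -> Path:
    """Write rows to a uniquely named part file next to the tile.

    Each call gets its own file, so parallel workers never write to
    the same path. (Hypothetical helper, not part of viz-staging.)
    """
    tile_path.parent.mkdir(parents=True, exist_ok=True)
    unique = f"{os.getpid()}-{uuid.uuid4().hex}"
    part = tile_path.with_name(f"{tile_path.name}.{unique}.part")
    part.write_text("".join(row + "\n" for row in rows))
    return part


def merge_parts(tile_path: Path) -> int:
    """Append every part file for this tile to the tile, then delete the parts.

    Meant to run once, in a single process, after all staging workers
    have finished. Returns the number of part files merged.
    """
    pattern = f"{escape(tile_path.name)}.*.part"
    parts = sorted(tile_path.parent.glob(pattern))
    with tile_path.open("a") as tile:
        for part in parts:
            tile.write(part.read_text())
            part.unlink()
    return len(parts)
```

The trade-off is an extra merge pass after staging in exchange for never needing a lock while the workers are running.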

robyngit commented 2 years ago

Used filelock within the TileStager.save_tiles method, which should prevent another process from writing to the same tile at the same time. Tested by re-staging the same files* in parallel that threw the errors listed above. Did not get any "is locked" errors, so I think this solves the problem. Though given that @KastanDay didn't see any of these errors when he ran the code, it's possible that these write conflicts happen rarely. We can re-open this issue if it comes up again.
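For readers unfamiliar with filelock, a minimal sketch of the pattern is shown below. It is not the actual TileStager.save_tiles code: append_to_tile, the ".lock" sidecar file name, the 60 second timeout, and the plain text rows are all assumptions made for illustration. Only FileLock and its timeout argument come from the filelock package.

```python
from pathlib import Path

from filelock import FileLock


def append_to_tile(tile_path, rows, timeout=60.0):
    """Append rows to a tile file while holding an exclusive lock.

    Hypothetical helper: the lock lives in a sidecar file next to the
    tile. A second process calling this for the same tile blocks until
    the first one releases the lock; filelock.Timeout is raised if the
    lock is still held after `timeout` seconds.
    """
    tile_path = Path(tile_path)
    tile_path.parent.mkdir(parents=True, exist_ok=True)
    lock = FileLock(str(tile_path) + ".lock", timeout=timeout)
    with lock:  # acquired here, released when the block exits
        with tile_path.open("a") as tile:
            tile.writelines(row + "\n" for row in rows)
```

Because the lock is keyed on the tile path, workers writing to different tiles do not wait on each other; only writes to the same tile are serialized.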

*Staged: 448 files (24.73 GB) from the /home/pdg/data/ice-wedge-polygon-data/version 01/alaska/207_208_209_223_224_iwp directory on datateam.

KastanDay commented 2 years ago

Awesome. That seems like a more or less ideal solution. I'll do larger testing ASAP (i.e. this week) to see if I can reproduce this bug.