NeurodataWithoutBorders / lindi

Linked Data Interface (LINDI) - cloud-friendly access to NWB data
BSD 3-Clause "New" or "Revised" License
1 stars 1 forks source link

LocalCache: limit allowed blob size #69

Closed magland closed 1 month ago

magland commented 1 month ago

I found a situation where the chunk size was larger than 1 GB and the sqllite save failed. This particular dataset was not chunked, and this should not be an issue for most dandisets where chunk sizes are reasonable.

So with this PR I limit the size of chunks in local cache to 900 MB.

See https://www.sqlite.org/limits.html

(For some reason, despite what it says in that limits.html, it seems that 1000 MB doesn't work whereas 900 MB does)

For LindiH5ZarrStore and LindiReferenceFileSystemStore, I display a warning if the chunk size is too large, and then skip storing it in local cache. For LindiRemfile I raise an exception because this should never happen in that context.

I added a test.

rly commented 1 month ago

Since the 900 MB limit is specific to the sqlite cache and I would like to keep the magic numbers in one place rather than four, could you move the if statement to LocalCacheSQLiteClient.put_remote_chunk and raise an exception there if it is too big? We can catch the exception and raise a warning instead in LindiH5ZarrStore and LindiReferenceFileSystemStore.

I'm also happy to refactor that.

magland commented 1 month ago

Since the 900 MB limit is specific to the sqlite cache and I would like to keep the magic numbers in one place rather than four, could you move the if statement to LocalCacheSQLiteClient.put_remote_chunk and raise an exception there if it is too big? We can catch the exception and raise a warning instead in LindiH5ZarrStore and LindiReferenceFileSystemStore.

I'm also happy to refactor that.

Sounds good. I'll work on that.

codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 50.00000% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 79.41%. Comparing base (797869d) to head (1535202).

Files Patch % Lines
lindi/LindiH5ZarrStore/LindiH5ZarrStore.py 20.00% 4 Missing :warning:
...ndi/LindiH5pyFile/LindiReferenceFileSystemStore.py 20.00% 4 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #69 +/- ## ========================================== - Coverage 79.58% 79.41% -0.17% ========================================== Files 30 30 Lines 2243 2254 +11 ========================================== + Hits 1785 1790 +5 - Misses 458 464 +6 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

magland commented 1 month ago

@rly Is this what you had in mind?

rly commented 1 month ago

Yes, awesome. Thanks for the change!

magland commented 1 month ago

on pypi as 0.3.7