JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia
Other
537 stars 84 forks source link

Single file multi-threading (read-only) #477

Closed ejmeitz closed 11 months ago

ejmeitz commented 12 months ago

This PR aims to add support for multi-threaded read-only access to JLD2 files. I created a new global tracker OPEN_PARALLEL_FILES which keeps track of the files flagged as open for multi-threaded access. The flag for multi-threaded access was added to the jldopen function as parallel_read with a default of false. If the user changes this flag to true the program will check that the file is opened with in mode r and that it is not already present in the OPEN_FILES list. If these checks are satisfied jldopen proceeds as normal. The OPEN_FILES_LOCK is not invoked when parallel_read is true.

This is in reference to #403. Still need to add tests, but the existing tests pass locally.

Things I am unsure of:

codecov[bot] commented 12 months ago

Codecov Report

Patch coverage: 15.11% and project coverage change: -8.90% :warning:

Comparison is base (0198f19) 86.06% compared to head (1e99703) 77.16%. Report is 1 commits behind head on master.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #477 +/- ## ========================================== - Coverage 86.06% 77.16% -8.90% ========================================== Files 31 32 +1 Lines 4170 4704 +534 ========================================== + Hits 3589 3630 +41 - Misses 581 1074 +493 ``` | [Files Changed](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO) | Coverage Δ | | |---|---|---| | [src/data/type\_defs.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL2RhdGEvdHlwZV9kZWZzLmps) | `100.00% <ø> (ø)` | | | [src/metadataview.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL21ldGFkYXRhdmlldy5qbA==) | `0.00% <0.00%> (ø)` | | | [src/datasets.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL2RhdGFzZXRzLmps) | `91.36% <50.00%> (ø)` | | | [src/data/reconstructing\_datatypes.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL2RhdGEvcmVjb25zdHJ1Y3RpbmdfZGF0YXR5cGVzLmps) | `80.79% <60.34%> (-1.10%)` | :arrow_down: | | [src/JLD2.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL0pMRDIuamw=) | `89.67% <95.45%> (+0.19%)` | :arrow_up: | | [src/backwards\_compatibility.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL2JhY2t3YXJkc19jb21wYXRpYmlsaXR5Lmps) | `36.63% <100.00%> (ø)` | | | [src/data/writing\_datatypes.jl](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO#diff-c3JjL2RhdGEvd3JpdGluZ19kYXRhdHlwZXMuamw=) | `96.55% <100.00%> (ø)` | | ... and [2 files with indirect coverage changes](https://app.codecov.io/gh/JuliaIO/JLD2.jl/pull/477/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaIO)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ejmeitz commented 12 months ago

Makes sense, that shouldn't be too hard to remedy.

ejmeitz commented 12 months ago

Well the code definitely makes a new handle every time if parallel_read is true but I am still clueless as to how I can test this does what I think it does.

JonasIsensee commented 12 months ago

Why the BigInt tests fails on two of the architectures in the CI (passes locally for me) Don't worry about this one. It happens sporadically. I'm not sure why.

JonasIsensee commented 12 months ago

Thank you for making the changes!

That is fine - I don't think we can properly test this through CI. I'll have a final close look at the code and test it locally.

Looking at the code now, I don't think I like the "OPEN_PARALLEL_FILES" list. I think we're better off without it.

Also, you (probably accidentally) committed a manifest and a .jld2 file.

ejmeitz commented 12 months ago
JonasIsensee commented 12 months ago
  • I'll try and test it out as well. I have some of the banned use cases in tests but not actual parallel reads.

:+1:

* I left the `OPEN_PARALLEL_FILES` to check that case you mentioned of someone trying to open the same file elsewhere with `parallel_read` set to false and then writing to that file.

I understand and appreciate the intention. I'd also support it if it were that simple. Imagine the following test-case.

fp = jldopen(testfile; parallel_read=true)
close(fp) # probably (already) fails because it can't be deleted from the OPEN_FILES Dict

fs = jldopen(testfile; parallel_read=false) # fails because OPEN_PARALLEL_FILES has an entry (that may be WeakRef(nothing))

Even harder to fix:

fp1 = jldopen(testfile; parallel_read=true)
fp2 = jldopen(testfile; parallel_read=true) # overwrites reference tot fp1
close(fp2)

# If you fix the case above, JLD2 would now still think that there is no open parallel file handle.
fs = jldopen(tesfile; parallel_read=false) # should fail according to current logic but doesn't  

Since this is quite a bit of work (Would need to keep a full list of ALL open file handles and also check whether they are still open) I would prefer removing this safety check and add a warning in the docs instead.

* Yeah the manifest is from running the tests locally oops. It won't let me delete the test.jld2 file I committed for some reason.

That's fine. I can fix that in the end.

JonasIsensee commented 11 months ago

Hi @ejmeitz,

thanks for helping out! I had a few minutes to spare and took the liberty to finish this up real quick.

I relaxed the open conditions. (parallel and serial read is fine) Will merge and release once tests pass.