Open valeriupredoi opened 6 months ago
CPU:
*-cpu
product: Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz
vendor: Intel Corp.
vendor_id: GenuineIntel
physical id: 1
bus info: cpu@0
version: 6.58.0
width: 64 bits
Result is 4677.8594 (stable)
Kerchunk indexing and JSON file writing times:
@bnlawrence suggests chunking is the cause, and he is correct: the field in the 2.8G file has 30 chunks, while the other field has 3400 chunks -> there's the penalty factor right there!
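To make the chunk-count comparison concrete, here is a minimal sketch of how the number of HDF5 chunks follows from a dataset's shape and chunk shape, and how the two counts quoted above compare. The shapes in the example are made up for illustration and are not taken from the files in this issue; only the 30 and 3400 figures come from the comment above.

```python
import math


def count_chunks(shape, chunk_shape):
    """Number of chunks needed to tile a dataset of `shape` with `chunk_shape`.

    Partial chunks at the edges count as whole chunks, hence the ceil.
    """
    return math.prod(math.ceil(n / c) for n, c in zip(shape, chunk_shape))


# Hypothetical shapes, for illustration only.
example_count = count_chunks((100, 200), (10, 20))  # 10 * 10 = 100 chunks

# Ratio of the two chunk counts quoted in this thread.
chunk_ratio = 3400 / 30  # roughly 113x more chunks to fetch and process
```

If every chunk costs roughly one request, the ratio is a rough upper bound on the expected slowdown from chunking alone; the real penalty also depends on chunk size and network latency.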
-def
file (64 HDF5 chunks)
My computer (UoR network etc)
Kerchunk indexing and JSON file writing times:
Time before going into Reductionist:
Time before going into Reductionist:
so it's starting to look like this:
Kerchunk indexer:
To (network) and at Reductionist
Total time
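A per-stage breakdown like the one above can be collected with a small timing helper. This is only a sketch: the stage names and the stand-in workloads are hypothetical and are not the actual test code from this issue.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

with timed("indexing", timings):
    sum(range(1000))  # stand-in for the indexing step (hypothetical)

with timed("reduction", timings):
    sum(range(1000))  # stand-in for the network + Reductionist step (hypothetical)

timings["total"] = timings["indexing"] + timings["reduction"]
```

Using `time.perf_counter()` rather than `time.time()` avoids jumps from system clock adjustments when measuring short intervals.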
Local tests on V Computer
Test code:
Kerchunk is restricted to the Dataset of interest:
Chunks
Both Kerchunk and Pyfive send variable (give or take 5 or 10) numbers of chunks to Reductionist; order of magnitude is 3360 chunks.
Kerchunk-based Pipeline
Result is 4677.8594 (stable)
Kerchunk indexing and JSON file writing times:
Pyfive-based pipeline
Result is 4677.8594 (stable)
Sliced Kerchunk (slice [0:3, 4:6, 7:9])
Sliced Pyfive (slice [0:3, 4:6, 7:9])
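For the sliced tests, the number of chunks that have to be sent depends on how the slice [0:3, 4:6, 7:9] overlaps the chunk grid. A minimal sketch of that calculation follows; the chunk shapes used in the examples are assumptions for illustration, not the chunking of the actual file.

```python
import math


def chunks_touched(slices, chunk_shape):
    """Count the chunks that a rectangular selection overlaps.

    `slices` are (start, stop) pairs with stop exclusive, one per dimension.
    """
    per_dim = []
    for (start, stop), c in zip(slices, chunk_shape):
        first = start // c        # index of the first chunk touched
        last = (stop - 1) // c    # index of the last chunk touched
        per_dim.append(last - first + 1)
    return math.prod(per_dim)


selection = [(0, 3), (4, 6), (7, 9)]  # the slice [0:3, 4:6, 7:9]

# Hypothetical chunk shapes, for illustration only.
small_chunks = chunks_touched(selection, (2, 2, 2))     # 2 * 1 * 2 = 4
large_chunks = chunks_touched(selection, (10, 10, 10))  # 1 * 1 * 1 = 1
```

A small slice can still touch several chunks when it straddles chunk boundaries, which is why the sliced timings need not scale with the number of selected elements.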