ReikanYsora opened 1 year ago
Does your data look like in #29? I.e. your dataset is 8261 x 361 (ok) but your chunk size is 8261 x 1 (bad for reading)?
PureHDF does the following when reading a dataset with a chunk layout like this:

- Row 1: read element by element (one chunk per element) until the first row has been fully read
- Row 2: read element by element again
- ...
- Row x: read element by element again
- ...
If the chunk cache is large enough, the performance might still be OK, but if the cache is too small, all the chunks have to be read and decompressed over and over, which is a performance nightmare.
So the root cause most likely lies in the chunk layout (8261 x 1). If the chunk layout were transposed, i.e. 1 x 8261, your data would be read blazingly fast (but there is always a trade-off: writing the data would become more expensive).
In the example above, without any chunk cache, there would be 8261 x 361 = 2,982,221 single read operations, and each operation reads and decodes a full chunk. So this is as bad as it could be. More information here: https://github.com/Apollo3zehn/PureHDF/issues/17#issuecomment-1403809255
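To put numbers on that, here is a quick back-of-the-envelope sketch (plain C#, no PureHDF calls; the figures are the ones from this thread):

```csharp
using System;

// Cost of the 8261 x 1 chunk layout from this thread, without a chunk cache.
const ulong rows = 8261;
const ulong columns = 361;
const ulong elementsPerChunk = rows; // each chunk holds one full column

// Reading row-major means every element lives in a different chunk,
// so each element access decodes a whole 8261-element chunk:
ulong chunkReads = rows * columns;                      // one decode per element
ulong elementsDecoded = chunkReads * elementsPerChunk;  // total decode work

Console.WriteLine($"{chunkReads:N0} chunk reads");      // 2,982,221
Console.WriteLine($"{elementsDecoded:N0} elements decoded");
```

With a transposed 1 x 8261 layout, one chunk read per row would suffice instead.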
You can reduce the number of unnecessary chunk reads by using a custom selection (`DelegateSelection`) as shown here:
```csharp
using System.Collections.Generic;
using System.Linq;
using PureHDF;

var root = H5File.OpenRead("my file path");
var dataset = root.Dataset("my dataset");
var rank = dataset.Space.Dimensions.Length;

/* dataset dimensions */
var rows = (uint)dataset.Space.Dimensions[0];
var columns = (uint)dataset.Space.Dimensions[1];

/* chunk dimensions */
var chunkRows = (uint)dataset.Layout.ChunkDimensions[0];
var chunkColumns = (uint)dataset.Layout.ChunkDimensions[1];

// dataset (source)
IEnumerable<Step> SourceWalker(ulong[] limits)
{
    var coordinates = new ulong[rank]; // reuse array to reduce GC pressure

    for (uint i = 0; i < columns; i += chunkColumns)
    {
        coordinates[0] = 0;
        coordinates[1] = i;

        yield return new Step()
        {
            Coordinates = coordinates,
            ElementCount = rows
        };
    }
}

var totalElementCount = dataset.Space.Dimensions.Aggregate(1UL, (x, y) => x * y);
var fileSelection = new DelegateSelection(totalElementCount, SourceWalker);

// memory (target)
IEnumerable<Step> TargetWalker(ulong[] limits)
{
    var coordinates = new ulong[rank]; // reuse array to reduce GC pressure

    for (uint row = 0; row < rows; row++)
    {
        for (uint column = 0; column < columns; column += chunkColumns)
        {
            coordinates[0] = row;
            coordinates[1] = column;

            yield return new Step()
            {
                Coordinates = coordinates,
                ElementCount = chunkColumns
            };
        }
    }
}

var memorySelection = new DelegateSelection(totalElementCount, TargetWalker);

// read
var memoryDims = new ulong[] { rows, columns };

var result = dataset
    .Read<double>(
        fileSelection: fileSelection,
        memorySelection: memorySelection,
        memoryDims: memoryDims
    )
    .ToArray2D(rows, columns);
```
This should speed up your read operation. Further optimizations are possible. For example, right now the target walker (which tells PureHDF how to fill the target array) is quite expensive. If the code above is not enough to meet your performance requirements, the target walker may be omitted, but then your target array contains data in the wrong order. That array then needs to be reordered manually, which is probably a little faster than using the `TargetWalker` implementation from above.
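For illustration, such a manual reorder could look like the following minimal sketch. It is plain C# with no PureHDF calls and assumes the data arrived one full chunk column after another (i.e. column-major), which is what the chunk-wise file selection above produces:

```csharp
// Sketch: turn a column-major buffer (one full column after another,
// as produced by the chunk-wise file selection) into a row-major 2D array.
static double[,] ToRowMajor(double[] columnMajor, int rows, int columns)
{
    var result = new double[rows, columns];

    for (int col = 0; col < columns; col++)
        for (int row = 0; row < rows; row++)
            result[row, col] = columnMajor[col * rows + row];

    return result;
}
```

A single sequential pass like this is typically cheaper than yielding one `Step` per row and chunk column in the target walker.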
Thanks for your answer!

Unfortunately, this solution takes even longer to read. Indeed, the `TargetWalker` looks extremely heavy: with this solution, the reading time roughly triples.
Is it faster when you remove the `memorySelection` parameter from the `Read()` method? Then your data will be in the wrong order, but at least we know where to optimize.
Could you please send a screenshot of the dataset properties opened in HDFView as you did in #29? Or is it exactly the same? I am mainly interested in the chunk dimensions, but other information may be helpful too.
I have tried to reproduce the problem with the sample file you sent me earlier but it is too small to find the bottleneck. It just loads too fast.
Hi @Apollo3zehn . I've recently come back to a project that uses this library and ran into what I think was this issue when trying to upgrade from a much older version (when the project was still HDF5.NET) to the most recent version.
I eventually managed to track it down to being caused, for .NET 6+ only, somewhere between the commits ed7b34a and ed25d6a (mostly by accidentally using .NET 5 once in a test and noticing it suddenly sped up again). It looks like `H5SafeFileHandleReader` can be significantly slower than `H5StreamReader`; the exact slowdown seems to depend heavily on the shape of the data, varying from basically nothing to ~25x slower for the various datasets I was testing against.
I found that using a memory-mapped file, as explained here, did the trick, as I believe that causes PureHDF to use `H5StreamReader` again? I kind of assume most people use `H5File.OpenRead(string)` as the 'default' way of using PureHDF, so it might be a bit of a problem that it can cause such a large difference in performance.
Hope that's useful information, and thanks for your work on this, the library's been really useful. Can supply the datasets I used for testing if needed.
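For reference, a minimal sketch of the memory-mapped workaround. It uses the standard `System.IO.MemoryMappedFiles` API; I'm assuming the `H5File.Open` overload that accepts a `MemoryMappedViewAccessor`, as described in the PureHDF docs (check them for the exact signature), and the file/dataset paths are placeholders:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using PureHDF;

// Map the file into memory instead of using H5File.OpenRead(string);
// this is the workaround described above.
using var mmf = MemoryMappedFile.CreateFromFile(
    "my file path",
    FileMode.Open,
    mapName: null,
    capacity: 0,
    MemoryMappedFileAccess.Read);

using var accessor = mmf.CreateViewAccessor(0, 0, MemoryMappedFileAccess.Read);
using var root = H5File.Open(accessor);

var dataset = root.Dataset("my dataset");
var data = dataset.Read<double>();
```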
A test dataset would be great :-)
I was not aware of the performance issues. I am planning to integrate more Benchmarks before the final release.
Thanks for your investigations!
The dataset in each of the files is at the path `/raw_data_1/detector_1/counts`. They all use chunk layouts with a size of 1 x 8 x <length of the array>.
| File | Array dimensions | Approx. read time with H5StreamReader | Approx. read time with H5SafeFileHandleReader |
|---|---|---|---|
| ALF | 1 x 2386 x 1361 | 30-60 ms | ~500 ms |
| INS | 1 x 168 x 17250 | 50 ms | 50 ms (no change) |
| SXD | 1 x 45100 x 1821 | 0.5 s | 10-15 s |
Hello,
I'm creating this "issue" to try and find out how to improve read performance for a complete dataset.
At the moment, I sometimes have to load fairly large files (around 800 MB), and it can take up to twenty minutes to read a complete dataset, even when trying to tweak buffer and chunk sizes in PureHDF.
Do you have any other ideas on how I can improve performance? I can't use multi-threading or async (my project uses Unity3D and therefore .NET Standard 2.1).
Thanks in advance!