tbeu / matio

MATLAB MAT File I/O Library
https://matio.sourceforge.io
BSD 2-Clause "Simplified" License
330 stars 97 forks source link

Introduce predicated iteration functions #197

Closed Ri0n closed 11 months ago

Ri0n commented 1 year ago

Reading a Matio variable from an HDF5 files sometimes can be too slow. Besides it's often necessary to iterate over all variable to read just one. This takes a lot of time. This change allows to skip unnecessary readings

tbeu commented 1 year ago

@Ri0n Can you please force-push again to trigger the Travis CI pipeline. Thanks.

coveralls commented 1 year ago

Coverage Status

Coverage: 82.514% (+0.1%) from 82.399% when pulling 4df929a06598c8914093ae0bed9f75f838d8a52b on Ri0n:optimized-iterations into ae8a9bddb0485e9b60db296cfc1c9588a472a0cd on tbeu:master.

coveralls commented 1 year ago

Coverage Status

Coverage: 87.883% (+5.5%) from 82.399% when pulling 29a2dabf57ed8d8963e1fd1c6e364cc6ffd501fd on Ri0n:optimized-iterations into ae8a9bddb0485e9b60db296cfc1c9588a472a0cd on tbeu:master.

tbeu commented 1 year ago

If you want to you can add yourself as contributor to section 1.2 of both files README.md and README.

Ri0n commented 1 year ago

Wrt performance. There was a problem with one file. And this PR is an alternative fix to one merged in https://github.com/tbeu/matio/pull/198 In both cases file loading time was reduced from 2 minutes to 5-7 seconds.

The crux of the problem is in long string string members in header variable. IIRC 5+ strings. The strings themselves were quite short but declared at 1000+ symbols. Each symbol was resolved as a variable with H5Iget_name somewhere on the way, making all the reading extremely slow.

One way to solve this was to read header directly with HDF5 and use this PR's fix to read everything else but header with help of matio. And basically it's what currently applied in our project.

tbeu commented 1 year ago

Thanks for the numbers and update.

Do you suggest instead to not merge this PR after #198 got merged?

Ri0n commented 1 year ago

I think both together are good to raise performance to its possible maximum.

tbeu commented 1 year ago

Argh, I missed that #198 failed the test suite in 18 tests, see https://app.travis-ci.com/github/tbeu/matio/jobs/593339008#L5197

Unfortunately, this is hidden and not marked as failure by Travis CI.

Ri0n commented 1 year ago

Argh, I missed that https://github.com/tbeu/matio/pull/198 failed the test suite in 18 tests, see https://app.travis-ci.com/github/tbeu/matio/jobs/593339008#L5197

That what I was afraid of. It touches too many places.

tbeu commented 1 year ago

Never mind. Do you think you are able to fix the tests?

Ri0n commented 1 year ago

Probably I can fix it with some guidance. Unfortunately it's no more a priority task at my job since an alternate fix exists. So I need to invest my own time which I'm always lack of. In any case I can do nothing till the end of the week.

tbeu commented 11 months ago

CI succeeds now.