LiberTEM / LiberTEM

Open pixelated STEM framework
https://libertem.github.io/LiberTEM/
GNU General Public License v3.0

DM3/DM4 reader: improve initialization performance #501

Open sk1p opened 4 years ago

sk1p commented 4 years ago

As a follow-up to #291, we may want to improve the initialization performance of our reader. By reading the metadata of many files concurrently, we already got the time down to ~20 seconds for a data set of ~2.5k files, from more than two minutes (IIRC). This can be improved further if needed, for example for even larger data sets.
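For reference, the concurrent metadata reading described above can be sketched roughly as follows. This is only an illustration, not the actual reader code: `read_header` is a hypothetical stand-in for the per-file tag parsing, and the thread pool size is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def read_header(path, nbytes=16):
    # Hypothetical stand-in for the real per-file metadata parsing:
    # here we only read the first few bytes of the file.
    with open(path, "rb") as f:
        return path, f.read(nbytes)


def read_headers_concurrently(paths, max_workers=8):
    # Header reads are I/O bound, so threads can overlap the waiting time.
    # pool.map returns results in the same order as the input paths.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_header, paths))
```

Threads are used here because the work is mostly waiting on the file system; whether that holds for the real tag parsing would need to be measured.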

Possible improvements, roughly ordered by increasing difficulty:

sk1p commented 4 years ago

Related to offset guessing: we can also read the correct offset from the first file, validate it for the remaining files, and fall back to the slow path if validation fails. Validation can be done by storing the file positions of the arraySize, arrayOffset, and DataType tags from the first file and checking whether each subsequent file has the same tag values at those positions.
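A minimal sketch of that validate-then-fall-back idea, under some assumptions: `slow_parse` is a hypothetical callable standing in for the full tag parser and is assumed to return the data offset plus the file positions of the tag values; the comparison is done on raw bytes of a fixed, assumed width rather than on decoded values.

```python
def read_at(path, positions, width=8):
    # Return the raw bytes found at each stored position in the file.
    values = {}
    with open(path, "rb") as f:
        for name, pos in positions.items():
            f.seek(pos)
            values[name] = f.read(width)
    return values


def resolve_data_offsets(paths, slow_parse, width=8):
    # slow_parse(path) -> (data_offset, {tag_name: position of tag value})
    # is a hypothetical interface for the full (slow) header parser.
    first, rest = paths[0], paths[1:]
    data_offset, tag_positions = slow_parse(first)
    reference = read_at(first, tag_positions, width)

    offsets = {first: data_offset}
    for path in rest:
        if read_at(path, tag_positions, width) == reference:
            # Same values at the same positions: reuse the known offset.
            offsets[path] = data_offset
        else:
            # Layout differs from the first file: fall back to full parsing.
            offsets[path] = slow_parse(path)[0]
    return offsets
```

In the common case where all files share the same header layout, this runs the full parser once and does only a few small seeks and reads per remaining file.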