ContactEngineering / SurfaceTopography

Read and analyze surface topographies
https://contactengineering.github.io/SurfaceTopography/
MIT License

Wrapped readers should not parse file twice #340

Open pastewka opened 9 months ago

pastewka commented 9 months ago

The ASC and Matrix readers currently parse the file twice: once when constructing the reader (and the channel information) and a second time when actually reading the data. The second parse is unnecessary and should be avoided.

Note that readers should not store the topography data when constructing the channel information, to avoid huge memory usage in Topobank. However, files are now only touched in Celery workers, so this problem is alleviated.

This means turning these wrapped readers into new-style readers.

pastewka commented 1 month ago

We should just keep the data in memory when reading the file for the first time. The reason to avoid this used to be that Topobank opened files in the web server; it now exclusively opens files in a Celery task. The reader architecture is still useful for MPI-parallel calculations, but in that case text files are unlikely to be used as inputs.
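
A rough sketch of what "parse once, keep in memory" could look like. This is not the actual SurfaceTopography reader API: the class name `CachingTextReader`, the `channels` layout, and the `parse_count` counter are all hypothetical and only illustrate the idea of deriving the channel information from data that is parsed a single time in the constructor.

```python
import io


class CachingTextReader:
    """Hypothetical reader: parses a whitespace-separated height matrix once."""

    def __init__(self, fobj):
        self.parse_count = 0  # instrumentation for this sketch only
        # Single parse; the result is kept in memory for later use
        self._heights = self._parse(fobj)
        nb_rows = len(self._heights)
        nb_cols = len(self._heights[0]) if nb_rows else 0
        # Channel information is derived from the data that was just parsed
        self.channels = [{"name": "Default", "nb_grid_pts": (nb_rows, nb_cols)}]

    def _parse(self, fobj):
        self.parse_count += 1
        rows = []
        for line in fobj:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment/header lines
            rows.append([float(x) for x in line.split()])
        return rows

    def topography(self, channel_index=0):
        if channel_index != 0:
            raise IndexError("this sketch only supports a single channel")
        # No second pass over the file: return the cached data
        return self._heights


reader = CachingTextReader(io.StringIO("# demo\n1.0 2.0 3.0\n4.0 5.0 6.0\n"))
heights = reader.topography()
```

The trade-off is the one discussed above: the full data set lives in memory for as long as the reader object does, which is acceptable when the file is only opened inside a worker task.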