dnvgl / qats

Python library and GUI for efficient processing and visualization of time series.
MIT License

Performance issues with TsDB load in 4.8.1 #82

Closed. haakoly closed this issue 3 years ago

haakoly commented 4 years ago

After updating to version 4.8.1, loading TDMS files is much slower than in the previous version. With exactly the same script and data (loading one time series with 130 000 data points from a file containing 20 time series of the same length), the load time goes from 40 seconds to about 160 seconds. The same issue can be replicated for other time series as well.

The issue can only be replicated when read=True is passed to the load function:

4.8.1: read=True: 168 s, read=False: 19.33 s

4.8.0: read=True: 40 s, read=False: 18.92 s

    from qats import TsDB

    db = TsDB()
    db.load(file_path, read=True)
    t, X = db.geta(name=n_name)
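For anyone trying to reproduce the numbers above, a minimal timing sketch could look like the following. The helper names `time_call` and `load_and_get` are made up for illustration and are not part of qats; the qats calls inside `load_and_get` are the same ones used in the snippet above.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def load_and_get(file_path, name, read):
    """Hypothetical wrapper around the load sequence from the report."""
    # Imported lazily so that time_call can be used without qats installed.
    from qats import TsDB

    db = TsDB()
    db.load(file_path, read=read)
    return db.geta(name=name)


# Example usage (file_path and n_name are placeholders for your own data):
# for read in (True, False):
#     seconds = time_call(load_and_get, file_path, n_name, read, repeats=1)
#     print(f"read={read}: {seconds:.2f} s")
```

Running this once per installed qats version, in otherwise identical environments, should give timings comparable to the ones listed above.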

eneelo commented 4 years ago

Thanks for reporting. Could you please provide some details regarding the python environments used, in particular the versions of python and npTDMS? Also, are these versions identical both for the 4.8.0 and 4.8.1 run of qats reported above?

haakoly commented 4 years ago

The project and environment are identical for both runs with the different qats versions. The only difference is the upgrade from 4.8.0 to 4.8.1.

Python==3.7

npTDMS==0.27.0
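Version details like these can be collected with a short script, which makes it easier to confirm that two environments really match. This is only a sketch: `report_versions` is a hypothetical helper, the package list is an assumption, and on Python 3.7 it relies on the `importlib_metadata` backport being installed.

```python
import platform

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Backport package, needed on Python 3.7 (assumed to be installed).
    from importlib_metadata import PackageNotFoundError, version


def report_versions(packages=("qats", "npTDMS", "numpy")):
    """Return a dict with the Python version and the installed package versions."""
    info = {"python": platform.python_version()}
    for pkg in packages:
        try:
            info[pkg] = version(pkg)
        except PackageNotFoundError:
            info[pkg] = "not installed"
    return info


# Example usage:
# for name, ver in report_versions().items():
#     print(f"{name}=={ver}")
```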

tovop commented 3 years ago

Reason for changes between versions 4.8.0 and 4.8.1

The changes between versions 4.8.0 and 4.8.1 were made because the following functions/properties used in version 4.8.0 are deprecated:

Profiling

General

I have profiled qats.readers.tdms.read_names() and qats.readers.tdms.read_data(), and it is challenging to replicate the 2x difference in speed between versions 4.8.1 and 4.8.0 reported in the issue. However, it could be that the relative difference grows for very large datasets. In any case, I have refactored the code to gain some efficiency, approximately 32%.
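For reference, a profiling run of this kind can be sketched with the standard library's cProfile. This is not necessarily the setup used for the results in this comment; `profile_call` is a hypothetical helper, and the argument passed to the reader function in the usage example is an assumption about its signature.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in memory, sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Example usage (the path is a placeholder; the call signature is assumed):
# from qats.readers.tdms import read_names
# names, report = profile_call(read_names, "example.tdms")
# print(report)
```

Sorting by cumulative time tends to show quickly whether the time is spent in the reader itself or in the underlying npTDMS calls.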

Profiling version 4.8.1:

[image: profiling results for version 4.8.1]

Profiling new version:

[image: profiling results for the new version]