aiidateam / disk-objectstore

An implementation of an efficient "object store" (actually, a key-value store) writing files on disk and not requiring a running server
https://disk-objectstore.readthedocs.io
MIT License
15 stars 8 forks source link

Improve import time by moving `sqlalchemy` imports within function #105

Closed sphuber closed 3 years ago

sphuber commented 3 years ago

This was discovered as a result of this library being used in aiida-core, whose CLI command verdi increased in loading time significantly once it started to import disk_objectstore.

Profiling pointed to sqlalchemy being the culprit. By moving the imports from the module level to within the function scope, the loading time is reduced from 0.11 seconds on average to about 0.03 seconds. This reduction by 3 to 4 times is significant.

codecov[bot] commented 3 years ago

Codecov Report

Merging #105 (0ba8bf7) into develop (d786296) will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##           develop      #105   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            7         7           
  Lines         1541      1552   +11     
=========================================
+ Hits          1541      1552   +11     
Impacted Files Coverage Δ
disk_objectstore/container.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d786296...0ba8bf7. Read the comment docs.

giovannipizzi commented 3 years ago

Also, I'm quite confident we'll forget and add an import outside functions again :-D I guess that's ok for now (I don't plan major changes to this lib in the short term), but if this happens again we should probably add some speed test to this library as well

chrisjsewell commented 3 years ago

Also to clarify, it's not possible to move the import of this package in aiida-core, such that it not imported on verdi calls?

sphuber commented 3 years ago

Also to clarify, it's not possible to move the import of this package in aiida-core, such that it not imported on verdi calls?

I am doing this anyway, but I thought might as well make the import of this library faster as well anyway, no?

sphuber commented 3 years ago

Is the dependency pin on sqlalchemy just due to the support of python versions, or is there any other reason?

This is because the code is not compatible with 1.4, at least the tests were failing when installing 1.4

chrisjsewell commented 3 years ago

Well its not really standard practice is it and kinda violates https://www.python.org/dev/peps/pep-0008/ 😬

sphuber commented 3 years ago

Sure, but if it decreases performance of load time by a factor of 4 I would say we can start to consider breaking the standard rule. Not sure if this case is explicitly described in PEP8

chrisjsewell commented 3 years ago

Well if you are importing the package, then it usually because you want to actually use it, or if you didn't want to import everything when importing the top-level, you would just remove the install of container in the top-level __init__ and have people specifically install components from that module. But I won't block this change if you really want it

sphuber commented 3 years ago

I am actually fine with not doing this change, since I have already fixed it in aiida-core which I find most important since there we really need the app to load fast just on import. So fine by me to keep this as is and close the PR. I have opened two others anyway to address the sqlalchemy dependency problem and Python support

giovannipizzi commented 3 years ago

Ok, I'm going to close this then - we can reopen if the problem turns out again, or if after further discussion we decide we want to do it