Importing airr takes 6.9 seconds, whereas it should take tenths of a second.
(airr_standards) [master][~/Documents/new/airr-standards.airr/lang/python]$ python
Python 3.10.4 (main, May 30 2022, 12:51:07) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> def time_import_airr():
...     start_time = time.time()
...     import airr
...     end_time = time.time()
...     print(f"Import took {end_time - start_time} seconds")
...
>>> time_import_airr()
Import took 6.869977951049805 seconds
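The measurement above can be wrapped in a small reusable helper. This is only a sketch: `time_import` is a hypothetical name, not part of airr. It uses `time.perf_counter()` and refuses to time a module that is already in `sys.modules`, since a cached import would report a misleadingly small number.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import module_name for the first time."""
    # A module already in sys.modules is returned from the cache almost
    # instantly, so timing it would not reflect the real import cost.
    if module_name in sys.modules:
        raise RuntimeError(
            f"{module_name} is already imported; use a fresh interpreter"
        )
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Example (requires airr to be installed):
#     print(f"Import took {time_import('airr'):.3f} seconds")
```

For a per-module breakdown, CPython 3.7+ also accepts `python -X importtime -c "import airr"`.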
This slows down applications that use airr and discourages adoption of the library.
This occurs because of how the schema is instantiated in schema.py. The Schema class is instantiated many times in the AIRRSchema dict. As part of each instantiation, the same airr-schema.yaml file is loaded again:
else:
    with resource_stream(__name__, 'specs/airr-schema.yaml') as f:
        spec = yaml.load(f, Loader=yamlordereddictloader.Loader)
Instead, the file should be loaded once at module level, and the parsed result reused each time the class is instantiated.
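A minimal sketch of the load-once idea follows. It is an illustration under stated assumptions, not the code from the pull request. To stay self-contained it parses a small JSON stand-in file instead of airr-schema.yaml, and `load_spec`, `load_count`, and the toy schema contents are all hypothetical.

```python
import functools
import json
import tempfile
from pathlib import Path

# Toy stand-in for specs/airr-schema.yaml (assumption: the real file is YAML
# and far larger; JSON is used here only so the sketch needs no extra packages).
_SPEC_PATH = Path(tempfile.mkdtemp()) / "schema.json"
_SPEC_PATH.write_text(json.dumps({
    "Rearrangement": {"required": ["sequence_id"]},
    "Repertoire": {"required": ["repertoire_id"]},
}))

load_count = 0  # counts how many times the file is actually parsed


@functools.lru_cache(maxsize=None)
def load_spec() -> dict:
    """Parse the spec file once; later calls return the cached dict."""
    global load_count
    load_count += 1
    with _SPEC_PATH.open() as f:
        return json.load(f)


class Schema:
    def __init__(self, definition: str):
        # Every instance shares the single parsed spec instead of re-reading it.
        spec = load_spec()
        self.definition = definition
        self.required = spec[definition]["required"]


# Several instances are created, as in the AIRRSchema dict, but the file
# is parsed only on the first call to load_spec().
schemas = {name: Schema(name) for name in ("Rearrangement", "Repertoire")}
```

Because the cached dict is shared between instances, any code that mutates it should copy it first.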
With the schema file loaded only once, importing airr takes 0.4 seconds.
(airr_standards) *[master][~/Documents/new/airr-standards.airr/lang/python]$ python
Python 3.10.4 (main, May 30 2022, 12:51:07) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> def time_import_airr():
...     start_time = time.time()
...     import airr
...     end_time = time.time()
...     print(f"Import took {end_time - start_time} seconds")
...
>>> time_import_airr()
Import took 0.4411776065826416 seconds
TLDR: airr takes a long time to import. Fix is here https://github.com/airr-community/airr-standards/pull/683
I implemented this logic in PR SyntenyBio:jday1/682-fix https://github.com/airr-community/airr-standards/pull/683