Closed hagenw closed 5 months ago
The only concern I have is about dependencies: does the code already depend on a newer version of `audbackend`? I see no changes in `pyproject.toml`.
No, this does not yet depend on a newer `audbackend` version (version 2.0.0 is also not released yet, it exists only in the dev branch of `audbackend`). I will prepare a pull request for testing `audbackend` 2.0.0 after the caching speed is handled, to avoid merge conflicts.
When caching `audbcards.Dataset`, we store objects that are not needed to create a datacard, e.g. the dependency table and header of a dataset. This increases the size of the cache and makes loading slower than necessary. This pull request speeds up caching of `audbcards.Dataset` by pickling only cached properties, as listed by `audbcards.Dataset._cached_properties()` (formerly `audbcards.Dataset.properties()`).

The execution time for building our database overview page is as follows on compute5:
The size of the cache is reduced from 2.6G to 133M.
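
The idea of pickling only cached properties can be illustrated with a minimal sketch. It is not the `audbcards` implementation: the `Dataset` class below is a toy stand-in, and its attributes (`_deps`, `files`, `short_description`), the dataset name and version, and the way `_cached_properties()` collects values are all assumptions made for illustration.

```python
import functools
import pickle


class Dataset:
    """Toy stand-in for audbcards.Dataset (hypothetical, not the real class)."""

    def __init__(self, name, version):
        self.name = name
        self.version = version
        # Large helper object that is only needed to compute the properties,
        # standing in for the dependency table or header of a dataset
        self._deps = list(range(100_000))

    @functools.cached_property
    def files(self):
        return len(self._deps)

    @functools.cached_property
    def short_description(self):
        return f"{self.name} v{self.version}"

    def _cached_properties(self):
        """Evaluate all cached properties and return them as a dictionary."""
        names = [
            name
            for name, value in vars(type(self)).items()
            if isinstance(value, functools.cached_property)
        ]
        return {name: getattr(self, name) for name in names}

    def __getstate__(self):
        # Pickle only the cached property values, not the helper objects
        return self._cached_properties()

    def __setstate__(self, state):
        # Restored values land in __dict__, where cached_property looks first,
        # so no property needs to be recomputed after loading
        self.__dict__.update(state)


dataset = Dataset("mydb", "1.0.0")
restored = pickle.loads(pickle.dumps(dataset))
```

In this sketch the restored object answers `restored.files` and `restored.short_description` from the pickled values, while `_deps` is never written to the cache. Plain attributes such as `name` are also dropped, so anything needed after loading would have to be exposed as a cached property.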
We can further improve execution time by also caching the images / audio examples from `audbcards.Datacard`, but I will handle this in a follow-up pull request.

Further changes:
- Renamed `audbcards.Dataset.properties()` to `audbcards.Dataset._cached_properties()`
- `audbcards.Dataset.schemes_summary`, which holds entries needed for the dataset overview page
- `audbcards.Dataset.cache_root` attribute
- Changed `audbcards.Dataset.deps` and `audbcards.Dataset.header` to properties, and added them to the documentation
- `audbcards.Dataset.backend` and `audbcards.Dataset.repository_object` properties
- `__getstate__` and `__setstate__` methods to the `dohq_artifactory.GenericRepository` object, as the repository is no longer pickled

Newly added API entries: