EMMC-ASBL / oteapi-core

OTEAPI core components
https://EMMC-ASBL.github.io/oteapi-core
MIT License

Lazy strategy loading #21

Status: Closed (CasperWA closed 2 years ago)

CasperWA commented 2 years ago

The following is a lazy loading implementation suggestion (it came up in a discussion between me and @jesper-friis).

If the setuptools entry_points are organized around the strategy schemes instead of the strategy types, the modules hosting the strategies could be loaded lazily.

Instead of the current syntax of the entry_points (as defined in a setup.cfg file):

[options.entry_points]
oteapi.download_strategy =
  core.file = oteapi.strategies.download.file
  core.https = oteapi.strategies.download.https
  core.sftp = oteapi.strategies.download.sftp
oteapi.filter_strategy =
  core.crop = oteapi.strategies.filter.crop_filter
  # ...

It can be changed to something like:

[options.entry_points]
oteapi.scheme =
  http = oteapi.strategies.download.https
  https = oteapi.strategies.download.https
oteapi.mediaType =
  image.jpg = oteapi.strategies.parse.image_jpeg
  image.jpeg = oteapi.strategies.parse.image_jpeg
  image.jp2 = oteapi.strategies.parse.image_jpeg
  # ...

This lets plugins declare which scheme/mediaType values they implement strategies for, as well as where to find the strategy implementations.

Now, when loading the OTE-API application, the entry point group (e.g., oteapi.scheme) and name (e.g., https) together uniquely define a valid value for StrategyFactory.register(). Hence, with the proper Python API for loading the entry points, the actual module import (oteapi.strategies.download.https in the given example) can be postponed until the (scheme, https) strategy is requested.
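A minimal sketch of how such postponed loading could look, using the standard library's importlib.metadata. The LazyStrategyRegistry class and the discover() helper are hypothetical names for illustration, not part of oteapi-core, and the standard-library module colorsys stands in for a real strategy module so that the example runs without any plugin installed:

```python
from importlib.metadata import EntryPoint
from typing import Any, Dict, Iterable, Tuple


class LazyStrategyRegistry:
    """Hypothetical registry: keeps entry points, imports on first request."""

    def __init__(self, entry_points: Iterable[EntryPoint]) -> None:
        # Only metadata is stored here; no strategy module is imported yet.
        self._entry_points: Dict[Tuple[str, str], EntryPoint] = {
            (ep.group, ep.name): ep for ep in entry_points
        }
        self._loaded: Dict[Tuple[str, str], Any] = {}

    def get(self, group: str, name: str) -> Any:
        key = (group, name)
        if key not in self._loaded:
            # EntryPoint.load() performs the actual import.
            self._loaded[key] = self._entry_points[key].load()
        return self._loaded[key]


def discover(group: str) -> LazyStrategyRegistry:
    """Build a registry from installed packages (Python 3.10+ call signature)."""
    from importlib.metadata import entry_points

    return LazyStrategyRegistry(entry_points(group=group))


# Stand-in entry point: "colorsys" plays the role of a strategy module.
registry = LazyStrategyRegistry(
    [EntryPoint(name="https", value="colorsys", group="oteapi.scheme")]
)
strategy_module = registry.get("oteapi.scheme", "https")
```

Creating the registry only records the (group, name) pairs; the import happens inside get(), the first time a given pair is requested.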

Since importlib.import_module always checks whether the module already exists in sys.modules before loading it, calling importlib.import_module several times for the same module is not expensive: it merely returns the cached module.
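The caching behaviour can be checked with a short snippet. The json module is used here only as a convenient standard-library stand-in for a strategy module:

```python
import importlib
import sys

first = importlib.import_module("json")
second = importlib.import_module("json")

# Both calls return the very same module object: the one cached in sys.modules.
same_object = first is second and sys.modules["json"] is first
```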

Finally, to ease development for plugin developers, this list of entry points can be generated by functionality provided by oteapi-core (not yet written). The functionality could be used from the plugin repository's setup.py or as a pre-commit hook, or both. In this way, it should be easier for plugin developers to get started. It's understood that the functionality delivered by oteapi-core should include ways of excluding certain strategy implementations from being discoverable, etc., essentially offering the same amount of control that is currently available through manually adding entry_points entries.

CasperWA commented 2 years ago

Thinking about it again, there might be an issue concerning stability. If all strategies/modules are imported at app startup, any missing imports or initialization issues are caught immediately. With the lazy loading approach, this simple sanity check does not happen until a specific strategy is requested. We could end up in a situation where the app starts fine, a user requests a certain external strategy, the module is loaded and fails, and the app server fails with it.

Of course, this can be remedied in the app (oteapi-services) by using proper exception handlers, together with proper custom exceptions here in oteapi-core. But it's at least something to consider.
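One possible shape for such a custom exception is sketched below. StrategyLoadError and load_strategy_module() are hypothetical names, not existing oteapi-core API. The sketch only wraps ImportError; whether other initialization errors should be wrapped as well is a design choice left open here:

```python
import importlib
from types import ModuleType


class StrategyLoadError(Exception):
    """Hypothetical exception: a strategy module could not be loaded."""


def load_strategy_module(module_path: str) -> ModuleType:
    """Import a strategy module, turning ImportError into StrategyLoadError."""
    try:
        return importlib.import_module(module_path)
    except ImportError as exc:
        raise StrategyLoadError(
            f"Could not load strategy module {module_path!r}"
        ) from exc
```

The app could then catch StrategyLoadError in a single exception handler and return an error response for the request, instead of letting the failed import bring down the server.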

CasperWA commented 2 years ago

After further discussion, we considered what would be lost by switching to the new entry_points content.

  1. The strategy type will become unknown.
  2. The entry points are not package unique and may be unknowingly overwritten by other plugin packages.

The 1st point is not true: looking at the strategy scheme values, they are unique with respect to the strategy type. E.g., scheme for download types, and filterType for filter types.

The 2nd point is true. To mitigate this, we suggest adding back the Python package namespace as the first part of the entry point name. I.e., the example becomes:

[options.entry_points]
oteapi.scheme =
  oteapi.http = oteapi.strategies.download.https
  oteapi.https = oteapi.strategies.download.https
oteapi.mediaType =
  oteapi.image.jpg = oteapi.strategies.parse.image_jpeg
  oteapi.image.jpeg = oteapi.strategies.parse.image_jpeg
  oteapi.image.jp2 = oteapi.strategies.parse.image_jpeg
  # ...

Here, the entry point's name becomes <python importable package name>.<strategy name>, where the periods (.) in <strategy name> are replaced by forward slashes (/) when registering.
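The name convention could be parsed roughly as follows. split_entry_point_name() is a hypothetical helper, and the sketch assumes the package name is a single top-level name without periods (as with oteapi), so that the first period separates the package from the strategy name:

```python
from typing import Tuple


def split_entry_point_name(name: str) -> Tuple[str, str]:
    """Split '<package>.<strategy name>' and restore slashes in the strategy name.

    Assumption: the package name itself contains no periods.
    """
    package, _, strategy = name.partition(".")
    if not package or not strategy:
        raise ValueError(f"Expected '<package>.<strategy name>', got {name!r}")
    # Periods in the strategy name stand in for forward slashes.
    return package, strategy.replace(".", "/")


jpeg = split_entry_point_name("oteapi.image.jpeg")
https = split_entry_point_name("oteapi.https")
```

With this convention, oteapi.image.jpeg maps to the package oteapi and the strategy name image/jpeg, while oteapi.https maps to oteapi and https.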