ethyca / fidesops

Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.
https://ethyca.github.io/fidesops
Apache License 2.0
48 stars 16 forks source link

Consider refactoring factories for increased extensibility #562

Closed adamsachs closed 2 years ago

adamsachs commented 2 years ago

Is your feature request related to a specific problem?

When attempting to extend the fidesops masking framework outside of the core fidesops repo, it became clear that the existing framework made this difficult to do cleanly - in order to register one's "custom" MaskingStrategy subclass in the central masking_strategy_factory, one had to make code updates to the masking_strategy_factory file/class. https://github.com/ethyca/fidesops/issues/547 (and corresponding PR https://github.com/ethyca/fidesops/pull/560) refactored the masking_strategy_factory to provide a more extensible framework.

This feature request is to consider implementing a similar strategy for other factory-like frameworks (is there a more precise term for this?) within fidesops, if they may benefit from improved extensibility.

Impact Will make it much easier to add connectors because currently any changes that have to be done have to be implemented on core and this ticket will allow us to work outside of core.

Describe the solution you'd like

The decorator-based registration that was implemented for masking strategies in https://github.com/ethyca/fidesops/pull/560 seems to provide an extensible framework that's not burdensome on developers who wish to extend with more implementations (all that's needed is a simple and concise decorator on the extending subclass), while also providing for indexed lookups of subclasses (rather than iterating through a base class's .__subclasses__(), for example). The approach also seems relatively transparent and clear to a developer who wants to extend with their own subclass.

Note that with this approach, any extending subclasses will need to be loaded/imported, e.g. via an __init__.py -- in the existing frameworks, the import is necessarily done when the subclasses are explicitly referenced in the "registry" enums.

A quick overview of the current fidesops codebase points to these frameworks as potential targets of this enhancement:

Describe alternatives you've considered, if any

In some relatively brief research, I came across some other approaches for implementing the factory pattern with dynamic registries that can be extended without modifying the core factory code:

Additional context

adamsachs commented 2 years ago

noting that in implementation, we actually decided to use the "python builtin .subclasses() method" listed as an alternative. we chose this because it was a simpler implementation that relied more on builtin functionality.

that being said, the major tradeoff is that it incurs a linear lookup time for a given strategy, where the scale factor is the number of strategy subclasses of the strategy type that's being looked up. at this point, we expect that number to be trivial, but it may need to be re-examined if the number of strategy subclasses we've got in the system grows dramatically.