The FileAgentPluginRepository has outlived its usefulness. It is not capable of supporting repeated querying of certain plugin components in a performant way. This component should be replaced with a new MongoAgentPluginRepository. Note that most (all?) of the unit tests for FileAgentPluginRepository may be easily repurposed to test the new repository, since both repositories will share the same interface.
The new repository should store AgentPlugin objects as documents in MongoDB.
Note
Unlike the current repository, which loads plugins from files and parses Linux/Windows-specific plugins out of the plugin package, this repository will not need to do any parsing at retrieval time. Instead, the parsing should be done once, when the plugins are stored.
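A minimal sketch of the "parse at store time, not at retrieval time" idea follows. Everything in it is an assumption made for illustration: the `PluginStore` class, its method names, the string OS keys, and the toy `|`-separated package format are hypothetical, and a plain dictionary stands in for the MongoDB collection so that the example runs without a database.

```python
from typing import Callable, Dict, Tuple

# Hypothetical parser signature: raw package bytes -> {os_name: os_specific_archive}
PackageParser = Callable[[bytes], Dict[str, bytes]]


def toy_parse_package(package: bytes) -> Dict[str, bytes]:
    """Stand-in parser for a made-up format: b"<linux part>|<windows part>"."""
    linux_part, windows_part = package.split(b"|", 1)
    return {"linux": linux_part, "windows": windows_part}


class PluginStore:
    """Illustrative only; a dict plays the role of the MongoDB collection."""

    def __init__(self, parse_package: PackageParser) -> None:
        self._parse_package = parse_package
        self._documents: Dict[Tuple[str, str, str], bytes] = {}

    def store(self, plugin_type: str, name: str, package: bytes) -> None:
        # The package is parsed exactly once, here, and one entry per
        # operating system is written.
        for os_name, archive in self._parse_package(package).items():
            self._documents[(plugin_type, name, os_name)] = archive

    def get(self, plugin_type: str, name: str, os_name: str) -> bytes:
        # Retrieval is a plain key lookup; no parsing happens on this path.
        return self._documents[(plugin_type, name, os_name)]


store = PluginStore(toy_parse_package)
store.store("exploiter", "ssh", b"linux-bits|windows-bits")
linux_archive = store.get("exploiter", "ssh", "linux")
```

The point of the design is that repeated calls to `get` cost only a lookup, which is the property the file-based repository lacks.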
Tasks
[x] Implement MongoAgentPluginRepository from IAgentPluginRepository (0d) @cakekoa
[x] Figure out the implications of storing binary data in mongo
[x] store_agent_plugin
[x] remove_agent_plugin
[x] get_plugin
[x] get_all_plugin_configuration_schemas
[x] get_all_plugin_manifests
[x] Build the AgentPluginService with the mongo repository, instead of the file-based repository (Do not use the caching decorator) (0d) - @shreyamalviya
[x] Remove disused code (0d) - @shreyamalviya
[x] Remove the disused file-based repository
[x] Remove the disused caching decorator
[ ] Update server setup to install (IAgentPluginService.install_agent_plugin) all plugins (0d) - @shreyamalviya
[x] Use the data_dir convention (data_dir / PLUGIN_DIR_NAME)
Most of the downsides of storing binary data in Mongo relate to replication and backups: all of that data must be copied, which costs time and space
Another downside is that large binary documents could push other documents out of memory, hurting the performance of the database
If a document would exceed MongoDB's 16 MB document size limit, we can use GridFS to store the data
GridFS adds a bit of overhead compared to storing the data in the document
Binary data stored in GridFS is immutable. This is not a problem for us since we will only create, read, and delete (no need to update)
GridFS does not support multi-document transactions; however, search operations can iterate over multiple documents
GridFS uses a chunk size of 255 KB by default, and the last chunk is only as large as necessary. The chunk size is configurable; see pymongo’s GridFSBucket
GridFS stores binary files, so it cannot be queried like a regular document store. The full file would have to be read before it can be queried. However, one can add metadata to it. Apparently indexes
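To make the size figures above concrete, here is a small arithmetic sketch. The helper names `needs_gridfs` and `chunk_layout` are hypothetical, and the size check is only approximate because it ignores BSON encoding overhead; the constants are MongoDB's 16 MiB document limit and the 255 KiB default GridFS chunk size mentioned in the notes.

```python
import math
from typing import Tuple

BSON_MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's maximum BSON document size
DEFAULT_GRIDFS_CHUNK_BYTES = 255 * 1024     # default GridFS chunk size


def needs_gridfs(payload_bytes: int, other_fields_bytes: int = 0) -> bool:
    """Rough check: would the payload plus the other fields exceed one document?

    This ignores BSON encoding overhead, so treat it as an estimate only.
    """
    return payload_bytes + other_fields_bytes > BSON_MAX_DOCUMENT_BYTES


def chunk_layout(
    payload_bytes: int, chunk_bytes: int = DEFAULT_GRIDFS_CHUNK_BYTES
) -> Tuple[int, int]:
    """Return (number_of_chunks, size_of_last_chunk) for a payload."""
    if payload_bytes == 0:
        return (0, 0)  # an empty payload needs no chunks
    count = math.ceil(payload_bytes / chunk_bytes)
    # Every chunk but the last is full; the last holds the remainder.
    last = payload_bytes - (count - 1) * chunk_bytes
    return (count, last)


print(chunk_layout(1_000_000))  # (4, 216640): three full chunks plus a partial one
```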
Potential approach: Store the binary data in GridFS, and store non-binary fields in metadata (non-generalizable)
Potential approach: Store the document in GridFS, and add the non-binary fields to metadata (generalizable, easy)
Alternative approaches:
Store the metadata and the binary data in different collections (potentially GridFS)
Store the metadata in mongo, and store the binary data in the filesystem (potentially faster)
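The first potential approach (binary data in GridFS, non-binary fields in the file's metadata) could look roughly like the sketch below. The function names and the shape of the metadata dictionary are hypothetical. The `bucket` argument is assumed to behave like pymongo's `gridfs.GridFSBucket`, of which only `upload_from_stream`, `open_download_stream`, and `delete` are used; the code is duck-typed so that it imports nothing from pymongo itself.

```python
import io
from typing import Any, Dict, Tuple


def store_plugin(bucket: Any, name: str, archive: bytes, metadata: Dict[str, Any]) -> Any:
    """Upload the binary archive and attach the non-binary fields as metadata.

    Returns whatever file id the bucket assigns.
    """
    return bucket.upload_from_stream(name, io.BytesIO(archive), metadata=metadata)


def load_plugin(bucket: Any, file_id: Any) -> Tuple[bytes, Dict[str, Any]]:
    """Read the archive back together with its metadata."""
    grid_out = bucket.open_download_stream(file_id)
    return grid_out.read(), grid_out.metadata


def remove_plugin(bucket: Any, file_id: Any) -> None:
    """Delete the stored file. GridFS files are immutable, so the repository
    only ever needs create, read, and delete."""
    bucket.delete(file_id)
```

Against a real database, `bucket` would be constructed along the lines of `gridfs.GridFSBucket(database)`. Whether the manifest and configuration schema fit comfortably in the metadata field is an open question this sketch does not settle.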