Closed mattporritt closed 7 years ago
I need to do some more research here. Sub plugins for search may require a core patch, but I need to confirm.
I think overall we need to move to a sub plugin architecture here. Doing so is a trade-off between taking on technical debt and spending the effort now. I haven't decided which way to go yet.
I think a good approach here is to have a range of sub plugins that each declare to the parent which file MIME types they support. File indexing is then handled by the appropriate enabled sub plugin.
Eventually I would like to see the following sub plugins as a start:
- no op - plain text files don't need conversion, just need to get content as a string. Code already exists in the plugin to do this, just need to refactor it out.
- external Tika - extract content of files using an external Tika service. Code already exists in the plugin to do this, just need to refactor it out.
- AWS Rekognition - image file content is extracted using AWS AI as a service. Code already exists in the plugin to do this, just need to refactor it out.
- Elasticsearch Tika - file content is converted to base64 ready for sending to Elasticsearch instances with the ingest plugin. TODO
- Moodle doc converter - files are sent to the core Moodle conversion API for text extraction. TODO
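As a rough illustration of the idea above, here is a minimal sketch of sub plugins declaring their MIME types to a parent that then dispatches each file to the matching one. Every name in it (`file_extractor`, `noop_extractor`, `base64_extractor`, `extractor_registry`) is hypothetical and not taken from the plugin or from Moodle core, and the "first registered wins" rule is just an assumption for the example.

```php
<?php
// Hypothetical sketch: none of these names come from the plugin or Moodle core.

/** Contract each extraction sub plugin would implement. */
interface file_extractor {
    /** @return string[] MIME types this extractor can handle. */
    public function supported_mimetypes(): array;

    /** @return string Text extracted from the raw file content. */
    public function extract(string $content): string;
}

/** "No op" extractor: plain text needs no conversion. */
class noop_extractor implements file_extractor {
    public function supported_mimetypes(): array {
        return ['text/plain', 'text/csv'];
    }

    public function extract(string $content): string {
        return $content;
    }
}

/** Stand-in for the Elasticsearch ingest case: base64-encode the content. */
class base64_extractor implements file_extractor {
    public function supported_mimetypes(): array {
        return ['application/pdf'];
    }

    public function extract(string $content): string {
        return base64_encode($content);
    }
}

/** Parent-side registry mapping MIME types to enabled extractors. */
class extractor_registry {
    /** @var file_extractor[] Extractors keyed by MIME type. */
    private $bymimetype = [];

    public function register(file_extractor $extractor): void {
        foreach ($extractor->supported_mimetypes() as $mimetype) {
            // Assumption: the first registered extractor wins for a MIME type.
            if (!isset($this->bymimetype[$mimetype])) {
                $this->bymimetype[$mimetype] = $extractor;
            }
        }
    }

    /** @return string|null Extracted text, or null if no extractor supports the type. */
    public function extract(string $mimetype, string $content): ?string {
        if (!isset($this->bymimetype[$mimetype])) {
            return null;
        }
        return $this->bymimetype[$mimetype]->extract($content);
    }
}

$registry = new extractor_registry();
$registry->register(new noop_extractor());
$registry->register(new base64_extractor());
```

With this shape, swapping AWS for another image recognition service would only mean registering a different extractor for the image MIME types. Files with no matching extractor return null, so the indexer could skip them or index metadata only.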
Yep, a core mod is required to support subplugins... It is a one-liner in lib/classes/component.php though
/** @var array list plugin types that support subplugins, do not add more here unless absolutely necessary */
protected static $supportsubplugins = array('mod', 'editor', 'tool', 'local');
would need to become
protected static $supportsubplugins = array('mod', 'editor', 'tool', 'local');
I assume you meant to change the line to
protected static $supportsubplugins = array('mod', 'editor', 'tool', 'local', 'search');
Hacking this line in core just to run the search plugin might be tricky for some admins, but perhaps it would also be accepted for core if there are valid reasons for subplugins in search plugins.
However, I'm wondering whether the subplugins you have mentioned (Tika, AWS Rekognition) really have to be subplugins of the search plugin. If I understand it correctly, search_elastic is for searching and indexing content, while Tika and AWS Rekognition are for extracting content from files. As you are already aligning the Moodle doc converter with Tika and AWS Rekognition in your list, wouldn't it make sense to implement Tika and AWS Rekognition as file converter plugins instead of subplugins of the search plugin?
Thanks, Alex
I'm closing this as a won't fix. Sub plugins are not the way to go here.
Instead, I will use the file converter API that arrived in Moodle 3.3: https://docs.moodle.org/dev/File_Converters The features I've mentioned will be implemented as file converter plugins.
I've created Issue #25 for this work
Refactor the plugin to use a subplugin architecture. The Tika text extraction and AWS Rekognition image recognition features should be sub plugins. This would make it easier to manage and create integrations with other services, for example to use Google's image recognition service instead of AWS.