This PR onboards an out-of-the-box (OOTB) neural sparse search workflow configuration. It adds a base template and the base set of workspace nodes/edges, and refactors the parsing logic to produce a usable workflow template that can provision an ingest pipeline and index (and, optionally, a pretrained neural sparse model). More specifically:
Adds OOTB pretrained sparse encoder models and exposes them as options in ModelField
Adds the SparseEncoderTransformer ML transformer component
Refactors Ingestor and QueryExecutor slightly to handle the neural sparse use case
Refactors workflow_to_template_utils to handle the neural sparse use case
Adds a neural sparse template so that it appears in the "Create new workflow" tab
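For reviewers less familiar with neural sparse search, the following is a minimal sketch of the kind of ingest pipeline and index mapping such a template might provision. The function names and field names are hypothetical and not taken from this PR; only the `sparse_encoding` ingest processor and the `rank_features` field type are standard OpenSearch constructs.

```typescript
// Illustrative only: these helpers are hypothetical and do not mirror the
// actual code in workflow_to_template_utils.

interface SparseEncodingPipeline {
  description: string;
  processors: Array<{
    sparse_encoding: {
      model_id: string;
      // Maps the source text field to the field that stores the sparse vector.
      field_map: Record<string, string>;
    };
  }>;
}

// Builds an ingest pipeline body that uses the OpenSearch `sparse_encoding` processor.
function buildSparseEncodingPipeline(
  modelId: string,
  inputField: string,
  outputField: string
): SparseEncodingPipeline {
  return {
    description: 'Neural sparse ingest pipeline (illustrative)',
    processors: [
      {
        sparse_encoding: {
          model_id: modelId,
          field_map: { [inputField]: outputField },
        },
      },
    ],
  };
}

// Builds index mappings in which the sparse vector field uses `rank_features`.
function buildSparseIndexMappings(
  inputField: string,
  outputField: string
): { properties: Record<string, { type: string }> } {
  return {
    properties: {
      [inputField]: { type: 'text' },
      [outputField]: { type: 'rank_features' },
    },
  };
}
```

The model ID passed in would be either an existing deployed model or the ID produced when the pretrained model is provisioned as part of the same workflow.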
Other minor changes, mostly related to readability:
Adds a Document UI component to make the end-to-end ingest data flow easier to follow
Adds logic to parse the edges and include only the relevant ones in the downstream Workflow template (e.g., an edge to/from the UI-specific Document component should be ignored in the backend template)
Makes minor changes to component names and inputs/outputs to better match the data flow
Removes the create vs. existing tabs in the component details component; for now, the scope is creation only
Removes the 'Search' meta block in the drag-and-drop (DnD) workspace
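The edge filtering described above could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: the `WorkspaceNode` and `WorkspaceEdge` shapes, the `UI_ONLY_NODE_TYPES` set, and the `filterBackendEdges` name are all hypothetical.

```typescript
// Hypothetical, simplified shapes for workspace nodes and edges.
interface WorkspaceNode {
  id: string;
  type: string; // e.g. 'document', 'ml_transformer', 'indexer'
}

interface WorkspaceEdge {
  id: string;
  source: string; // id of the source node
  target: string; // id of the target node
}

// Node types that exist only for UI readability and have no backend
// counterpart (assumed here to be just the Document component).
const UI_ONLY_NODE_TYPES = new Set<string>(['document']);

// Keeps only edges whose endpoints are both backend-relevant nodes.
function filterBackendEdges(
  nodes: WorkspaceNode[],
  edges: WorkspaceEdge[]
): WorkspaceEdge[] {
  const uiOnlyIds = new Set(
    nodes.filter((n) => UI_ONLY_NODE_TYPES.has(n.type)).map((n) => n.id)
  );
  return edges.filter(
    (e) => !uiOnlyIds.has(e.source) && !uiOnlyIds.has(e.target)
  );
}
```

Filtering by node type rather than by edge ID keeps the check independent of how edges happen to be named in the workspace.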
Testing:
Fixed a bug where the model ID did not propagate correctly to the ingest pipeline in certain edge cases
Verified that both pretrained and existing models work for the semantic search use case
Verified that both pretrained and existing models work for the neural sparse search use case
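To exercise the neural sparse case manually, a search request along the following lines can be sent against the provisioned index. This is a hedged sketch: the `buildNeuralSparseQuery` helper and the example field name are hypothetical, while the `neural_sparse` query clause with `query_text` and `model_id` is the standard OpenSearch query DSL form.

```typescript
// Hypothetical helper: builds a search body using the OpenSearch
// `neural_sparse` query clause for a given sparse vector field.
interface NeuralSparseQueryBody {
  query: {
    neural_sparse: Record<string, { query_text: string; model_id: string }>;
  };
}

function buildNeuralSparseQuery(
  vectorField: string,
  queryText: string,
  modelId: string
): NeuralSparseQueryBody {
  return {
    query: {
      neural_sparse: {
        // The key is the field that holds the sparse encoding at ingest time.
        [vectorField]: {
          query_text: queryText,
          model_id: modelId,
        },
      },
    },
  };
}
```

The same model ID used in the ingest pipeline should be supplied here, which is also a quick way to confirm that the model ID propagated as expected.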
Demo video:
Creating and provisioning a neural sparse workflow using a pretrained sparse encoding model provided by the ML Commons plugin
[x] Commits are signed per the DCO using --signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Issues Resolved
Makes progress on #68