Open pbartusch opened 3 years ago
The major goal of this story is to:
Constraints:
v4 is suggested. Prior versions of SMUI need to be adjusted to that new configuration specification.
SMUI's deployment options (plan):
local deployment: for DEV setup, no need for a Querqy-enabled search engine.
solr-local deployment: evolution of conf/smui2solr.sh.
git-repository deployment: evolution of conf/smui2git.sh.
elasticsearch deployment: new deployment procedure to support Elasticsearch.
The following refactoring steps are suggested in order to sustain maintainability for SMUI with respect to the deployment options:
conf/deployment). Renaming should be done accordingly (e.g. conf/deployment/solr-local.sh, see above).
test/services/RulesTxtDeploymentServiceConfigVariantsSpec.scala).
app/models/SolrIndex.scala and all of its references),
/api/v1/solr-index, see conf/routes),
app/models/FeatureToggleModel.scala into an explicit deployment model under app/models/config.
app/services/RulesTxtDeploymentService.scala accordingly.
export folder for rules.txt files (as a default to all the deployment scripts above).
local should be the default deployment (especially for a "Quickstart", see documentation).
Explicit deployment configuration:
v3. Those should be resolved (see above) and made explicit (using custom parameters specific to the deployment procedure), e.g.:
smui.deployment.PRELIVE = {
  'procedure': 'conf/deployment/git-repository.sh',
  'params': {
    'repo': 'https+ssh://my-repo-on.domain.tld'
    ...
  }
}
{SMUI_DEPLOYMENT_PROCEDURE}.sh {DEPLOYMENT_INSTANCE} {RULES_COLLECTION_NAME} {EXPORT_PATH} {RULES_TXT_FILE(S) as ordered comma separated list} {PROCEDURE_SPECIFIC_PARAMS as --key=value}
e.g.:
git-repository.sh PRELIVE ecommerce /export common-rules.txt,decompound-rules.txt,spelling-rules.txt --repo=https+ssh://my-repo-on.domain.tld ...
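To make the calling convention above concrete, here is a minimal sketch of how a deployment entry could be turned into the argument list for a script-based procedure. It is only an illustration under stated assumptions: the function name `build_deployment_command` and the Python dict standing in for the `smui.deployment.PRELIVE` entry are hypothetical, and SMUI's actual (Scala) implementation may look quite different.

```python
import shlex


def build_deployment_command(instance, deployment, rules_collection, export_path, rules_files):
    """Assemble the argument list for a script-based deployment procedure.

    Order follows the pattern described in the issue: script, deployment
    instance, rules collection name, export path, rules.txt files as an
    ordered comma-separated list, then procedure-specific --key=value params.
    """
    argv = [
        deployment["procedure"],
        instance,
        rules_collection,
        export_path,
        ",".join(rules_files),  # the order of the files is preserved
    ]
    # Python dicts keep insertion order, so params appear as configured.
    argv += [f"--{key}={value}" for key, value in deployment.get("params", {}).items()]
    return argv


# Hypothetical in-memory equivalent of the smui.deployment.PRELIVE entry.
prelive = {
    "procedure": "conf/deployment/git-repository.sh",
    "params": {"repo": "https+ssh://my-repo-on.domain.tld"},
}

argv = build_deployment_command(
    "PRELIVE",
    prelive,
    "ecommerce",
    "/export",
    ["common-rules.txt", "decompound-rules.txt", "spelling-rules.txt"],
)
print(shlex.join(argv))
```

Passing the result as a list (rather than a single shell string) to something like `subprocess.run` would avoid quoting problems with parameter values.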
smui.deployment.PRELIVE = {
  'procedure': 'services.deployment.ElasticsearchDeployment',
  'params': {
    'url': 'https://my-elasticsearch-instance-on.domain.tld'
    ...
  }
}
smui.conf file (like Chorus does it). There should be no env var option as the configuration is too complex (local deployment will remain default).
v4 - is breaking it should be considered to split configuration into "setup" & "customisation" in general, where only "setup" configurations can be controlled via env vars, and all "customisation" configurations should be done via a smui.conf (see above, this could account for e.g. toggle.activate-spelling). This should include the tag configuration, now being done via an explicit, extra JSON file.
Note: At the time of planning this major change, SMUI refactorings (splitting frontend & backend implementation) are taking place. The following branches are relevant:
I'm planning on removing the jackhanna script in favour of the single upload capability for ConfigSets, which should probably be how the zk-solr-cloud.sh interacts with Solr! Maybe rename it to solr-cloud.sh? See https://github.com/querqy/chorus/issues/22.
@epugh @pbartusch Please keep in mind that https://github.com/querqy/querqy/issues/76 will be a breaking change: the rules.txt will no longer be deployed as such, but the rules will be embedded into a JSON HTTP request (very similar to Querqy for ES). Also, the direct interaction with ZK or any direct interaction with the configset will be removed (and the collection reload as well).
It is very likely that we can test a release candidate in production as soon as January. I think we need this kind of 'beta version' this time given the scope of the change.
Long story short: please do not invest any time into making the current deployment of rules.txt to Solr better - it will be replaced very soon.
@renekrie , thanks for the hint.
Long story short: please do not invest any time into making the current deployment of rules.txt to Solr better - it will be replaced very soon.
that is not the plan. the focus of the concept described above lies on different deployment options in general.
that is not the plan. the focus of the concept described above lies on different deployment options in general.
@pbartusch I was a bit worried because earlier you said:
Chorus should be adjusted to the newly adopted zk-solr-cloud deployment procedure as a first proof of concept.
I assume that zk-solr-cloud deployment will become outdated very soon.
Ah, got it. OK, it wasn't meant to be the focus, but I understand the concern. Thanks, @renekrie.
Then it seems better to make the smui2solrcloud.sh a proof of concept for a custom deployment procedure. I will adjust https://github.com/querqy/smui/issues/56#issuecomment-745107324 accordingly.
@epugh , now I got your point as well. Regarding:
[...] in favour of the single upload capability for ConfigSets, which should probably be how the zk-solr-cloud.sh interacts with Solr
I suggest adding this deployment procedure (once it's available in Solr/Querqy) to SMUI instead of Chorus, as the solr-cloud.sh you suggested.
I will not make this part of this issue/story (obviously ;-)), but we should develop it within the scope of SMUI and adjust Chorus accordingly.
@renekrie, will the solr-local deployment procedure remain possible in Solr? (meaning: cp the rules.txt and then perform a core reload)
Or will that be deprecated as well?
This will be the same HTTP call as for SolrCloud.
Just a heads-up: I've just merged a PR for https://github.com/querqy/querqy/issues/116 to querqy-core.
This would give you the option to manage ES/Solr specifics via templates in the rules file. For example, a down boost on a field could look like this:
notebook =>
  UP(10): asus
  << field_down: factor=20 || fieldname=category || value=accessories >>
At the beginning of the file, you would have to prepend the search-engine-specific template:
# either Solr:
def field_down(factor, fieldname, value):
  DOWN($factor): * $fieldname:($value)
# or Elasticsearch:
def field_down(factor, fieldname, value):
  DOWN($factor): * "match": { "$fieldname": { "query": "$value" }}
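As a rough illustration of what such a template call could expand to, the sketch below substitutes the call's arguments into the engine-specific body using Python's `string.Template`. This is not Querqy's implementation: the `TEMPLATES` table and the `expand_call` helper are hypothetical, only the single `field_down` template is supported, and the Solr body is assumed to reference the value parameter as `$value`.

```python
from string import Template

# Hypothetical registry: one body per search engine for the field_down template.
TEMPLATES = {
    "solr": "DOWN($factor): * $fieldname:($value)",
    "elasticsearch": 'DOWN($factor): * "match": { "$fieldname": { "query": "$value" }}',
}


def expand_call(line, engine):
    """Expand '<< field_down: k=v || k=v >>' with the body for the given engine."""
    inner = line.strip()
    if not (inner.startswith("<<") and inner.endswith(">>")):
        raise ValueError("not a template call: " + line)
    name, _, arg_text = inner[2:-2].partition(":")
    if name.strip() != "field_down":
        raise KeyError(name.strip())  # only one template in this sketch
    args = {}
    for pair in arg_text.split("||"):
        key, _, value = pair.partition("=")
        args[key.strip()] = value.strip()
    # substitute() raises KeyError if the call omits a parameter.
    return Template(TEMPLATES[engine]).substitute(args)


call = "<< field_down: factor=20 || fieldname=category || value=accessories >>"
solr_rule = expand_call(call, "solr")
es_rule = expand_call(call, "elasticsearch")
print(solr_rule)
print(es_rule)
```

The point of the design is that the rule line itself stays identical for both engines; only the prepended template body changes.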
If it helps, we could probably add docstring documentation to the templates à la:
def field_down(factor, fieldname, value):
  """Use this to penalise documents that contain a certain value in the specified field.
  :param factor: the penalisation factor
  :param fieldname: the field name
  :param value: the field value
  :type factor: float
  :type fieldname: string
  :type value: string
  """
  DOWN($factor): * $fieldname:($value)
This would probably enable SMUI to generate a form input in the UI from the template. At the most advanced end, we could let users create and manage their own templates in SMUI, including for more complex function queries.
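One way the UI side of this could work is to read the `:param` and `:type` lines from the template's docstring and turn them into form field descriptions. The sketch below is purely illustrative: the output shape (`name`, `label`, `type`) and the helper `form_fields` are assumptions, not anything SMUI or Querqy defines, and the default type of `string` for undocumented parameters is likewise an assumption.

```python
import re

# The documented template from the comment above, kept as plain text.
TEMPLATE_SOURCE = '''def field_down(factor, fieldname, value):
"""Use this to penalise documents that contain a certain value in the specified field.
:param factor: the penalisation factor
:param fieldname: the field name
:param value: the field value
:type factor: float
:type fieldname: string
:type value: string
"""
DOWN($factor): * $fieldname:($value)'''

PARAM_LINE = re.compile(r"^\s*:param (\w+): (.*)$", re.MULTILINE)
TYPE_LINE = re.compile(r"^\s*:type (\w+): (.*)$", re.MULTILINE)


def form_fields(source):
    """Return one form field description per documented template parameter."""
    types = {name: type_name.strip() for name, type_name in TYPE_LINE.findall(source)}
    return [
        # Parameters without a :type line fall back to "string" (an assumption).
        {"name": name, "label": label.strip(), "type": types.get(name, "string")}
        for name, label in PARAM_LINE.findall(source)
    ]


fields = form_fields(TEMPLATE_SOURCE)
for field in fields:
    print(field)
```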
Do you think it might be useful to have the ability to define a raw query to a rule as well (i.e. everything after the '*')? E.g. as a specific option in the UI instead of choosing from suggested fields and putting a field for a value. The advantage would be to enable basically all use cases for rules through SMUI. It could enable Elastic Rules completely as a first step and circumvent the templates discussion and similar approaches. Tradeoff being the higher risk of human error when writing raw query syntax unless there is validation added to these inputs.
Update: It seems to be already possible through toggle.ui-concept.all-rules.with-solr-fields=false which renders the Term as is and does not throw any validation errors. So import, UI edit, export seems to be all working with Elastic Rules.
@pbartusch Is there some activity planned on this issue? While refactoring, could the concept of SOLR_BASE_URL (e.g. http://localhost:8983/solr) be considered in place of SOLR_HOST (which then gets hardcoded into the base URL)? The advantage of SOLR_BASE_URL would be that it enables the customer to use http or https (and possibly a different application root replacing "/solr").
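A small sketch of how the two settings could coexist during a transition: prefer an explicit SOLR_BASE_URL and fall back to building the URL from SOLR_HOST. The fallback format `http://<host>/solr` is an assumption about the current hardcoded behaviour, and `resolve_solr_base_url` is a hypothetical helper, not part of SMUI.

```python
def resolve_solr_base_url(env):
    """Pick the Solr base URL from a mapping of environment variables."""
    base_url = env.get("SOLR_BASE_URL")
    if base_url:
        # An explicit base URL wins: scheme and application root are free to vary.
        return base_url.rstrip("/")
    host = env.get("SOLR_HOST")
    if host:
        # Assumed legacy behaviour: plain http and a fixed "/solr" root.
        return f"http://{host}/solr"
    raise KeyError("set SOLR_BASE_URL or SOLR_HOST")


legacy = resolve_solr_base_url({"SOLR_HOST": "localhost:8983"})
explicit = resolve_solr_base_url({"SOLR_BASE_URL": "https://search.example.org/custom/"})
print(legacy)
print(explicit)
```

In a real setup the mapping passed in would typically be `os.environ`.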
See #82, which is specific to @pbartusch's comment back in December 2020!
Deployment possibilities for SMUI have grown rapidly. The configuration is hard to understand & corresponding code is hard to maintain - this includes:
especially.
Approach:
Step#1: document all deployment possibilities that should be supported by SMUI (already taking future Elasticsearch support, #43, into account).
Step#2: derive a config schema (for application.conf).
Step#3: refactor the code (breaking change).