EvidentSolutions / elasticsearch-analysis-raudikko

Finnish language analysis for Elasticsearch using Raudikko
GNU General Public License v3.0
10 stars 8 forks source link

Support for OpenSearch? #10

Closed back-2-95 closed 1 year ago

back-2-95 commented 1 year ago

As ElasticSearch 7 and 8 will be kinda last "opensource" versions, are you gonna support OpenSearch for the plugin?

I just briefly tried to install the plugin to ES7 docker image and got the following error:

Exception in thread "main" java.lang.IllegalArgumentException: property [opensearch.version] is missing for plugin [elasticsearch-analysis-raudikko]

Here is my Dockerfile:

FROM uselagoon/opensearch-2:latest as opensearch

# See https://github.com/EvidentSolutions/elasticsearch-analysis-raudikko/releases
# for suitable version
ARG ES_VERSION=7.10.2
ARG RAUDIKKO=https://github.com/EvidentSolutions/elasticsearch-analysis-raudikko/releases/download/v0.1.1/elasticsearch-analysis-raudikko-0.1.1-es${ES_VERSION}.zip

RUN opensearch-plugin install ${RAUDIKKO}

RUN opensearch-plugin list
komu commented 1 year ago

Yeah, I think it makes sense to support both as long as it's easily done. Currently I'm not actively involved in any Elastic/OpenSearch -work, but I would be happy to take a PR.

back-2-95 commented 1 year ago

Let's assume the plugin would just need that missing property:

komu commented 1 year ago

build.gradle.kts defines a task distributionZip which is registered as an artifact, so if you just run ./gradlew assemble you'll end with the plugin zip in build/distributions directory.

Runtime-wise the changes probably involves just adding the property in the plugin, but since normally the same source gets tested an built against multiple different versions of Elesticsearch, we probably want to do that here as well. So we want to be able to pass in a build property that builds and tests against OpenSearch artifacts and produces a zip built for OpenSearch.

Then I suppose we want to add jobs in .github/workflows/build.action and .github/workflows/release.action for making sure that the artifacts get built whenever we build the apps. I can make those changes so you don't have to bother about learning that setup.

So I suppose we'd want that running:

./gradlew build -DopensearchVersion=X.Y.Z

runs and tests the code against proper version of OpenSearch and then builds elasticsearch-analysis-raudikko-0.1-opensearch-X.Y.Z.zip (or perhaps opensearch-analysis-raudikko-osX.Y.Z.zip?). If I'll get a pull version that does that, I can then integrate it into the build.

Regarding the name of the zip, I'm on the fence. I think I prefer the former name as it's really the same project and would make it easier for people to find the right project if they bump into the zip. But of course the name is a bit silly. But still, I guess that being explicit that this is just another profile of the same codebase is more important than avoiding silliness. Any opinions?

back-2-95 commented 1 year ago

Regarding the name: you mean using os? Yes, I noticed the weirdness in that when talking internally about es and os.

back-2-95 commented 1 year ago

I can try the create PR with these instructions.

komu commented 1 year ago

Yeah, nowadays the Elasticsearch version is named elasticsearch-analysis-raudikko-0.1.2-es-X.Y.Z.zip, which kinda made sense back in the days, "es" probably associating to "elasticsearch" in that context.

But I think that naming producing an artifact like elasticsearch-analysis-raudikko-0.1.2-osX.Y.Z.zip does nobody any favors, I'm specifically thinking of someone who is trying to pick the correct zip to download from release page:

Screenshot 2023-03-10 at 10 19 23

So if we are going to have the string elasticsearch in the artifact, than it's probably a good idea to include opensearch in full as well, instead of just writing os. But if the artifact name would start with opensearch, then perhaps repeating it in full is not that useful.

So I meant the two options as I wrote them, but of course I'm open to other ideas as well.

back-2-95 commented 1 year ago

I'm looking at this and I'm thinking that if I build elasticsearch and opensearch zips... do they include somehow different things OR is it basically the same zip saying "I can play with both"?

back-2-95 commented 1 year ago

Things got escalated fast and this is what happened: https://github.com/EvidentSolutions/elasticsearch-analysis-raudikko/pull/11

komu commented 1 year ago

I suppose that the contents of the zips end up almost identical for different versions. But at least they have been tested against the various systems. We have to build multiple zip-files even for different versions of Elasticsearch, because if a plugin is built fo Elasticsearch X.Y.Z, Elasticsearch X.Y.Z+1 will refuse to load it. For release 0.1.1 we ended up building for 8 different versions of Elasticsearch.

I understand that when plugins don't work, some users will contact Elasticsearch for support and they don't want to support plugins that are untested against the specific version, but not committing to any API stability even at the patch-level feels bad. But nothing we can do about that...

Anyway, I think that it's easiest to make it that given build only builds one of them. So something like:

val opensearchVersion = System.getProperty("opensearchVersion")

val versionSuffix = if (opensearchVersion == null) "es$elasticsearchVersion" else "$opensearch-$opensearchVersion"
version = "$baseVersion-$versionSuffix"
...

depedencies {
    implementation("fi.evident.raudikko:raudikko:0.1.2")

    if (opensearchVersion != null) {
        ... open search artifacts ...
    } else {
        compileOnly("org.elasticsearch:elasticsearch:$elasticsearchVersion")
        testImplementation("org.elasticsearch:elasticsearch:$elasticsearchVersion")
    }
}
marisgithub commented 1 year ago

Hi @komu ,

Any update on this topic? Just ran in same problem and since March it seems there is no "movement" to separate repo opensearch-analysis-raudikko.

Would love to help you with PR, but I am not an expert in this field, so any input from you is appreciated!

Thank you!

P.S. Maybe @back-2-95 You could help?

komu commented 1 year ago

Hi,

Sorry for the delay, I managed to stay completely away from GitHub for my summer vacation.

There was very little to do except just creating a new repository for the opensearch-version. Since there's very little code in the plugin and the APIs between OpenSearch and Elasticsearch and completely incompatible, I decided that it's simplest to create a new project, as described in my comment (https://github.com/EvidentSolutions/elasticsearch-analysis-raudikko/pull/11#issuecomment-1463664794) to the earlier PR.

I just never got around to do that before now.

So now there's a new repository: https://github.com/EvidentSolutions/opensearch-analysis-raudikko

I just published version 0.1.0 against OpenSearch 2.5.0 (which is what the original PR had), but realized that there have been several OpenSearch releases since and like Elasticsearch, OpenSearch requires that the plugin is built for the specific release that is targeted. So I build the bits for the latest 2.9.0 as well.

So now you should be able to download plugin from the release page and install it. If you are using another version of Opensearch, please let me now and I'll build a release for that as well.

So @back-2-95 and @marisgithub, if you could test the plugin and let me know if it works (and especially if it doesn't) it'd be great. I did install it on a dockerized version of OpenSearch and checked that calls to analyze-API return sensible results, but I didn't try to actually use it on an index.

Thanks for your patience.

marisgithub commented 1 year ago

Thanks a lot @komu, will let you know about results (good or bad)!

back-2-95 commented 1 year ago

Hi @komu and thanks! I'd like to ask release for 2.8.0 as our hosting partner has image only for that https://github.com/uselagoon/lagoon-images/blob/main/images/opensearch/2.Dockerfile

komu commented 1 year ago

@back-2-95 Sure! It should now be available on the release-page. For good measure I also built versions for 2.6.0 and 2.7.0 as well.

back-2-95 commented 1 year ago

Thx! Now the installation went through just fine:

docker compose build opensearch --progress plain                                                                                                                       
#1 [opensearch internal] load build definition from Dockerfile
#1 transferring dockerfile: 1.11kB done
#1 DONE 0.0s

#2 [opensearch internal] load .dockerignore
#2 transferring context: 188B done
#2 DONE 0.0s

#3 [opensearch internal] load metadata for docker.io/uselagoon/opensearch-2:latest
#3 ...

#4 [opensearch auth] uselagoon/opensearch-2:pull token for registry-1.docker.io
#4 DONE 0.0s

#3 [opensearch internal] load metadata for docker.io/uselagoon/opensearch-2:latest
#3 DONE 1.5s

#5 [opensearch opensearch 1/3] FROM docker.io/uselagoon/opensearch-2:latest@sha256:90b71dd70e53bc1eca9a360adb10ea8740daba137087e192b980f40f50d5e93a
#5 resolve docker.io/uselagoon/opensearch-2:latest@sha256:90b71dd70e53bc1eca9a360adb10ea8740daba137087e192b980f40f50d5e93a done
#5 CACHED

#6 [opensearch opensearch 2/3] RUN opensearch-plugin install https://github.com/EvidentSolutions/opensearch-analysis-raudikko/releases/download/v0.1.0/opensearch-analysis-raudikko-0.1.0-os2.8.0.zip
#6 0.440 -> Installing https://github.com/EvidentSolutions/opensearch-analysis-raudikko/releases/download/v0.1.0/opensearch-analysis-raudikko-0.1.0-os2.8.0.zip
#6 0.440 -> Downloading https://github.com/EvidentSolutions/opensearch-analysis-raudikko/releases/download/v0.1.0/opensearch-analysis-raudikko-0.1.0-os2.8.0.zip
[=================================================] 100%??
#6 1.711 -> Installed opensearch-analysis-raudikko with folder name opensearch-analysis-raudikko
#6 DONE 1.8s

#7 [opensearch opensearch 3/3] RUN opensearch-plugin list
#7 0.464 opensearch-analysis-raudikko
#7 0.464 opensearch-anomaly-detection
#7 0.464 opensearch-asynchronous-search
#7 0.464 opensearch-geospatial
#7 0.464 opensearch-job-scheduler
#7 0.465 opensearch-knn
#7 0.465 opensearch-ml
#7 0.465 opensearch-neural-search
#7 0.465 opensearch-performance-analyzer
#7 0.465 opensearch-security-analytics
#7 0.465 opensearch-sql
#7 DONE 0.5s

#8 [opensearch] exporting to docker image format
#8 exporting layers 0.1s done
#8 exporting manifest sha256:f69f8517a509f95ca8172f069f1534464eb0b52bc144bf29b02eee83a0d2c21d
#8 exporting manifest sha256:f69f8517a509f95ca8172f069f1534464eb0b52bc144bf29b02eee83a0d2c21d done
#8 exporting config sha256:98ce667e8fec51f29b49a87e6a021f51064483105990a2bd170bda295a12bb99 done
#8 sending tarball
#8 sending tarball 11.3s done
#8 DONE 11.4s

#9 [opensearch opensearch] importing to docker
#9 DONE 6.8s