opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[META]: SIMD adoption in OpenSearch #9423

Open heemin32 opened 1 year ago

heemin32 commented 1 year ago

Is your feature request related to a problem? Please describe.

Lucene 9.7.0 introduced support for the incubating Vector API from Java 20, which uses SIMD hardware on x86 (AVX2 or later) and ARM NEON platforms. The feature is disabled by default. To enable it, an OpenSearch user must pass a command-line parameter at launch, and OpenSearch must be running on JDK 20 or 21. To take advantage of the SIMD optimization, OpenSearch first needs to run with JDK-21 by default and with the SIMD modules enabled. This issue tracks the items needed for SIMD enablement and potential areas of improvement.
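As a rough illustration of the two preconditions described above (a recent enough JDK and the incubating module being enabled), here is a minimal sketch that checks both from inside a running JVM. The class name `SimdReadiness` and its helper methods are hypothetical and not part of OpenSearch or Lucene. The launch parameter is assumed to be the standard JVM option `--add-modules jdk.incubator.vector`; how that option is passed to OpenSearch (for example through its JVM options configuration) is not covered here.

```java
public class SimdReadiness {
    // Name of the JDK module that hosts the incubating Vector API.
    public static final String VECTOR_MODULE = "jdk.incubator.vector";

    // True when a JDK feature version (e.g. 17, 20, 21) is at least `minimum`.
    public static boolean meetsMinimumJdk(int feature, int minimum) {
        return feature >= minimum;
    }

    // True when the incubator module was resolved into the boot layer,
    // which is assumed to happen when the JVM is started with
    // --add-modules jdk.incubator.vector.
    public static boolean vectorModuleEnabled() {
        return ModuleLayer.boot().findModule(VECTOR_MODULE).isPresent();
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();
        System.out.println("JDK feature version: " + feature);
        // The issue text states that JDK 20 or 21 is required.
        System.out.println("JDK >= 20: " + meetsMinimumJdk(feature, 20));
        System.out.println(VECTOR_MODULE + " enabled: " + vectorModuleEnabled());
    }
}
```

Running it once with `java SimdReadiness.java` and once with `java --add-modules jdk.incubator.vector SimdReadiness.java` should show the difference in the last line on a JDK that ships the incubator module.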

Additional context

K-NN performance comparison (https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool)

1. Dataset: SIFT (128 dimensions, 1M docs)

| | jdk17 | jdk20-simd | diff |
|---|---|---|---|
| total ingest time (ms) | 173861.47249 | 110505.82751 | -36.4% |
| query latency p50 (ms) | 8.9 | 8.2 | -7.8% |
| query latency p90 (ms) | 10.1 | 9 | -10.9% |
| query latency p99 (ms) | 11.1 | 10 | -9.9% |

2. Dataset: GIST (960 dimensions, 1M docs)

| | jdk17 | jdk20-simd | diff |
|---|---|---|---|
| total ingest time (ms) | 1114000.82815 | 919003.77045 | -17.5% |
| query latency p50 (ms) | 32.5 | 18.2 | -44.0% |
| query latency p90 (ms) | 36.1 | 20.1 | -44.3% |
| query latency p99 (ms) | 38.8 | 24.5 | -37.3% |

reta commented 1 year ago

@heemin32 I think we would be looking into switching main / 2.x to JDK-21 (due Sep 19th) since JDK-20 is not LTS (or whatever the supported long time release means)

heemin32 commented 1 year ago

@vamshin The JDK-21 GA date is the 19th of Sep. The OpenSearch code freeze date for 2.10.0 is the 5th of Sep. That means we might not be able to enable SIMD for OpenSearch 2.10.0.

reta commented 1 year ago

> That means we might not be able to enable SIMD for OpenSearch 2.10.0.

To note here, the bundled JDK for 2.10.0 would still be JDK-17, but users could try JDK-20 instead at their own risk (although we have not run 2.x on JDK-20 yet, it should work out of the box).

vamshin commented 1 year ago

@reta do you see issues if we bundle jdk-20 by default in 2.10 to take advantage of SIMD out of the box for k-NN users? It would go through all the regular tests we do for the release.

reta commented 1 year ago

> @reta do you see issues if we bundle jdk-20 by default in 2.10 to take advantage of SIMD out of the box for k-NN users? It would go through all the regular tests we do for the release.

yes, there are at least 3 issues here:

Primarily, I think the effort should be spent on JDK-21, taking into account that it is weeks away (not months or years).

vamshin commented 1 year ago

@reta thanks for the details. Looks like we will have to push this to 2.11 then.

sohami commented 1 year ago

@heemin32 I have repurposed this issue to discuss and explore the generic usage of SIMD in OpenSearch, and added the k-NN related tasks as sub-points in the description. Let me know if you have any concerns or if I am missing anything here.

heemin32 commented 1 year ago

I have done more testing of the k-NN feature, and the results are available at https://github.com/opensearch-project/k-NN/issues/1062

ketanv3 commented 12 months ago

Another use case, in the date_histogram aggregation, that I've been exploring: #10392

kkhatua commented 11 months ago

@heemin32 would it be possible to run with JDK21 and update the numbers here, since JDK21 is an LTS release? That'll help us prioritize this.

heemin32 commented 11 months ago

> @heemin32 would it be possible to run with JDK21 and update the numbers here, since JDK21 is an LTS release? That'll help us prioritize this.

I have no bandwidth as of now but will try to get the results with JDK21.

macohen commented 10 months ago

Note that the Vector API is still incubating in JDK 22: https://openjdk.org/jeps/460. It looks like there may be minor revisions to the API. Would we keep this off by default even when it is out of incubation, or will it be safe to turn on once out of incubation? I think the Vector API itself determines what to do with or without SIMD present on the processor.

reta commented 10 months ago

> Would we keep this off by default even when it is out of incubation or will it be safe to turn on once out of incubation?

Apache Lucene has Vector API support so having it on by default has benefits.

macohen commented 10 months ago

> > Would we keep this off by default even when it is out of incubation or will it be safe to turn on once out of incubation?
>
> Apache Lucene has Vector API support so having it on by default has benefits.

100%; still wanted to call out the slight risk that the Vector API is incubating. I do know and get that Lucene is accepting that risk so we are, too...