ArcadeData / arcadedb

ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vector Embeddings.
https://arcadedb.com
Apache License 2.0

Begin to document areas where we can take advantage of Java's Vector API #53

Open thadguidry opened 3 years ago

thadguidry commented 3 years ago

Required viewing (video): https://www.youtube.com/watch?v=VYo3p4R66N8

One of the goals of ArcadeDB is to maximize data parallelism using various low-level Java techniques and design paradigms. Data parallelism can likely be improved further on CPUs such as Intel and ARM, and possibly others later (Apple Mx?), by using the Java Vector API (currently in its second incubator release).

Additionally, the JVM already provides and uses many intrinsic functions. In the future, some of these will be rewritten to take direct advantage of the new Vector API, as discussed in the video.

It would be good to use this issue to start documenting areas of the code base that are likely starting points for exploration, where data parallelism could take extra advantage of the hardware.

lvca commented 3 years ago

Hi @thadguidry, very interesting. Not sure if we can use it; I see the Vector API more for math computation. We could use something faster than Unsafe to work at the array level for lookups in the LSM index, but I'm not sure the Vector API can help with that.

thadguidry commented 3 years ago

It's not only for Math... Strings as well... you just need to poke around: https://www.researchgate.net/publication/220811600_Accelerating_search_and_recognition_workloads_with_SSE_42_string_and_text_processing_instructions and https://www.slideshare.net/RednaxelaFX/green-teajug-hotspotintrinsics02232013

On Intel CPUs with SSE4.2 support, here are, for example, the String Compare intrinsics mentioned in the first publication: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=SSE4_2&cats=String%20Compare

lvca commented 3 years ago

I didn't know about intrinsics! Yes, we could totally use them for comparisons in indexes and for string comparison in general.
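
To make the idea of leaning on intrinsics for index comparisons concrete, here is a minimal sketch in plain Java. It assumes index keys are stored as serialized byte arrays and compared byte by byte; the class `ByteKeyCompare` and its methods are hypothetical and not part of ArcadeDB. It relies only on `Arrays.compareUnsigned` and `Arrays.mismatch` (Java 9+), which, as far as I understand, recent HotSpot builds back with an intrinsified, vectorized mismatch routine on supported CPUs. Whether that actually beats a hand-written loop for a given workload would need benchmarking.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, NOT ArcadeDB code: compares serialized index keys as raw bytes.
final class ByteKeyCompare {

  // Lexicographic comparison treating each byte as unsigned (0..255).
  static int compare(byte[] a, byte[] b) {
    return Arrays.compareUnsigned(a, b);
  }

  // Number of leading bytes the two keys share.
  static int commonPrefixLength(byte[] a, byte[] b) {
    int i = Arrays.mismatch(a, b); // -1 means the arrays are identical
    return i < 0 ? a.length : i;
  }

  // Binary search over keys sorted with compare().
  // Returns the index of the key, or -(insertionPoint) - 1 if it is absent.
  static int search(byte[][] sortedKeys, byte[] key) {
    int lo = 0;
    int hi = sortedKeys.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      int c = compare(sortedKeys[mid], key);
      if (c < 0) {
        lo = mid + 1;
      } else if (c > 0) {
        hi = mid - 1;
      } else {
        return mid;
      }
    }
    return -(lo + 1);
  }

  private static byte[] utf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Illustrative keys only; already in sorted order.
    byte[][] keys = { utf8("apple"), utf8("banana"), utf8("cherry") };
    System.out.println(search(keys, utf8("banana")));
    System.out.println(search(keys, utf8("blueberry")));
    System.out.println(commonPrefixLength(utf8("banana"), utf8("bandana")));
  }
}
```

The design choice here is to stay on standard library calls that the JIT can already optimize, so nothing depends on an incubator module or on `Unsafe`.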

lvca commented 1 year ago

We continued this topic in https://github.com/ArcadeData/arcadedb/discussions/995. We are waiting for the release of the first vector model for ArcadeDB; support for the Vector API could then live in a separate Java 21+ module.