bitnami / charts

Bitnami Helm Charts
https://bitnami.com

[bitnami/redis] REDIS AI K8's STACK - Here is Extended Bitnami Docker Redis Image with RediSearch, RedisJson and [DEP]RedisAI Contained in the Same Image - Can Be Used With Kubernetes - Help and suggestions on how to distribute this #23950

Closed xtianus79 closed 5 months ago

xtianus79 commented 8 months ago

Name and Version

bitnami/redis

What is the problem this feature will solve?

Alright guys, I got to it. I spent 3 days figuring this all out. Below are the Dockerfile changes; the Dockerfile was the only file I changed.

My motivation is that I want an image that stays true to the Bitnami Redis image, can be used with Kubernetes (K8s), and needs no alteration to the values.yaml file itself. In this way, it is opinionated: it gives you core Redis plus the RediSearch and RedisJSON modules.

I also (feeling adventurous) went after the RedisAI module. While it is there, I would suggest commenting it out, as I think it has officially been deprecated in favor of GenAI and vector search, which is the primary reason for including JSON and Search. With that said, the module is there and it loads, but when testing it against my GPU in WSL I couldn't get it to load a model. So if someone can figure out that part, I wouldn't mind giving it more of a go.

For reference: again, the other parts of the module work, but not this important part.

I have no name!@970fbc429ddd:/$ cat mymodel.pb | redis-cli -x AI.MODELSTORE mymodel TF GPU INPUTS 1 x OUTPUTS 1 StatefulPartitionedCall BLOB
(error) ERR Could not load backend
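
To help separate a GPU-specific problem from a backend that cannot be loaded at all, it may be worth retrying the same command with the CPU device. The sketch below only assembles the command lines and does not talk to Redis. `modelstore_args` is a hypothetical helper; the model name, tensor names, and file are the ones from the command above. It assumes that the `AI.MODELSTORE` device argument accepts `CPU` as well as `GPU`.

```shell
#!/bin/sh
# Sketch only: modelstore_args is a hypothetical helper, not part of RedisAI.

# Print the AI.MODELSTORE arguments for the given device (CPU or GPU).
# Defaults to CPU when no device is passed.
modelstore_args() {
  device="${1:-CPU}"
  printf '%s\n' "AI.MODELSTORE mymodel TF ${device} INPUTS 1 x OUTPUTS 1 StatefulPartitionedCall BLOB"
}

# Print both variants so they can be tried one after the other.
for device in GPU CPU; do
  echo "cat mymodel.pb | redis-cli -x $(modelstore_args "$device")"
done
```

If the CPU variant fails with the same error, the TensorFlow backend itself is probably not being found, independent of the GPU setup.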

You may want other Redis Stack modules such as RedisTimeSeries and RedisBloom. There is good information in this repository's issues that leads you to what you need for RedisTimeSeries.

Importantly, the core motivation for this is that AI, specifically Generative AI, has taken over many development tasks. As a result, I believe Redis and all of its performance would be a perfect fit for RAG (Retrieval Augmented Generation) / Vector search / Search techniques that your project pipelines for AI workloads may need.

Lastly, I want to find a path forward for integrating this with the official Bitnami release for Kubernetes workloads. I am not sure that putting these modules directly in and loading them for everyone is what people want and/or need. What I like about this methodology is that it has a clear separation of concerns and maintains the core Bitnami framework.

  1. Perhaps people want a mechanism to choose which modules they want, so a selection mechanism in either the values file or a command-line reference.
  2. How do we keep the bitnami/redis image in sync with the dependency images needed to load the modules? Should this be a git fork and a Docker (fork), per se? I think we technically only need the Docker fork, as the file in theory should work without much fuss. However, it will be important to make sure the module images load and work with each upgrade cycle over time.
  3. Anything I may be missing or suggestions you may have
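
One way the selection mechanism from item 1 could look is sketched below. This is only an illustration and not something the Bitnami image or chart provides: the variable `REDIS_MODULES` and the function `build_loadmodule_flags` are hypothetical names, and the default module directory is simply the one used in the Dockerfile in this issue. The idea is to turn a comma-separated module list (which a values.yaml entry could pass in as an environment variable) into the `--loadmodule` flags that `REDIS_EXTRA_FLAGS` carries.

```shell
#!/bin/sh
# Sketch only: REDIS_MODULES and build_loadmodule_flags are hypothetical names.

# Build "--loadmodule <dir>/<name>.so" flags from a comma-separated module list.
# $1: comma-separated module names, e.g. "rejson,redisearch"
# $2: directory holding the compiled .so files (optional)
build_loadmodule_flags() {
  modules="$1"
  module_dir="${2:-/opt/bitnami/redis/etc}"
  flags=""
  # Split the list on commas by temporarily changing IFS.
  old_ifs="$IFS"
  IFS=','
  for name in $modules; do
    # Skip empty entries such as the one left by a doubled comma.
    if [ -z "$name" ]; then
      continue
    fi
    flags="${flags:+$flags }--loadmodule ${module_dir}/${name}.so"
  done
  IFS="$old_ifs"
  printf '%s\n' "$flags"
}

# Example: select only the JSON and Search modules.
REDIS_MODULES="${REDIS_MODULES:-rejson,redisearch}"
REDIS_EXTRA_FLAGS="$(build_loadmodule_flags "$REDIS_MODULES")"
echo "REDIS_EXTRA_FLAGS=${REDIS_EXTRA_FLAGS}"
```

With a helper like this in the startup script, the image could ship every module while each deployment loads only the ones it names.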

Notable additions as of 2/27/2024 (not sure yet how to differentiate RedisVL from the langchain RedisVectorStore):

https://redis.io/docs/interact/search-and-query/ <<< RediSearch
https://redis.com/blog/introducing-the-redis-vector-library-for-enhancing-genai-development/ <<< This allows for a Python implementation of RedisVL, which under the hood I think is utilizing RediSearch, so you need that module for this.
https://www.redisvl.com/user_guide/getting_started_01.html
https://www.redisvl.com/user_guide/hash_vs_json_05.html <<< Redis hash vs RedisJSON
https://python.langchain.com/docs/integrations/vectorstores/redis <<< Python vector store and langchain
https://js.langchain.com/docs/integrations/vectorstores/redis <<< for node / ts

RedisVectorStore

Redis is a fast open source, in-memory data store. As part of the Redis Stack, RediSearch is the module that enables vector similarity semantic search, as well as many other types of searching.

https://redis.io/docs/get-started/vector-database/
https://redis.io/docs/interact/search-and-query/advanced-concepts/vectors/
https://redis.io/docs/interact/search-and-query/indexing/
https://redis.io/docs/interact/search-and-query/query/
https://redis.io/docs/interact/search-and-query/advanced-concepts/
https://redis.io/docs/interact/search-and-query/advanced-concepts/query_syntax/

THE CODE: Dockerfile update for Redis-Cluster. Updated the search module to 2.8. Deployment works on Azure AKS. Set permissions. Changed to the opt/bitnami/redis-cluster/etc directory. Updated the image for deploying RediSearch to debian:bullseye.

values.yaml important bits

  configmap: |
    loadmodule /opt/bitnami/redis/etc/rejson.so
    loadmodule /opt/bitnami/redis/etc/redisearch.so

# Stage 1: Build environment for RedisJSON with an appropriate Python version
FROM python:3.9 AS redisjson-build-env

# Install system dependencies and clean up in one layer
RUN apt-get update && \
    apt-get install -y build-essential git curl && \
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* 

# Set PATH for Rust
ENV PATH="/root/.cargo/bin:${PATH}"

# Clone and build RedisJSON
WORKDIR /build/redisjson
RUN git clone --recursive https://github.com/RedisJSON/RedisJSON.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel && \
    ./sbin/setup && \
    cargo build --release

# Verify that librejson.so was built; ls fails the build if the file is missing
RUN echo "Verifying librejson.so build artifacts:" && \
    ls -la /build/redisjson/target/release/librejson.so && \
    echo "librejson.so build completed successfully."

# Stage 2: Build environment for RediSearch
FROM debian:bullseye AS redisearch-build-env

# Install Python, build tools, CMake, and git
RUN apt-get update && \
    apt-get install -y python3 python3-pip python3-venv build-essential libboost-all-dev cmake git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Clone and build RedisSearch
WORKDIR /build/redisearch
RUN git clone --recursive --branch v2.8.12 https://github.com/RediSearch/RediSearch.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel && \
    pip install conan && \
    make

# After building, list the build directory to confirm that redisearch.so exists.
# Adjust the path below if the build artifacts are placed in a different location.
RUN ls -la /build/redisearch/bin/linux-x64-release/search && \
    echo "redisearch.so build artifacts:" && \
    ls -la /build/redisearch/bin/linux-x64-release/search/redisearch.so && \
    echo "redisearch.so build completed successfully."

# Copyright VMware, Inc.
# SPDX-License-Identifier: APACHE-2.0
# Stage 3: Final image (this variant has no RedisAI build stage)

FROM docker.io/bitnami/minideb:bookworm

ARG TARGETARCH

LABEL com.vmware.cp.artifact.flavor="sha256:c50c90cfd9d12b445b011e6ad529f1ad3daea45c26d20b00732fae3cd71f6a83" \
      org.opencontainers.image.base.name="docker.io/bitnami/minideb:bookworm" \
      org.opencontainers.image.created="2024-03-31T19:42:32Z" \
      org.opencontainers.image.description="Application packaged by VMware, Inc" \
      org.opencontainers.image.licenses="Apache-2.0" \
      org.opencontainers.image.ref.name="7.2.4-debian-12-r11" \
      org.opencontainers.image.title="redis-cluster" \
      org.opencontainers.image.vendor="VMware, Inc." \
      org.opencontainers.image.version="7.2.4"

ENV HOME="/" \
    OS_ARCH="${TARGETARCH:-amd64}" \
    OS_FLAVOUR="debian-12" \
    OS_NAME="linux"

COPY prebuildfs /
SHELL ["/bin/bash", "-o", "errexit", "-o", "nounset", "-o", "pipefail", "-c"]
# Install required system packages and dependencies
USER root
RUN apt-get update && \
    apt-get install -y ca-certificates curl libgomp1 libssl3 procps && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Copy compiled modules from the build stages
# COPY --from=redisjson-build-env /build/redisjson/target/release/librejson.so /usr/lib/rejson.so
# COPY --from=redisearch-build-env /build/redisearch/bin/linux-x64-release/search/redisearch.so /usr/lib/redisearch.so

COPY --from=redisjson-build-env /build/redisjson/target/release/librejson.so /opt/bitnami/redis/etc/rejson.so
COPY --from=redisearch-build-env /build/redisearch/bin/linux-x64-release/search/redisearch.so /opt/bitnami/redis/etc/redisearch.so

RUN mkdir -p /tmp/bitnami/pkg/cache/ ; cd /tmp/bitnami/pkg/cache/ ; \
    COMPONENTS=( \
      "wait-for-port-1.0.7-10-linux-${OS_ARCH}-debian-12" \
      "redis-7.2.4-3-linux-${OS_ARCH}-debian-12" \
    ) ; \
    for COMPONENT in "${COMPONENTS[@]}"; do \
      if [ ! -f "${COMPONENT}.tar.gz" ]; then \
        curl -SsLf "https://downloads.bitnami.com/files/stacksmith/${COMPONENT}.tar.gz" -O ; \
        curl -SsLf "https://downloads.bitnami.com/files/stacksmith/${COMPONENT}.tar.gz.sha256" -O ; \
      fi ; \
      sha256sum -c "${COMPONENT}.tar.gz.sha256" ; \
      tar -zxf "${COMPONENT}.tar.gz" -C /opt/bitnami --strip-components=2 --no-same-owner --wildcards '*/files' ; \
      rm -rf "${COMPONENT}".tar.gz{,.sha256} ; \
    done
RUN apt-get autoremove --purge -y curl && \
    apt-get update && apt-get upgrade -y && \
    apt-get clean && rm -rf /var/lib/apt/lists /var/cache/apt/archives
RUN chmod g+rwX /opt/bitnami
RUN find / -perm /6000 -type f -exec chmod a-s {} \; || true

COPY rootfs /
RUN chmod +x /opt/bitnami/scripts/redis-cluster/postunpack.sh && /opt/bitnami/scripts/redis-cluster/postunpack.sh
# RUN /opt/bitnami/scripts/redis-cluster/postunpack.sh
ENV APP_VERSION="7.2.4" \
    BITNAMI_APP_NAME="redis-cluster" \
    PATH="/opt/bitnami/common/bin:/opt/bitnami/redis/bin:$PATH"

# Set modules allowance
RUN echo "enable-module-command yes" >> /opt/bitnami/redis/etc/redis.conf

# ENV REDIS_EXTRA_FLAGS="--loadmodule /usr/lib/rejson.so --loadmodule /usr/lib/redisearch.so"
ENV REDIS_EXTRA_FLAGS="--loadmodule /opt/bitnami/redis/etc/rejson.so --loadmodule /opt/bitnami/redis/etc/redisearch.so"
# ENV REDIS_EXTRA_FLAGS="--loadmodule /usr/lib/rejson.so --loadmodule /usr/lib/redisearch.so --loadmodule /usr/lib/redisai.so"
# ENV REDIS_EXTRA_FLAGS="--loadmodule /usr/lib/rejson.so"

# Set the execute permission for the entrypoint, setup, and run scripts
RUN chmod +x /opt/bitnami/scripts/redis-cluster/entrypoint.sh \
    /opt/bitnami/scripts/redis-cluster/setup.sh \
    /opt/bitnami/scripts/redis-cluster/run.sh

EXPOSE 6379

USER 1001
ENTRYPOINT [ "/opt/bitnami/scripts/redis-cluster/entrypoint.sh" ]
CMD [ "/opt/bitnami/scripts/redis-cluster/run.sh" ]
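
Because the module paths are hard-coded in `REDIS_EXTRA_FLAGS`, a changed `COPY` destination in a later upgrade would only show up when Redis tries to load the modules at startup. A small pre-flight check along the following lines could catch that earlier. This is a sketch under assumptions: `check_module_files` is a hypothetical helper that is not part of the Bitnami scripts, and it only tests that each file named after a `--loadmodule` flag exists, not that Redis can actually load it.

```shell
#!/bin/sh
# Sketch only: check_module_files is a hypothetical helper.

# Report whether each path that follows a --loadmodule flag exists.
# $1: the flag string, e.g. the value of REDIS_EXTRA_FLAGS
# Returns 0 when every referenced file exists, 1 otherwise.
check_module_files() {
  missing=0
  expect_path=0
  # Intentionally unquoted: split the flag string into words.
  for word in $1; do
    if [ "$expect_path" -eq 1 ]; then
      if [ -f "$word" ]; then
        echo "found: $word"
      else
        echo "missing: $word"
        missing=1
      fi
      expect_path=0
    elif [ "$word" = "--loadmodule" ]; then
      expect_path=1
    fi
  done
  return "$missing"
}

# Example: check the flags from the environment, if any are set.
check_module_files "${REDIS_EXTRA_FLAGS:-}" || echo "at least one module file is missing"
```

Run as the last build step or at the top of an entrypoint wrapper, this turns a wrong path into an explicit message naming the missing file.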

Previous Dockerfile (single Redis instance). ***It may have errors where module versions are not pinned. I reduced the file here to only the parts needed. I would refer to the above code and apply it to a single instance if needed.

# Stage 1: Build environment for RedisJSON with an appropriate Python version
FROM python:3.9 AS redisjson-build-env

# Install system dependencies and clean up in one layer
RUN apt-get update && \
    apt-get install -y build-essential git curl && \
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* 

# Set PATH for Rust
ENV PATH="/root/.cargo/bin:${PATH}"

# Clone and build RedisJSON
WORKDIR /build/redisjson
RUN git clone --recursive https://github.com/RedisJSON/RedisJSON.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel && \
    ./sbin/setup && \
    cargo build --release

# Verify that librejson.so was built; ls fails the build if the file is missing
RUN echo "Verifying librejson.so build artifacts:" && \
    ls -la /build/redisjson/target/release/librejson.so && \
    echo "librejson.so build completed successfully."

# Stage 2: Build environment for RediSearch
FROM debian:bullseye AS redisearch-build-env

# Install Python, build tools, CMake, and git
RUN apt-get update && \
    apt-get install -y python3 python3-pip python3-venv build-essential libboost-all-dev cmake git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Clone and build RedisSearch
WORKDIR /build/redisearch
RUN git clone --recursive --branch v2.8.12 https://github.com/RediSearch/RediSearch.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel && \
    pip install conan && \
    make

# After building, list the build directory to confirm that redisearch.so exists.
# Adjust the path below if the build artifacts are placed in a different location.
RUN ls -la /build/redisearch/bin/linux-x64-release/search && \
    echo "redisearch.so build artifacts:" && \
    ls -la /build/redisearch/bin/linux-x64-release/search/redisearch.so && \
    echo "redisearch.so build completed successfully."

# Stage 3: Build environment for RedisAI with GPU support
FROM nvidia/cuda:12.3.1-devel-ubuntu20.04 AS redisai-build-env

# Use DEBIAN_FRONTEND=noninteractive to avoid interactive prompts during build
ENV DEBIAN_FRONTEND=noninteractive

# Install base dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    python3 python3-pip python3-venv \
    git build-essential wget cmake unzip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Copy the cuDNN local repository package
COPY cudnn-local-repo-ubuntu2004-9.0.0_1.0-1_amd64.deb /tmp

# Install the cuDNN local repository package
RUN dpkg -i /tmp/cudnn-local-repo-ubuntu2004-9.0.0_1.0-1_amd64.deb && \
    cp /var/cudnn-local-repo-ubuntu2004-9.0.0/cudnn-*-keyring.gpg /usr/share/keyrings/ && \
    apt-get update && \
    apt-get -y install cudnn-cuda-12

# Set environment variables for dynamic linker
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Reset DEBIAN_FRONTEND to its default value
ENV DEBIAN_FRONTEND=

# [Insert additional installation steps here, such as CUDA if not included in the base image]

# Clone RedisAI repository and setup virtual environment
WORKDIR /build/redisai
RUN git clone --recursive https://github.com/RedisAI/RedisAI.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel

# Build dependencies with GPU support and build RedisAI module
RUN bash get_deps.sh gpu && \
    make -C opt clean ALL=1 && \
    make -C opt GPU=1

# After building, list the expected output directory to confirm the RedisAI build artifacts.
# Adjust the path below if the build artifacts are placed in a different location.
RUN echo "RedisAI build artifacts:" && \
    ls -la /build/redisai/install-gpu

.....

COPY prebuildfs /
SHELL ["/bin/bash", "-o", "errexit", "-o", "nounset", "-o", "pipefail", "-c"]
# Install required system packages and dependencies
USER root
RUN apt-get update && \
    apt-get install -y ca-certificates curl libgomp1 libssl3 procps && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Copy compiled modules from the build stages
COPY --from=redisjson-build-env /build/redisjson/target/release/librejson.so /usr/lib/rejson.so
COPY --from=redisearch-build-env /build/redisearch/bin/linux-x64-release/search/redisearch.so /usr/lib/redisearch.so
COPY --from=redisai-build-env /build/redisai/install-gpu/redisai.so /usr/lib/redisai.so

.....

COPY rootfs /
RUN chmod +x /opt/bitnami/scripts/redis/postunpack.sh && /opt/bitnami/scripts/redis/postunpack.sh
ENV APP_VERSION="7.2.4" \
    BITNAMI_APP_NAME="redis" \
    PATH="/opt/bitnami/common/bin:/opt/bitnami/redis/bin:$PATH"
ENV REDIS_EXTRA_FLAGS="--loadmodule /usr/lib/rejson.so --loadmodule /usr/lib/redisearch.so --loadmodule /usr/lib/redisai.so"
# ENV REDIS_EXTRA_FLAGS="--loadmodule /usr/lib/rejson.so"
.....

After speaking with Redis, I learned there may be a future update where RedisAI can do the embeddings. As of now, you would need other tooling for this. This could become very useful. Remember, as of now this is not necessary, but it is here if you want to go on an adventure and see what you can do with the bits that are there.

For note-taking purposes, here is some of the work and the links needed to get RedisAI working on WSL. This involves installing CUDA for WSL and downloading cuDNN for Linux Ubuntu (the default WSL Linux image). The reason you need WSL is that when you run the CLI from the Docker container for testing, you are actually going through WSL; that's how Docker on Windows works. In short, your Bitnami image is Linux, the RedisAI module is built with Ubuntu, and you need a way to test it the same way on your local machine.

*Note: This applies if you have an Nvidia card such as an RTX 30 series or above (not sure about the 20 series). If you are using only a CPU, you would need to adjust the above code a little and would not need to install the CUDA and cuDNN packages.

Starting here:

https://redis.io/docs/about/about-stack/ >>> notice RedisAI is not currently here, as it was removed, but it may come back
https://oss.redis.com/redisai/quickstart/ && https://oss.redis.com/redisai/ >>> base docs for RedisAI
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#wsl2-system-requirements >>> info for WSL and installing CUDA
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local >>> download WSL Linux x86 CUDA
nvidia-smi >>> checking the GPU against the CUDA installation
https://www.tensorflow.org/install/pip#windows-wsl2_1 >>> for GPU you have to install TensorFlow this way

# For GPU users
pip install tensorflow[and-cuda]
# For CPU users
pip install tensorflow

python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))" >>> for CPU (even if you have a GPU, it will fall back to the CPU if you want it to or if it can't find the GPU)

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))" >>> GPU test

https://redis.io/docs/get-started/vector-database/ >>> Vector database things

**I hope this helps. It was a little tedious getting everything just right, but I got the module loaded and it does use the GPU. Some aspects weren't working, though, such as loading a model with TensorFlow, which is kind of the point. Hopefully Redis takes this back up at some point.

What is the feature you are proposing to solve the problem?

I would like this to be integrated and maintained within the bitnami community.

There should be a way to choose which modules are needed and applied through the values.yaml file.

What alternatives have you considered?

Many over the years and this seems like the best working solution

javsalgar commented 8 months ago

Hi!

Thank you so much for the contribution! I will forward this to the product team, but if I remember correctly there were modules that had license restrictions, which would prevent us from packaging and redistributing them.

xtianus79 commented 7 months ago

> Hi!
>
> Thank you so much for the contribution! I will forward this to the product team but if I remember correctly there were modules that had license restrictions, which would prevent us to package and distribute them.

@javsalgar Ahhhhaaa I caught ya. LOL JK. Soooo that was reason. Ug-huh.

I think you guys are fine. Number 1: I read the license agreement, because it was part of my decision to choose Redis, and the license is clear. It states that you can't sell it as a product itself.

From their documents on licensing

For RSALv2, you may not:

  1. Commercialize the software or provide it to others as a managed service
  2. Remove or obscure any licensing, copyright, or other notices

I don't think that is what you guys are doing

Plus you can choose which license model you want to go with, RSALv2 or SSPLv1.

Also, good news: this pulls the source code and not binaries, which is why I did it that way. Pulling the source maintains that we are not doing things for our own commercial redistribution purposes. And if we / you guys did (or wanted to), as long as it is open source and transparent to the public, that is fine too. So you can choose between 2 options for this distribution: SSPLv1 (winner) or RSALv2 (if you don't have any commercial / managed services around it).

SSPL is a source-available license created by MongoDB, who set out to craft a license that embodied the ideals of open source, allowing free and unrestricted use, modification, and redistribution, with the simple requirement that if you provide the product as a service to others, you must also publicly release any modifications as well as the source code of your management layers under SSPL.

SSPL is based on GPLv3, and is considered a copyleft license. This means that if you use the source code and create derivative works, those derivative works must also be licensed under SSPL and released publicly. For more information, MongoDB has a good FAQ.

Note that SSPL has not been approved by the OSI, and we do not refer to it as an Open Source license.

So all-in-all for the modules I think you're good.

I believe the spirit of this licensing approach is don't be evil/mean :)

For us nerds that are just getting started this is how we learn. The payback is one day when our product has enough customers, enough enterprise need, and we don't have time to maintain infrastructure we will head on over to the cloud offering.

Even better yet, and this has happened at my day job: I love a product so much that I advocate for it in the enterprise setting. I bet that happens a whole lot across IT departments all over the world. "I know this thing, let's use it."

Lastly, I think if anything this falls under the SSPLv1 easily. Effectively no major cloud provider is going to use it in that way because they wouldn't keep it public and transparent of what they are doing. Well played.

xtianus79 commented 7 months ago

@javsalgar Hi, I spoke to Redis and they said that if you wanted to ask any questions or speak with their team, they would be glad to set that up.

xtianus79 commented 6 months ago

Updated to specific version for RediSearch

# Stage 2: Build environment for RediSearch
FROM debian:bullseye AS redisearch-build-env

# Install Python, build tools, CMake, and git
RUN apt-get update && \
    apt-get install -y python3 python3-pip python3-venv build-essential libboost-all-dev cmake git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Clone and build RedisSearch
WORKDIR /build/redisearch
RUN git clone --recursive --branch v2.8.12 https://github.com/RediSearch/RediSearch.git . && \
    python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip setuptools wheel && \
    pip install conan && \
    make
...

USER root
RUN apt-get update && \
    apt-get install -y ca-certificates curl libgomp1 libssl3 procps && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

...

RUN chmod +x /opt/bitnami/scripts/redis/postunpack.sh && /opt/bitnami/scripts/redis/postunpack.sh

xtianus79 commented 6 months ago

@javsalgar hi. I wanted to ask are you guys still going to be able to distribute redis 2.4+?

javsalgar commented 6 months ago

Hi!

Yes, the new license allows us to redistribute it.

carrodher commented 5 months ago

Hi, unfortunately after an internal review, and due to other priorities, this solution was not considered to be added to the catalog in the short/mid-term.

We apologize for the inconvenience. We will be sure to reconsider it in the future.

Thanks for your suggestion!