h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

llama-cpp-python errors / Better Documentation #565

Closed · JettScythe closed this issue 1 year ago

JettScythe commented 1 year ago

Following the Linux guide, but it's incomplete: https://github.com/h2oai/h2ogpt/blob/main/docs/README_LINUX.md

conda create -n h2ogpt -y
conda activate h2ogpt
mamba install python=3.10 -c conda-forge -y
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 23.1.0
  latest version: 23.7.2

Please update conda by running

    $ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

     conda install conda=23.7.2

## Package Plan ##

  environment location: /home/jettscythe/miniconda3/envs/h2ogpt

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate h2ogpt
#
# To deactivate an active environment, use
#
#     $ conda deactivate

zsh: command not found: mamba
JettScythe commented 1 year ago

Fixed with: conda install -c conda-forge mamba

Now I'm hung up on building llama-cpp-python with GPU support.

Steps I've taken:

mamba install python=3.10 -c conda-forge -y

conda install cudatoolkit-dev -c conda-forge -y
export CUDA_HOME=$CONDA_PREFIX 
python --version
Python 3.10.12
which python
/home/jettscythe/miniconda3/envs/h2ogpt/bin/python
which pip 
/home/jettscythe/miniconda3/envs/h2ogpt/bin/pip
which pip3 
/home/jettscythe/miniconda3/envs/h2ogpt/bin/pip3
export CUDA_HOME=$CONDA_PREFIX
echo $CUDA_HOME
/home/jettscythe/miniconda3/envs/h2ogpt
pip uninstall -y pandoc pypandoc pypandoc-binary
WARNING: Skipping pandoc as it is not installed.
WARNING: Skipping pypandoc as it is not installed.
WARNING: Skipping pypandoc-binary as it is not installed.

pip install -r requirements.txt --extra-index https://download.pytorch.org/whl/cu117
yay tesseract
python -m nltk.downloader all
pip uninstall -y auto-gptq ; GITHUB_ACTIONS=true pip install auto-gptq --no-cache-dir
pip uninstall -y exllama ; pip install https://github.com/jllllll/exllama/releases/download/0.0.8/exllama-0.0.8+cu117-cp310-cp310-linux_x86_64.whl --no-cache-dir

then, first error after

sp=`python -c 'import site; print(site.getsitepackages()[0])'`
sed -i 's/posthog\.capture/return\n            posthog.capture/' $sp/chromadb/telemetry/posthog.py

is "sed: can't read /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/chromadb/telemetry/posthog.py: No such file or directory"

After that I run

pip uninstall -y llama-cpp-python
export LLAMA_CUBLAS=1
export CMAKE_ARGS=-DLLAMA_CUBLAS=on
export FORCE_CMAKE=1
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose

which gives this log / error

Using pip 23.2.1 from /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip (python 3.10)
Collecting llama-cpp-python==0.1.68
  Downloading llama_cpp_python-0.1.68.tar.gz (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 7.8 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Obtaining dependency information for setuptools>=42 from https://files.pythonhosted.org/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl.metadata
    Using cached setuptools-68.0.0-py3-none-any.whl.metadata (6.4 kB)
  Collecting scikit-build>=0.13
    Obtaining dependency information for scikit-build>=0.13 from https://files.pythonhosted.org/packages/fa/af/b3ef8fe0bb96bf7308e1f9d196fc069f0c75d9c74cfaad851e418cc704f4/scikit_build-0.17.6-py3-none-any.whl.metadata
    Using cached scikit_build-0.17.6-py3-none-any.whl.metadata (14 kB)
  Collecting cmake>=3.18
    Obtaining dependency information for cmake>=3.18 from https://files.pythonhosted.org/packages/14/b8/06f8fdc4687af3d3d8d95461d97737df2f144acd28eff65a3c47c29d0152/cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
    Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Obtaining dependency information for wheel>=0.32.0 from https://files.pythonhosted.org/packages/17/11/f139e25018ea2218aeedbedcf85cd0dd8abeed29a38ac1fda7f5a8889382/wheel-0.41.0-py3-none-any.whl.metadata
    Using cached wheel-0.41.0-py3-none-any.whl.metadata (2.2 kB)
  Using cached setuptools-68.0.0-py3-none-any.whl (804 kB)
  Using cached scikit_build-0.17.6-py3-none-any.whl (84 kB)
  Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
  Using cached wheel-0.41.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.27.0 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.6 setuptools-68.0.0 tomli-2.0.1 wheel-0.41.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to llama_cpp_python.egg-info/requires.txt
  writing top-level names to llama_cpp_python.egg-info/top_level.txt
  reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating /tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info
  writing /tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file '/tmp/pip-modern-metadata-05v6mf65/llama_cpp_python.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-05v6mf65/llama_cpp_python-0.1.68.dist-info'
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (1.24.3)
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.1.68)
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 kB 159.4 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)

  --------------------------------------------------------------------------------
  -- Trying 'Ninja' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.

  Not searching for unused variables given on the command line.

  -- The C compiler identification is GNU 13.1.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /sbin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is GNU 13.1.1
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /sbin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (0.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893/_skbuild/linux-x86_64-3.10/cmake-build
    Command:
      /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on

  Not searching for unused variables given on the command line.
  -- The C compiler identification is GNU 13.1.1
  -- The CXX compiler identification is GNU 13.1.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /sbin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /sbin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /sbin/git (found version "2.41.0")
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:114 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Found CUDAToolkit: /home/jettscythe/miniconda3/envs/h2ogpt/include (found version "11.7.64")
  -- cuBLAS found
  CMake Error at /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:756 (message):
    Compiling the CUDA compiler identification source file
    "CMakeCUDACompilerId.cu" failed.

    Compiler: /home/jettscythe/miniconda3/envs/h2ogpt/bin/nvcc

    Build flags:

    Id flags: --keep;--keep-dir;tmp -v

    The output was:

    1

    #$ _NVVM_BRANCH_=nvvm

    #$ _SPACE_=

    #$ _CUDART_=cudart

    #$ _HERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _THERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _TARGET_SIZE_=

    #$ _TARGET_DIR_=

    #$ _TARGET_SIZE_=64

    #$ TOP=/home/jettscythe/miniconda3/envs/h2ogpt/bin/..

    #$
    NVVMIR_LIBRARY_DIR=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/libdevice

    #$ LD_LIBRARY_PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../lib:

    #$
    PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/tmp/pip-build-env-qzns_bbq/overlay/bin:/tmp/pip-build-env-qzns_bbq/normal/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/home/jettscythe/miniconda3/condabin:/usr/share/pyenv/plugins/pyenv-virtualenv/shims:/home/jettscythe/.pyenv/shims:/opt/flutter/bin:/home/jettscythe/.local/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/jettscythe/.local/share/gem/ruby/3.0.0/bin

    #$ INCLUDES="-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"

    #$ LIBRARIES=
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64/stubs"
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64"

    #$ CUDAFE_FLAGS=

    #$ PTXAS_FLAGS=

    #$ rm tmp/a_dlink.reg.c

    #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++
    -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__
    "-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"
    -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=7
    -D__CUDACC_VER_BUILD__=64 -D__CUDA_API_VER_MAJOR__=11
    -D__CUDA_API_VER_MINOR__=7 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
    "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
    "tmp/CMakeCUDACompilerId.cpp1.ii"

    In file included from
    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/cuda_runtime.h:83,

                     from <command-line>:

    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/crt/host_config.h:132:2:
    error: #error -- unsupported GNU version! gcc versions later than 11 are
    not supported! The nvcc flag '-allow-unsupported-compiler' can be used to
    override this version check; however, using an unsupported host compiler
    may cause compilation failure or incorrect run time execution.  Use at your
    own risk.

      132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
          |  ^~~~~

    # --error 0x1 --

  Call Stack (most recent call first):
    /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
    /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
    /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
    vendor/llama.cpp/CMakeLists.txt:244 (enable_language)

  -- Configuring incomplete, errors occurred!
  Traceback (most recent call last):
    File "/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 666, in setup
      env = cmkr.configure(
    File "/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 357, in configure
      raise SKBuildError(msg)

  An error occurred while configuring with CMake.
    Command:
      /tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-qzns_bbq/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on
    Source directory:
      /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893
    Working directory:
      /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893/_skbuild/linux-x86_64-3.10/cmake-build
  Please see CMake's output for more information.

  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpnq0oa956
  cwd: /tmp/pip-install-a_92ggee/llama-cpp-python_bca0aebdade24f0b8945bc89e83de893
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
pseudotensor commented 1 year ago

then, first error after

sp=`python -c 'import site; print(site.getsitepackages()[0])'`
sed -i 's/posthog\.capture/return\n            posthog.capture/' $sp/chromadb/telemetry/posthog.py
is "sed: can't read /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/chromadb/telemetry/posthog.py: No such file or directory"

The documentation assumes your miniconda is installed into the standard location. You can check where your miniconda is, and we can probably generalize the command a bit.
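
For example, a sketch that guards on the file actually existing, so the step degrades gracefully when chromadb isn't installed (the posthog_file variable name is just for illustration):

sp=$(python -c 'import site; print(site.getsitepackages()[0])')
posthog_file="$sp/chromadb/telemetry/posthog.py"
if [ -f "$posthog_file" ]; then
    sed -i 's/posthog\.capture/return\n            posthog.capture/' "$posthog_file"
else
    echo "chromadb not found under $sp; did the requirements install succeed?" >&2
fi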

pseudotensor commented 1 year ago

Following the Linux guide, but it's incomplete: https://github.com/h2oai/h2ogpt/blob/main/docs/README_LINUX.md

conda create -n h2ogpt -y
conda activate h2ogpt
mamba install python=3.10 -c conda-forge -y
[...]
zsh: command not found: mamba

I've never seen that error w.r.t. mamba. Nominally the guide is accurate.

I don't really understand this error. What is triggering zsh in the first place?

pseudotensor commented 1 year ago

    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/crt/host_config.h:132:2:
    error: #error -- unsupported GNU version! gcc versions later than 11 are
    not supported! The nvcc flag '-allow-unsupported-compiler' can be used to
    override this version check; however, using an unsupported host compiler
    may cause compilation failure or incorrect run time execution.  Use at your
    own risk.

      132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
          |  ^~~~~

    # --error 0x1 --

What version of Linux are you using? I haven't seen any issue w.r.t. C compiler version on Ubuntu 18, 20, or 22.

JettScythe commented 1 year ago

Following the Linux guide, but it's incomplete: https://github.com/h2oai/h2ogpt/blob/main/docs/README_LINUX.md
[...]
zsh: command not found: mamba

I've never seen that error w.r.t. mamba. Nominally the guide is accurate.

I don't really understand this error. What is triggering zsh in the first place?

zsh is my default shell.

miniconda is installed here: /home/jettscythe/miniconda3/

Running Arch Linux.

OS: Arch Linux x86_64
Host: 21D6004VUS (ThinkPad P16 Gen 1)
Kernel: 6.4.6-arch1-1
Shell: zsh 5.9
CPU: 12th Gen Intel(R) Core(TM) i9-12900HX (24) @ 4.9 GHz
GPU: NVIDIA RTX A3000 12GB Laptop GPU

JettScythe commented 1 year ago
gcc --version
gcc (GCC) 13.1.1 20230714

explains

error: #error -- unsupported GNU version! gcc versions later than 11 are
    not supported! The nvcc flag '-allow-unsupported-compiler' can be used to
    override this version check; however, using an unsupported host compiler
    may cause compilation failure or incorrect run time execution.  Use at your
    own risk.

TBH, this is my first time using Conda. Is this (the gcc version) something I can change in the env?

pseudotensor commented 1 year ago

Ah, on mamba: just a typo in the docs. mamba is faster than conda for installing packages, but I left this line in without a step that installs mamba first:

  mamba install python=3.10 -c conda-forge -y

Will fix the docs, thanks.
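
Presumably the fixed step will look something like this (a sketch; installing mamba into the base env before the guide uses it):

conda install -n base -c conda-forge mamba -y
conda create -n h2ogpt -y
conda activate h2ogpt
mamba install python=3.10 -c conda-forge -y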

pseudotensor commented 1 year ago

For gcc version, please try this:

https://www.yodiw.com/solve-unsupported-gnu-version-gcc-versions-later-than-11-are-not-supported/

i.e.

MAX_GCC_VERSION=11

sudo apt install gcc-$MAX_GCC_VERSION g++-$MAX_GCC_VERSION
# link them as plain gcc/g++ so nvcc (which invokes gcc by name) picks them up
sudo ln -s /usr/bin/gcc-$MAX_GCC_VERSION /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-$MAX_GCC_VERSION /usr/local/cuda/bin/g++

or something like that.

JettScythe commented 1 year ago

Building gcc11 now. Will report back. Thanks!

pseudotensor commented 1 year ago

An alternative is:

    MAX_GCC_VERSION=11
    sudo apt install gcc-$MAX_GCC_VERSION g++-$MAX_GCC_VERSION
    sudo update-alternatives --config gcc
    # pick version 11
    sudo update-alternatives --config g++
    # pick version 11
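
(Note: update-alternatives only offers choices that were registered beforehand, which is likely why a fresh install shows none; a sketch of registering gcc-11 first, where the priority value 110 is an arbitrary choice:)

    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-$MAX_GCC_VERSION 110
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-$MAX_GCC_VERSION 110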
JettScythe commented 1 year ago

Just a couple of notes for other Arch users who may come across this issue:

You can install older gcc versions from the AUR, but there are no binary packages readily available, which means you will need to build from source. You can speed this up with parallel compilation: edit /etc/makepkg.conf and add MAKEFLAGS="-j$(nproc)" (in my case around line 49).
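
For example, a one-liner sketch, assuming the stock makepkg.conf still has the commented-out default MAKEFLAGS line:

sudo sed -i 's/^#MAKEFLAGS=.*/MAKEFLAGS="-j$(nproc)"/' /etc/makepkg.conf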

Debian/Ubuntu's update-alternatives package is also available in the AUR.

Still building on my end right now, so I'm not entirely sure this is a fix yet, but it seems promising.

JettScythe commented 1 year ago

Still no luck. update-alternatives doesn't pick up my other gcc versions, and despite the following commands, it looks like nvcc still uses the system gcc/g++

pip uninstall -y llama-cpp-python
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
export LLAMA_CUBLAS=1     
export CMAKE_ARGS=-DLLAMA_CUBLAS=on
export FORCE_CMAKE=1
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose

has the output:


WARNING: Skipping llama-cpp-python as it is not installed.
Using pip 23.2.1 from /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip (python 3.10)
Collecting llama-cpp-python==0.1.68
  Downloading llama_cpp_python-0.1.68.tar.gz (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 8.5 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Obtaining dependency information for setuptools>=42 from https://files.pythonhosted.org/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl.metadata
    Using cached setuptools-68.0.0-py3-none-any.whl.metadata (6.4 kB)
  Collecting scikit-build>=0.13
    Obtaining dependency information for scikit-build>=0.13 from https://files.pythonhosted.org/packages/fa/af/b3ef8fe0bb96bf7308e1f9d196fc069f0c75d9c74cfaad851e418cc704f4/scikit_build-0.17.6-py3-none-any.whl.metadata
    Using cached scikit_build-0.17.6-py3-none-any.whl.metadata (14 kB)
  Collecting cmake>=3.18
    Obtaining dependency information for cmake>=3.18 from https://files.pythonhosted.org/packages/14/b8/06f8fdc4687af3d3d8d95461d97737df2f144acd28eff65a3c47c29d0152/cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
    Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Obtaining dependency information for wheel>=0.32.0 from https://files.pythonhosted.org/packages/17/11/f139e25018ea2218aeedbedcf85cd0dd8abeed29a38ac1fda7f5a8889382/wheel-0.41.0-py3-none-any.whl.metadata
    Using cached wheel-0.41.0-py3-none-any.whl.metadata (2.2 kB)
  Using cached setuptools-68.0.0-py3-none-any.whl (804 kB)
  Using cached scikit_build-0.17.6-py3-none-any.whl (84 kB)
  Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
  Using cached wheel-0.41.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.27.0 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.6 setuptools-68.0.0 tomli-2.0.1 wheel-0.41.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to llama_cpp_python.egg-info/requires.txt
  writing top-level names to llama_cpp_python.egg-info/top_level.txt
  reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating /tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info
  writing /tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file '/tmp/pip-modern-metadata-k6sitm32/llama_cpp_python.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-k6sitm32/llama_cpp_python-0.1.68.dist-info'
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (1.24.3)
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.1.68)
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 kB 60.8 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)

  --------------------------------------------------------------------------------
  -- Trying 'Ninja' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.

  Not searching for unused variables given on the command line.

  -- The C compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (0.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77/_skbuild/linux-x86_64-3.10/cmake-build
    Command:
      /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on

  Not searching for unused variables given on the command line.
  -- The C compiler identification is GNU 11.4.0
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /sbin/git (found version "2.41.0")
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:114 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Found CUDAToolkit: /home/jettscythe/miniconda3/envs/h2ogpt/include (found version "11.7.64")
  -- cuBLAS found
  CMake Error at /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:756 (message):
    Compiling the CUDA compiler identification source file
    "CMakeCUDACompilerId.cu" failed.

    Compiler: /home/jettscythe/miniconda3/envs/h2ogpt/bin/nvcc

    Build flags:

    Id flags: --keep;--keep-dir;tmp -v

    The output was:

    1

    #$ _NVVM_BRANCH_=nvvm

    #$ _SPACE_=

    #$ _CUDART_=cudart

    #$ _HERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _THERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _TARGET_SIZE_=

    #$ _TARGET_DIR_=

    #$ _TARGET_SIZE_=64

    #$ TOP=/home/jettscythe/miniconda3/envs/h2ogpt/bin/..

    #$
    NVVMIR_LIBRARY_DIR=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/libdevice

    #$ LD_LIBRARY_PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../lib:

    #$
    PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/tmp/pip-build-env-dquyh1ty/overlay/bin:/tmp/pip-build-env-dquyh1ty/normal/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/home/jettscythe/miniconda3/condabin:/usr/share/pyenv/plugins/pyenv-virtualenv/shims:/home/jettscythe/.pyenv/shims:/opt/flutter/bin:/home/jettscythe/.local/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/jettscythe/.local/share/gem/ruby/3.0.0/bin

    #$ INCLUDES="-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"

    #$ LIBRARIES=
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64/stubs"
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64"

    #$ CUDAFE_FLAGS=

    #$ PTXAS_FLAGS=

    #$ rm tmp/a_dlink.reg.c

    #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++
    -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__
    "-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"
    -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=7
    -D__CUDACC_VER_BUILD__=64 -D__CUDA_API_VER_MAJOR__=11
    -D__CUDA_API_VER_MINOR__=7 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
    "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
    "tmp/CMakeCUDACompilerId.cpp1.ii"

    In file included from
    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/cuda_runtime.h:83,

                     from <command-line>:

    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/crt/host_config.h:132:2:
    error: #error -- unsupported GNU version! gcc versions later than 11 are
    not supported! The nvcc flag '-allow-unsupported-compiler' can be used to
    override this version check; however, using an unsupported host compiler
    may cause compilation failure or incorrect run time execution.  Use at your
    own risk.

      132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
          |  ^~~~~

    # --error 0x1 --

  Call Stack (most recent call first):
    /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
    /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
    /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
    vendor/llama.cpp/CMakeLists.txt:244 (enable_language)

  -- Configuring incomplete, errors occurred!
  Traceback (most recent call last):
    File "/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 666, in setup
      env = cmkr.configure(
    File "/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 357, in configure
      raise SKBuildError(msg)

  An error occurred while configuring with CMake.
    Command:
      /tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-dquyh1ty/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on
    Source directory:
      /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77
    Working directory:
      /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77/_skbuild/linux-x86_64-3.10/cmake-build
  Please see CMake's output for more information.

  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmprp3ucc2o
  cwd: /tmp/pip-install-kca2jqpg/llama-cpp-python_a803465550f64933a69bd3f7536faa77
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
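
(A note on the log above: CC/CXX did switch CMake's C and C++ compilers to gcc-11, as the "Check for working C compiler: /usr/bin/gcc-11" lines show, but nvcc's host pass still invokes plain gcc from PATH, which resolves to the system gcc 13. A sketch of forcing the CUDA host compiler instead; that CMake's compiler-identification step honors CUDAHOSTCXX / CMAKE_CUDA_HOST_COMPILER here is an assumption:)

export CUDAHOSTCXX=/usr/bin/g++-11
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_HOST_COMPILER=/usr/bin/g++-11" \
    FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose
# last resort, per the nvcc error text itself (if your nvcc honors NVCC_PREPEND_FLAGS):
# export NVCC_PREPEND_FLAGS='-allow-unsupported-compiler'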
JettScythe commented 1 year ago

I've also tried

sudo ln -sfn /usr/bin/gcc-11 /opt/cuda/bin/gcc
sudo ln -sfn /usr/bin/g++-11 /opt/cuda/bin/g++

and

pip uninstall -y llama-cpp-python
export CC=/usr/bin/gcc-11
export CXX=/usr/bin/g++-11
export LLAMA_CUBLAS=1
export CMAKE_ARGS=-DLLAMA_CUBLAS=on
export FORCE_CMAKE=1
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 CC=/usr/bin/gcc-11 CXX=/usr/bin/g++-11 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose

as well as a bunch of variations of this. All with the same output:

WARNING: Skipping llama-cpp-python as it is not installed.
Using pip 23.2.1 from /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip (python 3.10)
Collecting llama-cpp-python==0.1.68
  Downloading llama_cpp_python-0.1.68.tar.gz (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 7.7 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Obtaining dependency information for setuptools>=42 from https://files.pythonhosted.org/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl.metadata
    Using cached setuptools-68.0.0-py3-none-any.whl.metadata (6.4 kB)
  Collecting scikit-build>=0.13
    Obtaining dependency information for scikit-build>=0.13 from https://files.pythonhosted.org/packages/fa/af/b3ef8fe0bb96bf7308e1f9d196fc069f0c75d9c74cfaad851e418cc704f4/scikit_build-0.17.6-py3-none-any.whl.metadata
    Using cached scikit_build-0.17.6-py3-none-any.whl.metadata (14 kB)
  Collecting cmake>=3.18
    Obtaining dependency information for cmake>=3.18 from https://files.pythonhosted.org/packages/14/b8/06f8fdc4687af3d3d8d95461d97737df2f144acd28eff65a3c47c29d0152/cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
    Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Obtaining dependency information for wheel>=0.32.0 from https://files.pythonhosted.org/packages/17/11/f139e25018ea2218aeedbedcf85cd0dd8abeed29a38ac1fda7f5a8889382/wheel-0.41.0-py3-none-any.whl.metadata
    Using cached wheel-0.41.0-py3-none-any.whl.metadata (2.2 kB)
  Using cached setuptools-68.0.0-py3-none-any.whl (804 kB)
  Using cached scikit_build-0.17.6-py3-none-any.whl (84 kB)
  Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
  Using cached wheel-0.41.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.27.0 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.6 setuptools-68.0.0 tomli-2.0.1 wheel-0.41.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to llama_cpp_python.egg-info/requires.txt
  writing top-level names to llama_cpp_python.egg-info/top_level.txt
  reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating /tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info
  writing /tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file '/tmp/pip-modern-metadata-gerqohs0/llama_cpp_python.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-gerqohs0/llama_cpp_python-0.1.68.dist-info'
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (1.24.3)
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.1.68)
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 kB 41.0 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)

  --------------------------------------------------------------------------------
  -- Trying 'Ninja' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.

  Not searching for unused variables given on the command line.

  -- The C compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (0.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed/_skbuild/linux-x86_64-3.10/cmake-build
    Command:
      /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on

  Not searching for unused variables given on the command line.
  -- The C compiler identification is GNU 11.4.0
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /sbin/git (found version "2.41.0")
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:114 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Found CUDAToolkit: /home/jettscythe/miniconda3/envs/h2ogpt/include (found version "11.7.64")
  -- cuBLAS found
  CMake Error at /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:756 (message):
    Compiling the CUDA compiler identification source file
    "CMakeCUDACompilerId.cu" failed.

    Compiler: /home/jettscythe/miniconda3/envs/h2ogpt/bin/nvcc

    Build flags:

    Id flags: --keep;--keep-dir;tmp -v

    The output was:

    1

    #$ _NVVM_BRANCH_=nvvm

    #$ _SPACE_=

    #$ _CUDART_=cudart

    #$ _HERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _THERE_=/home/jettscythe/miniconda3/envs/h2ogpt/bin

    #$ _TARGET_SIZE_=

    #$ _TARGET_DIR_=

    #$ _TARGET_SIZE_=64

    #$ TOP=/home/jettscythe/miniconda3/envs/h2ogpt/bin/..

    #$
    NVVMIR_LIBRARY_DIR=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/libdevice

    #$ LD_LIBRARY_PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../lib:

    #$
    PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/../nvvm/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/tmp/pip-build-env-hl1uo_1g/overlay/bin:/tmp/pip-build-env-hl1uo_1g/normal/bin:/home/jettscythe/miniconda3/envs/h2ogpt/bin:/home/jettscythe/miniconda3/condabin:/usr/share/pyenv/plugins/pyenv-virtualenv/shims:/home/jettscythe/.pyenv/shims:/opt/flutter/bin:/home/jettscythe/.local/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/jettscythe/.local/share/gem/ruby/3.0.0/bin

    #$ INCLUDES="-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"

    #$ LIBRARIES=
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64/stubs"
    "-L/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//lib64"

    #$ CUDAFE_FLAGS=

    #$ PTXAS_FLAGS=

    #$ rm tmp/a_dlink.reg.c

    #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++
    -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__
    "-I/home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include"
    -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=7
    -D__CUDACC_VER_BUILD__=64 -D__CUDA_API_VER_MAJOR__=11
    -D__CUDA_API_VER_MINOR__=7 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
    "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
    "tmp/CMakeCUDACompilerId.cpp1.ii"

    In file included from
    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/cuda_runtime.h:83,

                     from <command-line>:

    /home/jettscythe/miniconda3/envs/h2ogpt/bin/..//include/crt/host_config.h:132:2:
    error: #error -- unsupported GNU version! gcc versions later than 11 are
    not supported! The nvcc flag '-allow-unsupported-compiler' can be used to
    override this version check; however, using an unsupported host compiler
    may cause compilation failure or incorrect run time execution.  Use at your
    own risk.

      132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
          |  ^~~~~

    # --error 0x1 --

  Call Stack (most recent call first):
    /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
    /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
    /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
    vendor/llama.cpp/CMakeLists.txt:244 (enable_language)

  -- Configuring incomplete, errors occurred!
  Traceback (most recent call last):
    File "/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 666, in setup
      env = cmkr.configure(
    File "/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 357, in configure
      raise SKBuildError(msg)

  An error occurred while configuring with CMake.
    Command:
      /tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-hl1uo_1g/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on
    Source directory:
      /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed
    Working directory:
      /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed/_skbuild/linux-x86_64-3.10/cmake-build
  Please see CMake's output for more information.

  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpqx1fwmje
  cwd: /tmp/pip-install-k0l05hdp/llama-cpp-python_1ba2f3aaf58a4263bdce411d45c600ed
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
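
The key line is the #error above: CUDA 11.7's nvcc refuses any host gcc newer than 11, and Arch ships a newer gcc by default. Two ways out, the first of which is what the next comment tries (the second is nvcc's own escape hatch and is untested here):

# Option 1: pin a supported host compiler inside the conda env
conda install -c conda-forge gcc=11.4 gxx=11.4

# Option 2 (untested here): let nvcc accept the newer host gcc via CMake
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_FLAGS=-allow-unsupported-compiler" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose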
pseudotensor commented 1 year ago

Thanks for trying. @achraf-mer, any idea how to deal with this?

JettScythe commented 1 year ago

I'm trying with conda install -c conda-forge gcc=11.4 and conda install -c conda-forge gxx=11.4, which looks like it's making some progress.

New log:

WARNING: Skipping llama-cpp-python as it is not installed.
Using pip 23.2.1 from /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip (python 3.10)
Collecting llama-cpp-python==0.1.68
  Downloading llama_cpp_python-0.1.68.tar.gz (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 2.1 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Obtaining dependency information for setuptools>=42 from https://files.pythonhosted.org/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl.metadata
    Using cached setuptools-68.0.0-py3-none-any.whl.metadata (6.4 kB)
  Collecting scikit-build>=0.13
    Obtaining dependency information for scikit-build>=0.13 from https://files.pythonhosted.org/packages/fa/af/b3ef8fe0bb96bf7308e1f9d196fc069f0c75d9c74cfaad851e418cc704f4/scikit_build-0.17.6-py3-none-any.whl.metadata
    Using cached scikit_build-0.17.6-py3-none-any.whl.metadata (14 kB)
  Collecting cmake>=3.18
    Obtaining dependency information for cmake>=3.18 from https://files.pythonhosted.org/packages/14/b8/06f8fdc4687af3d3d8d95461d97737df2f144acd28eff65a3c47c29d0152/cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
    Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Obtaining dependency information for wheel>=0.32.0 from https://files.pythonhosted.org/packages/17/11/f139e25018ea2218aeedbedcf85cd0dd8abeed29a38ac1fda7f5a8889382/wheel-0.41.0-py3-none-any.whl.metadata
    Using cached wheel-0.41.0-py3-none-any.whl.metadata (2.2 kB)
  Using cached setuptools-68.0.0-py3-none-any.whl (804 kB)
  Using cached scikit_build-0.17.6-py3-none-any.whl (84 kB)
  Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
  Using cached wheel-0.41.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.27.0 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.6 setuptools-68.0.0 tomli-2.0.1 wheel-0.41.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to llama_cpp_python.egg-info/requires.txt
  writing top-level names to llama_cpp_python.egg-info/top_level.txt
  reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating /tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info
  writing /tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file '/tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-wvsi8ckc/llama_cpp_python-0.1.68.dist-info'
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages (from llama-cpp-python==0.1.68) (1.24.3)
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.1.68)
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 kB 5.4 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)

  --------------------------------------------------------------------------------
  -- Trying 'Ninja' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.

  Not searching for unused variables given on the command line.

  -- The C compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (0.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027/_skbuild/linux-x86_64-3.10/cmake-build
    Command:
      /tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPYTHON_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPYTHON_LIBRARY:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/lib/libpython3.10.so -DPython_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DPython3_EXECUTABLE:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 -DPython3_ROOT_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/home/jettscythe/miniconda3/envs/h2ogpt/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on

  Not searching for unused variables given on the command line.
  -- The C compiler identification is GNU 11.4.0
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc-11 - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++-11 - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /sbin/git (found version "2.41.0")
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:114 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Found CUDAToolkit: /home/jettscythe/miniconda3/envs/h2ogpt/include (found version "11.7.64")
  -- cuBLAS found
  -- The CUDA compiler identification is NVIDIA 11.7.64
  -- Detecting CUDA compiler ABI info
  -- Detecting CUDA compiler ABI info - done
  -- Check for working CUDA compiler: /home/jettscythe/miniconda3/envs/h2ogpt/bin/nvcc - skipped
  -- Detecting CUDA compile features
  -- Detecting CUDA compile features - done
  -- Using CUDA architectures: 52
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- x86 detected
  -- Configuring done (1.9s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027/_skbuild/linux-x86_64-3.10/cmake-build
  [1/8] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o
  [2/8] Building CUDA object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-cuda.cu.o
  [3/8] Building CXX object vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o
  [4/8] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o
  [5/8] Linking CXX shared library vendor/llama.cpp/libllama.so
  FAILED: vendor/llama.cpp/libllama.so
  : && /usr/bin/g++-11 -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libllama.so -o vendor/llama.cpp/libllama.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-cuda.cu.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -L/home/jettscythe/miniconda3/envs/h2ogpt/lib64/stubs   -L/home/jettscythe/miniconda3/envs/h2ogpt/lib64   -L/home/jettscythe/miniconda3/envs/h2ogpt/lib/gcc/x86_64-conda-linux-gnu/11.4.0   -L/home/jettscythe/miniconda3/envs/h2ogpt/lib/gcc   -L/home/jettscythe/miniconda3/envs/h2ogpt/x86_64-conda-linux-gnu/lib   -L/home/jettscythe/miniconda3/envs/h2ogpt/lib   -L/home/jettscythe/miniconda3/envs/h2ogpt/x86_64-conda-linux-gnu/sysroot/lib   -L/home/jettscythe/miniconda3/envs/h2ogpt/x86_64-conda-linux-gnu/sysroot/usr/lib -Wl,-rpath,/home/jettscythe/miniconda3/envs/h2ogpt/lib:  /home/jettscythe/miniconda3/envs/h2ogpt/lib/libcudart.so  /home/jettscythe/miniconda3/envs/h2ogpt/lib/libcublas.so  /home/jettscythe/miniconda3/envs/h2ogpt/lib/libcublasLt.so  /home/jettscythe/miniconda3/envs/h2ogpt/lib/libculibos.a  -lcudadevrt  -lcudart_static  -lrt  -lpthread  -ldl && :
  /sbin/ld: cannot find /usr/lib64/libpthread_nonshared.a: No such file or directory
  collect2: error: ld returned 1 exit status
  [6/8] Linking CUDA static library vendor/llama.cpp/libggml_static.a
  [7/8] Linking CUDA shared library vendor/llama.cpp/libggml_shared.so
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 674, in setup
      cmkr.make(make_args, install_target=cmake_install_target, env=env)
    File "/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 697, in make
      self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
    File "/tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 742, in make_impl
      raise SKBuildError(msg)

  An error occurred while building with CMake.
    Command:
      /tmp/pip-build-env-octnsoyr/overlay/lib/python3.10/site-packages/cmake/data/bin/cmake --build . --target install --config Release --
    Install target:
      install
    Source directory:
      /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027
    Working directory:
      /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027/_skbuild/linux-x86_64-3.10/cmake-build
  Please check the install target is valid and see CMake's output for more information.

  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/jettscythe/miniconda3/envs/h2ogpt/bin/python3.10 /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmptti5r61f
  cwd: /tmp/pip-install-47yaxku1/llama-cpp-python_dcf84f65ef0a4e92acabe75bc6d74027
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

This error (/sbin/ld: cannot find /usr/lib64/libpthread_nonshared.a: No such file or directory) makes sense in this context: the file only exists inside the conda sysroot, not on the host. locate libpthread_nonshared returns:

/home/jettscythe/miniconda3/envs/h2ogpt/x86_64-conda-linux-gnu/sysroot/usr/lib64/libpthread_nonshared.a
/home/jettscythe/miniconda3/pkgs/sysroot_linux-64-2.12-he073ed8_16/x86_64-conda-linux-gnu/sysroot/usr/lib64/libpthread_nonshared.a

Now how can I get ld to look in the right place? 🤔
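
One less invasive option than symlinking into /usr/lib64 (untested here) would be to put the conda sysroot on the linker's search path instead:

# gcc consults LIBRARY_PATH at link time, so no root access is needed
export LIBRARY_PATH="$CONDA_PREFIX/x86_64-conda-linux-gnu/sysroot/usr/lib64:$LIBRARY_PATH"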

JettScythe commented 1 year ago

I GOT IT!!! 🎉 🎉

sudo ln -sfn /home/jettscythe/miniconda3/envs/h2ogpt/x86_64-conda-linux-gnu/sysroot/usr/lib64/libpthread_nonshared.a /usr/lib64/libpthread_nonshared.a

&

pip uninstall -y llama-cpp-python
export LLAMA_CUBLAS=1
export CMAKE_ARGS=-DLLAMA_CUBLAS=on
export FORCE_CMAKE=1
DCMAKE_CUDA_COMPILER=$(which nvcc) CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose

gives:

Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.68
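
Note: the leading DCMAKE_CUDA_COMPILER=$(which nvcc) above is just a shell environment variable, which CMake never reads, so the actual fix here was the symlink plus the pinned gcc 11. If pinning nvcc for CMake were really needed, the flag would have to go inside CMAKE_ARGS, e.g.:

# hypothetical spelling that CMake would actually see
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_COMPILER=$(which nvcc)" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.68 --no-cache-dir --verbose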
JettScythe commented 1 year ago

The mamba issue mentioned in the OP was fixed in 0cfd8a5d8c2e2b0ee737b6c6425ba1f943b58ff4.

JettScythe commented 1 year ago

wellllll crap.

python generate.py --base_model=TheBloke/Llama-2-7b-Chat-GPTQ --load_gptq="gptq_model-4bit-128g" --use_safetensors=True --prompt_type=llama2 --save_dir='save'
Auto set langchain_mode=LLM.  Could use MyData instead.  To allow UserData to pull files from disk, set user_path or langchain_mode_paths, and ensure allow_upload_to_user_data=True
Using Model thebloke/llama-2-7b-chat-gptq
Prep: persist_directory=db_dir_UserData does not exist, regenerating
Did not generate db since no sources
Starting get_model: TheBloke/Llama-2-7b-Chat-GPTQ 
bin /home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/jettscythe/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1714: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
device_map: {'': 0}
CUDA extension not installed.

I am very confused

python 
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available())
True
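
torch seeing the GPU doesn't rule this out: the CUDA extension not installed. line is printed by AutoGPTQ when its own compiled kernels fail to import, after which it silently falls back to a much slower pure-Python path. A quick check (assuming the compiled extension is importable as autogptq_cuda; the module name varies across auto-gptq versions):

python -c "import autogptq_cuda" || echo "auto-gptq was built without its CUDA kernels"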

Output of python -m torch.utils.collect_env:

Collecting environment information...
PyTorch version: 2.0.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Arch Linux (x86_64)
GCC version: (conda-forge gcc 11.4.0-0) 11.4.0
Clang version: 15.0.7
CMake version: version 3.27.0
Libc version: glibc-2.37

Python version: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-6.4.6-arch1-1-x86_64-with-glibc2.37
Is CUDA available: True
CUDA runtime version: 11.7.64
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA RTX A3000 12GB Laptop GPU
Nvidia driver version: 535.86.05
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.9.2
/usr/lib/libcudnn_adv_infer.so.8.9.2
/usr/lib/libcudnn_adv_train.so.8.9.2
/usr/lib/libcudnn_cnn_infer.so.8.9.2
/usr/lib/libcudnn_cnn_train.so.8.9.2
/usr/lib/libcudnn_ops_infer.so.8.9.2
/usr/lib/libcudnn_ops_train.so.8.9.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   39 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          24
On-line CPU(s) list:             0-23
Vendor ID:                       GenuineIntel
Model name:                      12th Gen Intel(R) Core(TM) i9-12900HX
CPU family:                      6
Model:                           151
Thread(s) per core:              2
Core(s) per socket:              16
Socket(s):                       1
Stepping:                        2
CPU(s) scaling MHz:              55%
CPU max MHz:                     5000.0000
CPU min MHz:                     800.0000
BogoMIPS:                        4993.00
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization:                  VT-x
L1d cache:                       640 KiB (16 instances)
L1i cache:                       768 KiB (16 instances)
L2 cache:                        14 MiB (10 instances)
L3 cache:                        30 MiB (1 instance)
NUMA node(s):                    1
NUMA node0 CPU(s):               0-23
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.3
[pip3] torch==2.0.1+cu117
[pip3] torchvision==0.15.2
[conda] cudatoolkit-dev           11.7.0               h1de0b5d_6    conda-forge
[conda] numpy                     1.24.3                   pypi_0    pypi
[conda] torch                     2.0.1+cu117              pypi_0    pypi
[conda] torchvision               0.15.2                   pypi_0    pypi
JettScythe commented 1 year ago

It does seem like

python generate.py --base_model=TheBloke/Llama-2-7b-Chat-GPTQ --load_gptq="gptq_model-4bit-128g" --use_safetensors=True --prompt_type=llama2 --save_dir='save' --load_exllama=True --revision=gptq-4bit-32g-actorder_True --rope_scaling="{'alpha_value':4}"

is working with CUDA.

Auto set langchain_mode=LLM.  Could use MyData instead.  To allow UserData to pull files from disk, set user_path or langchain_mode_paths, and ensure allow_upload_to_user_data=True
Using Model thebloke/llama-2-7b-chat-gptq
Prep: persist_directory=db_dir_UserData does not exist, regenerating
Did not generate db since no sources
Starting get_model: TheBloke/Llama-2-7b-Chat-GPTQ 
Automatically setting max_seq_len=8192 for RoPE scaling
Downloading (…)b06239e96013b/Notice: 100%|█████████████████████████████████████████████████████████| 112/112 [00:00<00:00, 461kB/s]
Downloading (…)6013b/.gitattributes: 100%|████████████████████████████████████████████████████| 1.52k/1.52k [00:00<00:00, 7.43MB/s]
Downloading (…)quantize_config.json: 100%|████████████████████████████████████████████████████████| 183/183 [00:00<00:00, 1.40MB/s]
Downloading (…)neration_config.json: 100%|████████████████████████████████████████████████████████| 137/137 [00:00<00:00, 1.05MB/s]
Downloading (…)239e96013b/README.md: 100%|████████████████████████████████████████████████████| 20.1k/20.1k [00:00<00:00, 78.8MB/s]
Downloading (…)06239e96013b/LICENSE: 100%|████████████████████████████████████████████████████| 50.3k/50.3k [00:00<00:00, 2.25MB/s]
Downloading (…)4bit-32g.safetensors: 100%|████████████████████████████████████████████████████| 4.28G/4.28G [05:12<00:00, 13.7MB/s]
Fetching 12 files: 100%|███████████████████████████████████████████████████████████████████████████| 12/12 [05:12<00:00, 26.07s/it]
Model {'base_model': 'TheBloke/Llama-2-7b-Chat-GPTQ', 'tokenizer_base_model': '', 'lora_weights': '', 'inference_server': '', 'prompt_type': 'llama2', 'prompt_dict': {'promptA': '', 'promptB': '', 'PreInstruct': '<s>[INST] ', 'PreInput': None, 'PreResponse': '[/INST]', 'terminate_response': ['[INST]', '</s>'], 'chat_sep': ' ', 'chat_turn_sep': ' </s>', 'humanstr': '[INST]', 'botstr': '[/INST]', 'generates_leading_space': False}}
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Did not generate db since no sources
pseudotensor commented 1 year ago

Ok, just be careful with RoPE scaling. It's experimental.
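
For context, a sketch of what alpha_value does, assuming exllama's NTK-style scaling: instead of compressing position ids, it enlarges the RoPE frequency base, roughly base' = base * alpha**(d / (d - 2)) with d the head dimension, which is why max_seq_len could be auto-set to 8192 above.

python -c "print(10000 * 4 ** (128 / 126))"  # ~40890: effective RoPE base for alpha_value=4, d=128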

pseudotensor commented 1 year ago

@JettScythe I updated the h2oGPT Linux/Windows install docs after finding that jllllll also compiles llama-cpp-python.

You can try the pre-compiled wheel mentioned there to save having to compile it yourself.
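
For example (the exact index URL, AVX level, and CUDA tag below are assumptions that must match your machine; check jllllll's release page for what actually exists):

pip uninstall -y llama-cpp-python
pip install llama-cpp-python==0.1.68 --prefer-binary --extra-index-url https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117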