mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License
22.67k stars 1.72k forks

feat: add support to ctransformers #1056

Open mudler opened 11 months ago

mudler commented 11 months ago

Tracker to add support to ctransformers https://github.com/marella/ctransformers

https://github.com/jllllll/ctransformers-cuBLAS-wheels

localai-bot commented 10 months ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!

But... I can also be funny or helpful :smile_cat: and, generally speaking, I can provide good tips or point to places in the documentation or code worth looking at, based on what you wrote in the issue.

Don't engage in conversation with me; I don't (yet) support replying!

:warning::warning::warning::warning::warning:

To add support for ctransformers, you can modify the build command in your Dockerfile to include the ctransformers package. You can use the following command to install ctransformers:

RUN pip install ctransformers==3.0.1

Replace 3.0.1 with the version of ctransformers that you want to use. To use ctransformers with cuBLAS, you can modify your model config file to include the following options:

gpu_layers: 1
ffmpeg: true
build_type: 'cublas'
cuda_major_version: 12
cuda_minor_version: 1
platforms: 'linux/amd64'
tag_latest: false
tag_suffix: '-cublas-cuda12-ffmpeg'
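The gpu_layers option above corresponds to ctransformers' own gpu_layers argument when loading a model. A minimal, hypothetical sketch (the model path, model_type, and the load_llm helper itself are placeholders, not part of LocalAI):

```python
# Hypothetical sketch: loading a GGML model with ctransformers and
# offloading layers to the GPU via cuBLAS.
def load_llm(model_path, gpu_layers=1):
    # Imported lazily so the sketch can be read without ctransformers installed.
    from ctransformers import AutoModelForCausalLM

    # gpu_layers mirrors the gpu_layers config option above: the number of
    # transformer layers offloaded to the GPU.
    return AutoModelForCausalLM.from_pretrained(
        model_path, model_type="llama", gpu_layers=gpu_layers
    )
```

Calling load_llm("/path/to/model.bin") returns a model object that can be invoked directly on a prompt string; AutoModelForCausalLM.from_pretrained and its gpu_layers argument are part of the ctransformers API.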

Replace the values with your desired settings. To use ctransformers with Metal, you can modify your model config file to include the following options:

gpu_layers: 1
build_type: 'metal'
platforms: 'macos/arm64'
tag_latest: false
tag_suffix: '-metal-ffmpeg'

Replace the values with your desired settings. To use ctransformers with OpenBLAS, you can modify your model config file to include the following options:

gpu_layers: 1
build_type: 'openblas'
platforms: 'linux/amd64'
tag_latest: false
tag_suffix: '-openblas-ffmpeg'

Replace the values with your desired settings. To use ctransformers with CLBlast, you can modify your model config file to include the following options:

gpu_layers: 1
build_type: 'clblas'
platforms: 'linux/amd64'
tag_latest: false
tag_suffix: '-clblas-ffmpeg'
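As a sanity check on the tag suffixes above, here is a small hypothetical helper (not part of LocalAI) that derives the suffix from the build options. Only cuBLAS builds are CUDA-specific, so only they carry a cuda<major> component:

```python
# Hypothetical helper: derive an image tag suffix from build options.
def tag_suffix(build_type, cuda_major=None, ffmpeg=True):
    parts = [build_type]
    if build_type == "cublas":
        # CUDA versions only make sense for cuBLAS builds.
        if cuda_major is None:
            raise ValueError("cublas builds need a CUDA major version")
        parts.append("cuda%d" % cuda_major)
    if ffmpeg:
        parts.append("ffmpeg")
    return "-" + "-".join(parts)
```

For example, tag_suffix("cublas", 12) yields "-cublas-cuda12-ffmpeg", while Metal, OpenBLAS, and CLBlast builds get plain "-metal-ffmpeg", "-openblas-ffmpeg", and "-clblas-ffmpeg" suffixes.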

Replace the values with your desired settings. Finally, make sure to give enough resources to the running container. Note that memory limits are applied when the container is started, not via an environment variable in the Dockerfile; for example:

docker run --memory=2048m <image>

Replace 2048m with the desired memory limit in megabytes.
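If the container is started with Docker Compose instead, the equivalent limit can go in the service definition. A minimal sketch (the service and image names are placeholders):

```yaml
# Hypothetical docker-compose fragment setting a container memory limit.
services:
  localai:
    image: <image>
    mem_limit: 2048m
```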
