mudler / LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and has voice cloning capabilities.
https://localai.io
MIT License

Better Support for AMD and ROCM via docker containers. #1592

Open jamiemoller opened 5 months ago

jamiemoller commented 5 months ago

Presently it is very hard to get a docker container to build with the ROCm backend; some elements seem to fail independently during the build process. There are other related projects with functional docker implementations that do work with ROCm out of the box (e.g. llama.cpp). I would like to work on this myself, however between the speed at which things change in this project and the amount of free time I have, I am left only to ask for this.

If there are already good 'stable' methods for building a docker implementation with ROCm underneath, it would be very appreciated if this could be better documented. 'Arch' helps nobody that wants to run on a more enterprisey OS like RHEL or SLES.
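
For reference, a minimal sketch of running the published hipblas image with the GPU passed through (the tag, model path, and group membership here are assumptions based on common ROCm container setups, not a verified recipe):

```sh
# /dev/kfd and /dev/dri are the ROCm compute and render device nodes; the
# container user typically also needs the "video" group to reach the GPU.
docker run -p 8080:8080 \
  --device /dev/kfd \
  --device /dev/dri \
  --group-add video \
  -v "$PWD/models:/build/models" \
  quay.io/go-skynet/local-ai:v2.17.1-hipblas-ffmpeg
```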

Presently I have defaulted back to using textgen, as it has a mostly functional API, but its featureset is kinda woeful. (Still better than running llama.cpp directly, imo.)

bunder2015 commented 2 weeks ago

Just gave v2.17.0-hipblas-ffmpeg a try, unfortunately I'm still getting the "found no nvidia driver" error with dreamshaper. 😢
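
For what it's worth, that "found no nvidia driver" message usually comes from PyTorch itself when it only has a CUDA build to fall back on. A quick way to check which torch build a container actually ships (a sketch, assuming python3 and torch are on the container's PATH):

```sh
# A ROCm build of PyTorch still answers through the torch.cuda API, so this
# should print "True" plus a HIP version string on working AMD setups; a
# plain CUDA build prints "False None", which matches the error above.
python3 -c 'import torch; print(torch.cuda.is_available(), torch.version.hip)'
```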

jtwolfe commented 2 weeks ago

> Just gave v2.17.0-hipblas-ffmpeg a try, unfortunately I'm still getting the "found no nvidia driver" error with dreamshaper. 😢

I replicated this with the aio/gpu-8g/image-gen.yaml example that I had previously tested as functional; I did not test 2.16.0 before testing with 2.17.0.

I also tested this with the aio/cpu/image-gen.yaml example and found the following error:

3:43AM DBG GRPC Service state dir: /tmp/go-processmanager816316485
3:43AM DBG GRPC Service Started
3:43AM DBG GRPC(stablediffusion_assets-127.0.0.1:44957): stderr /tmp/localai/backend_data/backend-assets/grpc/stablediffusion: error while loading shared libraries: libomp.so: cannot open shared object file: No such file or directory
3:44AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:44957: connect: connection refused\""

bunder2015 commented 1 week ago

I've seen that libomp.so message a few times as well for other models, which is strange, because there is one inside the container... I'd have to guess it's not linked correctly, or gets added too late in the build process?
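
One way to test the linking theory from inside the container (a sketch; the binary path is taken from the log above, and the libomp location is an assumption):

```sh
# Show which shared-library dependencies of the backend are unresolved.
ldd /tmp/localai/backend_data/backend-assets/grpc/stablediffusion | grep 'not found'

# Find any libomp copies that did ship with the image...
find / -name 'libomp.so*' 2>/dev/null

# ...and, as a temporary workaround, point the dynamic linker at one of them,
# e.g. the copy bundled with ROCm's LLVM toolchain (path is an assumption).
export LD_LIBRARY_PATH=/opt/rocm/llvm/lib:$LD_LIBRARY_PATH
```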

FWIW, I just tried v2.17.1-hipblas-ffmpeg and still got the "found no nvidia driver" message there as well. For giggles I tried the binary build of 2.17.1 and I got ye olde "some backends require localai compiled with GO_TAGS"... 😂

The new sd3 model gave me "'BackendServicer' object has no attribute 'pipe'".

I'm okay with sticking to an older build for now, as long as models load (which might become a problem as new models get added, e.g. if they need newer versions of llama). I'm not really sure what else I can do to help at the moment other than testing new builds; I'm a little hesitant to give someone SSH access to my system, given the elevated permissions needed to run docker containers. I had also considered buying someone a Radeon card, but mudler's sponsor page says he doesn't need financial support. 🤷

Cheers

bunder2015 commented 1 week ago

I was doing my weekly browsing of Phoronix, and it appears that Fedora has been working on getting the ROCm/PyTorch stack working... I wonder if we'd be better off building the images on Fedora instead?

It also looks like there is a new version of ROCm out from AMD; not sure if it would help at all...

Cheers

jtwolfe commented 1 week ago

It's true that EL distros do often provide more consistent libs, but I'd honestly be more interested in RHEL or SUSE base images, since they're actually on the official OS support list for ROCm. The big downside is that it would require a significant rework and a separation of the build workflows for CUDA and ROCm; probably not an issue if you use the 'local' build method, but mudler might not be super psyched. Same with the version bump: good practice to let sleeping dogs lie, and unless one of the dependencies breaks or there are serious performance improvements, I say stick with 6.0.0... Also, 6 has good device compatibility for the moment... I fear the day my R-VII is fully deprecated.

bunder2015 commented 1 week ago

Fair enough. I'm already planning on getting off this hardware and going back to Intel+NVIDIA... This system has been a PITA ever since I built it, and ROCm barely working is the final nail in the coffin for me.

mudler commented 1 week ago

> I had also considered buying someone a Radeon card, but mudler's sponsor page says he doesn't need financial support. 🤷

While it is true that I have a job - I don't want that line to be misunderstood - I would appreciate donations in any case; they would motivate me to keep working on the projects and help me buy dedicated HW to test things out. As you can imagine there are various hardware combinations - and while I do have Intel GPUs, I lack AMD GPUs to test this out, and I can only test with what I have access to. Initially we got AMD support because things were working and we had testers, but of course in the long run the HW is required to keep maintaining code and functionality; otherwise it's a back and forth until things are working, which ends up being frustrating.

For instance, I managed to get my hands on a remote box with an Nvidia GPU thanks to our sponsors, which I use to test things out.

bunder2015 commented 1 week ago

Okay, let's do that then... I had a quick look online and I see some 16GB RX 6800s going for ~200 USD (before shipping). While I could cover that, it would really help if others could maybe pitch in as well. I'll hold on for a couple days if anyone wants to help out. :+1:

jtwolfe commented 1 week ago

@bunder2015 @mudler please note that neither rocm5 nor rocm6 officially support the 6k series chips. I'm not saying that they won't work, just make sure to consider this.
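
You can check which LLVM target a card reports with rocminfo, and there is a widely-circulated (but unofficial) workaround for RDNA2 cards; a sketch, assuming the host ROCm tools are installed:

```sh
# Print the gfx target ROCm sees; 6000-series (RDNA2) cards report gfx103x,
# which is missing from the official Linux support matrix even when it works.
rocminfo | grep -i gfx

# Unofficial workaround: make the ROCm runtime treat the card as gfx1030.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```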

Edit: This may be a good time to dig a bit further into a Vulkan implementation; it's definitely WAY more compatible, and may be a good way of including ARM chips with Vulkan-capable GPU cores. I'm specifically thinking about all of these fancy new ARM laptops that everyone is producing now.

[image: ROCm GPU support matrix charts]
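
If anyone wants to poke at the Vulkan angle, checking whether a box exposes a usable Vulkan device is quick (a sketch; requires the vulkan-tools package):

```sh
# Summarise the devices the Vulkan loader can see; any GPU listed here is a
# candidate for a Vulkan-based backend, no ROCm support matrix required.
vulkaninfo --summary
```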

bunder2015 commented 1 week ago

:face_exhaling: AMD's not making this easy... Thanks for the charts, I only suggested the 6000 series because it's PCIe Gen4 and not the latest (i.e. expensive) chips.

Looks like Radeon VIIs are still out there in the same price bracket, even though they will be deprecated by ROCm eventually.

Airradda commented 1 week ago

> please note that neither rocm5 nor rocm6 officially support the 6k series chips.

Just for some further clarification: this is only true for ROCm on Linux; they are supported by ROCm for Windows. Also, from personal experience with my 6950XT*, I have not had issues that I can pin on ROCm when trying to use anything that advertises ROCm support (Text-Gen-UI, Ollama, Llamafile, Comfy-UI, SD-Gen-UI, SD-Next, etc.), and even some that don't.

Edit: * So long as they include, or I edit them to include, the following in the docker compose file:

devices:
    - /dev/dri
    - /dev/kfd
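
For anyone using plain docker rather than compose, the equivalent flags would look something like this (a sketch mirroring the snippet above; the image tag is just an example from earlier in the thread):

```sh
# Pass through both ROCm device nodes, same as the compose "devices:" list.
docker run \
  --device /dev/kfd \
  --device /dev/dri \
  quay.io/go-skynet/local-ai:v2.17.1-hipblas-ffmpeg
```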

jtwolfe commented 1 week ago

@Airradda TRUE! I forget that Windows exists sometimes :P The question then would be which approach is better: use Windows with older cards to host, or ensure uniform compatibility with newer cards on Windows or Linux platforms /shrug. This should be continued as a background discussion (e.g. for containers and k8s: if there were a Windows-core based container then maybe this could work, but it needs a Windows host, and the k8s-on-Windows problem is a whole other kettle of fish).

@bunder2015 R-VIIs are pretty solid options as long as you can get a good one (i.e. not an ex-mining-rig card; those were often configured at lower memory voltage and higher frequencies, and had a habit of hard-locking when not under load).

I don't begrudge AMD for their flippy-floppy nonsense and lack of support for older cards, given how much has changed in their architecture recently, but it does seem like they kinda settled recently on whatever makes the gfx1100 LLVM target work; I just hope RDNA3 is a little more extensible going forward, so we get a good 5 years of support for current cards.

I expect that the reason they haven't been able to capture more of the market share at their current price point is that CUDA has been so overdeveloped that you could almost shake it up and have a fully functional OS fall out, with all the spare code that's floating around in it. AMD have had to figure out what to do first, then how best to do it, and do it cheaper... I don't envy them.

ps. wow @mudler that was quick XD

jtwolfe commented 1 week ago

> > please note that neither rocm5 nor rocm6 officially support the 6k series chips.
>
> Just for some further clarification: this is only true for ROCm on Linux; they are supported by ROCm for Windows. Also, from personal experience with my 6950XT*, I have not had issues that I can pin on ROCm when trying to use anything that advertises ROCm support (Text-Gen-UI, Ollama, Llamafile, Comfy-UI, SD-Gen-UI, SD-Next, etc.), and even some that don't.
>
> Edit: * So long as they include, or I edit them to include, the following in the docker compose file:
>
> devices:
>     - /dev/dri
>     - /dev/kfd

If you have time, it would be appreciated if you could add to the documentation on compatibility <3

bunder2015 commented 1 week ago

> I'll hold on for a couple days if anyone wants to help out. 👍

and sent. I hope that it's enough to get you something usable. Cheers