KoboldAI / KoboldAI-Client

For GGUF support, see KoboldCPP: https://github.com/LostRuins/koboldcpp
https://koboldai.com
GNU Affero General Public License v3.0

Update pytorch+rocm to latest Version #237

Closed waffshappen closed 1 year ago

waffshappen commented 1 year ago

Tested on a Steam Deck with HSA_OVERRIDE_GFX_VERSION=10.3.0 set as an env variable in the docker-compose.yml. It successfully loads and runs OPT models up to 350m fully on the GPU, most layers of 1.3B, and up to 14 layers of 2.7B models.

I did not observe any breakage with the updated pytorch version (tested with Story and OPT models, not with Adventure models), but more widespread testing might be required.
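For reference, the override described above would look roughly like this in a compose file. This is a hypothetical sketch (the service name is illustrative, not taken from the repo); the env variable makes ROCm treat the Steam Deck's gfx1033 APU as the officially supported gfx1030 target:

```yaml
# Hypothetical docker-compose.yml fragment -- service name is illustrative.
services:
  koboldai:
    environment:
      # Report the Steam Deck APU (gfx1033) to ROCm as gfx1030,
      # which has official ROCm support.
      - HSA_OVERRIDE_GFX_VERSION=10.3.0
```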

henk717 commented 1 year ago

Is there something the older pytorch prevents from working? If not, I'd rather see this pulled into the version on my own GitHub first, so we can have it in the beta branch. I am always a bit hesitant to update the official version's dependencies unless there is a need to do so.

waffshappen commented 1 year ago

> Is there something the older pytorch prevents from working?

Yes, mostly that it works at all on the Steam Deck GPU (Fedora 37, latest podman). With 5.2 it works reliably for hours (tested with a script that kept hitting the submit button to keep generating), but the older pytorch usually fails on the first or second submission with memory errors that are not consistently the same each run. (radeontop does not show any memory-full issue, nor do any logs show memory running full.)

Plus, the package index does not show a pytorch 1.12 + rocm5.2 build for me.
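To illustrate the wheel-index point: PyTorch publishes ROCm builds on per-ROCm-version extra indexes, and 1.13.x is the first release line built against rocm5.2 (the exact version tags here are my reading of the index, so treat them as an assumption):

```shell
# Sketch: install the 1.13.1 ROCm 5.2 wheel from PyTorch's extra index.
# 1.12.x wheels were only published against rocm5.1.1, not rocm5.2,
# which matches the package index not showing a pytorch1.12+rocm5.2 build.
pip install torch==1.13.1+rocm5.2 \
    --extra-index-url https://download.pytorch.org/whl/rocm5.2
```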

CPU-only generation works reliably with both.

Sadly this is the only RDNA2 GPU I have to test with at all. My 7900XTX has disappointingly been left in the dust by AMD, with nothing but a possible hint of support in 5.5.0, so I'm waiting to test that with a pytorch2 + rocm5.5 nightly in the far, far future.

0cc4m commented 1 year ago

I found another thing: when using two AMD GPUs, KoboldAI only outputs incoherent text when splitting a model across them. Using the up-to-date torch version of this PR fixes that and allows AMD users to use multiple GPUs together. @henk717

0cc4m commented 1 year ago

@waffshappen Latest is 1.13.1; any particular reason you chose 1.13.0? I tested the two-GPU fix only with 1.13.1.

waffshappen commented 1 year ago

> @waffshappen Latest is 1.13.1, any particular reason you chose 1.13.0? I tested the two GPU fix only with 1.13.1

I did not see it when I initially made this MR; maybe it got added since?

ClashSAN commented 1 year ago

@waffshappen How interesting. How much VRAM does it take to run 350m? Did you increase the allocated GPU VRAM to 4G in the BIOS settings? Maybe you could write a gist if this isn't the right place; it's always cool to see people trying ML-related things on a Steam Deck.

MrGrymReaper commented 1 year ago

Maybe take it a bit further to AMD ROCm 5.5, and keep an eye out for the first consumer-GPU releases if they are not already available.

0cc4m commented 1 year ago

@MrGrymReaper KoboldAI uses pytorch; we can switch to ROCm 5.5 once pytorch does. So far they use 5.4.2.

henk717 commented 1 year ago

Also worth noting: on United we had regressions, so Pytorch 2.0 was reverted. Hopefully we can reintroduce it there once ROCm gets more stable. As a result I am currently not comfortable with the change on the main version, but United has the most recent version that was tested to be stable, ready to use.