ParetoOptimalDev opened 6 months ago
I think maybe the API just isn't working at all; nothing is listening on port 5000.
Can anyone else confirm otherwise?
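(For anyone else checking: here's a minimal sketch of how I'm confirming nothing is listening. The host and port are just what textgen defaults to on my setup, so treat them as assumptions.)

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# is_port_open("127.0.0.1", 5000) comes back False for me, which is why
# I think the API never started.
```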
@ParetoOptimalDev can you try out https://github.com/nixified-ai/flake/pull/71 and test whether it works without breaking anything else? If it does, I'll go ahead and merge. Make sure to rm -rf ~/.textgen to perform that test.
I've been debating removing text-generation-webui in favor of https://github.com/imartinez/privateGPT, as it will be easier to maintain and more straightforward to use for the basic tasks. Text-generation-webui has a lot of undeclared dependencies and magical behavior at runtime that will never be fully encapsulated by our effort here.
Also, as mentioned in other PRs, we need VM tests, as they will prevent these kinds of issues from happening in the future.
I tried #71 last night, and when textgen loaded it was stuck on a prompt of some sort. I'm pretty sure I deleted ~/.textgen, but I'm not positive. I'll test that again.
First though, since you mentioned moving to privateGPT, I'll test that. It should just be this, right?
nix run github:MatthewCroughan/privateGPT
We'll see I guess, it's currently building.
Also, do you know if privateGPT supports exllamav2? I'm basically just trying to test performance improvements of exllamav2 over an OpenAI-compatible API, and I don't really care what with.
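In case it helps explain what I mean by "test perf": I just want rough generation throughput. A sketch of the measurement; the `generate` callable here is hypothetical glue to whatever OpenAI-compatible endpoint happens to be running, not a real client:

```python
import time

def tokens_per_second(generate, prompt: str) -> tuple[str, float]:
    """Time a generate(prompt) -> (text, n_tokens) call and return
    (text, tokens/sec). `generate` wraps whatever backend client you
    point at the OpenAI-compatible endpoint."""
    start = time.perf_counter()
    text, n_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return text, n_tokens / elapsed
```

Then I'd just run the same prompt against each backend and compare the numbers.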
@ParetoOptimalDev yeah I have a flake at that URL which you found, and it should work, let me know if it doesn't
I'll have to clear some space after all of these copies of cuda ;)
> Text-generation-webui has a lot of undeclared dependencies and magical behavior at runtime that will never be fully encapsulated by our effort here
Can you give me some examples of these? I'm trying to get better at packaging python applications in Nix and this would be really valuable for me to understand.
Well, a good example is that when it launches, it offers you the ability, in the webui, to choose between different models with different quantization methods: gguf, gptq, bitsandbytes. It doesn't ask for those during "install time" because it doesn't formally have a pyproject.toml or some machine-readable, parseable format that allows us to detect what it wants. Instead, it'll just allow the user to choose between these methods, and that choice obviously has a dependency on packaging either llama-cpp, autogptq, or bitsandbytes, which are complex programs to package in their own right. And so it will blow up at runtime, since we didn't know and couldn't know it wanted these things in order to run.
It should be added that quantization methods are changing all the time, as well as the runtime dependencies, and it's very hard to keep track of the missing undeclared information over time as the upstream project develops. Whereas privateGPT actually has a pyproject.toml and poetry.lock that can be read and automatically parsed by Nix, or any other package manager for that matter.
Thanks!
Also, your privateGPT flake isn't working for me. It seems to have no binaries, and the site-packages in the nix shell only has privateGPT:
$ nix run --refresh github:MatthewCroughan/privateGPT
error: unable to execute '/nix/store/a19cq9qjqwy0x433qvvd4si487cpikv5-python3.11-private-gpt-0.1.0/bin/private-gpt': No such file or directory
@ParetoOptimalDev it's a poetry2nix project, it's nix develop and then you can python -m private_gpt
Hm... okay... trying that gets me:
Okay, so now I know the repo should be downloaded:
So then I looked at the settings.yaml for about 10 minutes before figuring out the path it probably wants.
Is there a way we can make it nix run-able? I don't think most users will be as persistent :sweat_smile:
github:MatthewCroughan/privateGPT worked for me after I ran python scripts/setup.
Seems like privateGPT can only run a subset of GGUF models though, and it's not clear which until you try them. But I do appreciate that it's better to have the dependencies specified properly.
I'm using 63339e4c8727578a0fe0f2c63865f60b6e800079.
Possibly related: