glibsonoran / Plush-for-ComfyUI

Custom node for ComfyUI/Stable Diffusion
GNU General Public License v3.0
154 stars 15 forks

Confused on how to use with a local LLM (via Text Gen WebUI) #53

Closed GamingDaveUk closed 8 months ago

GamingDaveUk commented 8 months ago

I was using cras, but that seems to be abandoned because of the author's time constraints (understandable). This node was mentioned in a ComfyUI livestream, and since the Query Local LM node from the other pack no longer works with Text Gen WebUI, I figured I would give it a go. However, I can't figure out how to use it. I've read through the readme and may just not be understanding it, but how do you send your own text prompt to the LLM and use the response as a string? Just send string A, receive string B?

glibsonoran commented 8 months ago

I've tested the node with LM Studio, Oobabooga and Koboldcpp, but not with Text Gen. Assuming Text Gen can work with the OpenAI API client too, here's how the node works (just to be sure though, you're using Advanced Prompt Enhancer, not style_prompt, right?):


Let me know if this works; as I said, I haven't tested it with Text Gen, and there are just too many LLM front ends out there to test them all.

glibsonoran commented 8 months ago

OK, so looking up Text Gen WebUI, it looks like that's often used as part of oobabooga. If that's what you're running, your OpenAI-compatible API URL should be http://127.0.0.1:5000/v1, unless you've specified another port in your setup/settings. Try that value in the node's LLM URL field, select 'Other LLM via URL' in the node's LLM field, and you should be good to go.
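
If you want to sanity-check the endpoint outside of ComfyUI first, here's a minimal sketch (an illustration only, assuming the openai Python package v1.x; the model name is a placeholder that most local servers ignore):

# Standalone check of a local OpenAI-compatible endpoint (not part of the Plush node).
from openai import OpenAI

# Text Gen WebUI / oobabooga's OpenAI-compatible API defaults to port 5000.
client = OpenAI(base_url="http://127.0.0.1:5000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # placeholder; local servers generally use whatever model is loaded
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response.choices[0].message.content)

If that prints a reply, the 'Other LLM via URL' connection should be able to reach the same endpoint.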

GamingDaveUk commented 8 months ago

Text Gen WebUI is oobabooga; it's just easier on us dyslexics to remember the spelling lol.

At work now but will give it another try later on. Can I suggest a folder inside the repo with some example workflows?

I am sure reading through this on a PC screen instead of my phone will help me make more sense of it, but if I get the gist... it's a string of your own prompt to the LLM into the node, and the reply from the LLM out of the node.

So, string in: "You are an image description creator; describe the following image in 3 or 4 short sentences to a painter. The image you're describing is: a city street at night with a steampunk aesthetic, lit by gas-powered lamps"

Then the output would be the reply to that input from the LLM...

I hate being at work and not being able to test it lol.

Thank you for the reply and thank you for making the node!

glibsonoran commented 8 months ago

You're very welcome Dave. :) There is a folder in both the repo and your installation with example workflows in it named, fittingly: Example_Workflows. ;)

Yes, read this through on the bigger screen when you get home: Advanced Prompt Enhancer requires you to write your own instructions/examples. style_prompt, on the other hand, only requires your own prompt, as it has built-in instructions.
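
To make the instructions-vs-prompt split concrete, here is how it commonly maps onto an OpenAI-style chat request (a conceptual sketch only, not the node's actual payload; endpoint and model name are placeholders):

# Instructions go in the system message, your prompt in the user message.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:5000/v1", api_key="not-needed")

instructions = "You are an image description creator. Describe the scene in 3 or 4 short sentences for a painter."
prompt = "A city street at night with a steampunk aesthetic, lit by gas-powered lamps."

reply = client.chat.completions.create(
    model="local-model",  # placeholder
    messages=[
        {"role": "system", "content": instructions},
        {"role": "user", "content": prompt},
    ],
)
print(reply.choices[0].message.content)  # this string is what you'd pass downstream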

I hope this works for you

GamingDaveUk commented 8 months ago

" There is a folder in both the repo and your installation with example workflows in it named, fittingly: Example_Workflows."

OK, now I feel a bit of a wally lol (I swear I looked the other day, just clearly not hard enough).

Thank you for the reply, I will look forward to trying it out later (and remember to wear my glasses lol!).

GamingDaveUk commented 8 months ago

I loaded up the example workflow and changed the model and the image to valid files (to keep the workflow the same; obviously I would not normally be using an image input or examples), switched it to Other LLM via URL, and entered the URL. Ran it and got the following error in the console:

got prompt
Failed to validate prompt for output 194:
* AdvPromptEnhancer 186:
  - Value not in list: GPTmodel: 'gpt-4-vision-preview' not in []
Output will be ignored
Failed to validate prompt for output 141:
Output will be ignored
Failed to validate prompt for output 192:
Output will be ignored
Failed to validate prompt for output 181:
Output will be ignored
Failed to validate prompt for output 193:
Output will be ignored
Failed to validate prompt for output 185:
Output will be ignored
[rgthree] Using rgthree's optimized recursive execution.
Prompt executed in 0.00 seconds

[image]

Since the error mentions GPTmodel, I clicked on that to see what options are available. Once clicked, it switches to "undefined" [image]. Attempting to run it with that set to undefined gives the following error:

got prompt
Failed to validate prompt for output 194:
* AdvPromptEnhancer 186:
  - Required input is missing: GPTmodel
Output will be ignored
Failed to validate prompt for output 141:
Output will be ignored
Failed to validate prompt for output 192:
Output will be ignored
Failed to validate prompt for output 181:
Output will be ignored
Failed to validate prompt for output 193:
Output will be ignored
Failed to validate prompt for output 185:
Output will be ignored
[rgthree] Using rgthree's optimized recursive execution.
Prompt executed in 0.00 seconds

Not sure what I am doing wrong. I followed the instructions provided first and had the same result, so I tried the example workflow. This appears to be the right node, though it's a lot more complex than the LLM node I am used to (that may be a good thing once I suss out the node). I don't fully understand why it has an image input though?

What am I doing wrong?

glibsonoran commented 8 months ago

Hmm, first try bringing in a new Advanced Prompt Enhancer node. Then reconnect all the outputs and inputs from the old node to the new one. Finally, delete the old node. This usually happens when a node's UI changes and you're accessing a workflow that was created with the old node.

glibsonoran commented 8 months ago

Also, for now, delete the image node "Load Image". The node can create text/prompt based on an image input, but that's just making things more complicated as we try to fix this right now.

GamingDaveUk commented 8 months ago

Same error. Going to have another look at it tomorrow, might be I have a conflicting node.

glibsonoran commented 8 months ago

This is for tomorrow: this doesn't look like a node conflict, so let's try this:

GamingDaveUk commented 8 months ago

[image]

[image]

I still had it all loaded up, so I gave that a test. This time there's an error prompt in the UI, which I didn't have before... but alas, the same error.

glibsonoran commented 8 months ago

OK, I see the problem. All my testers and I had a ChatGPT API key and account. This should work whether or not you have one, but it doesn't, because the node can't load the GPTmodel list without a key. I will fix this in my code and upload a new version tonight. It should work for you tomorrow.
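
For illustration, one way such a guard could look (a sketch with assumed names, not necessarily the actual Plush fix):

# Only query OpenAI for the GPT model list when a key is present; otherwise fall
# back to a placeholder so the dropdown never ends up empty/undefined.
import os
from openai import OpenAI

def fetch_gpt_models() -> list[str]:
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        return ["none"]  # no ChatGPT account/key: keep the widget valid
    try:
        client = OpenAI(api_key=api_key)
        return [m.id for m in client.models.list() if m.id.startswith("gpt")] or ["none"]
    except Exception:
        return ["none"]  # network or auth failure shouldn't break node loading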

GamingDaveUk commented 8 months ago

OK, I see the problem. All my testers and I had a ChatGPT API key and account. This should work whether or not you have one, but it doesn't, because the node can't load the GPTmodel list without a key. I will fix this in my code and upload a new version tonight. It should work for you tomorrow.

Cool, much appreciated, thank you.

glibsonoran commented 8 months ago

OK, the new version is up. I hope this fixes things for you. Please let me know.

Thanx

GamingDaveUk commented 8 months ago

OK, the new version is up. I hope this fixes things for you. Please let me know.

Thanx

Seems to be working perfectly; I just need to get used to the new format for prompting the AI. Thank you for the fast fix, much appreciated.

glibsonoran commented 8 months ago

I'm glad to hear it's working for you now. Thank you for letting me know about the issue; I'm sure this will save other users some grief. It's hard to test every situation on just my own computer, so I rely on people letting me know when things don't work.

I hope you enjoy using the node.

I'm going to close this issue now, but if you have any further problems please let me know.

glibsonoran commented 8 months ago

Issue closed.

GamingDaveUk commented 8 months ago

It was working fine until today, when it started refusing to use the LLM (I haven't used it in a few days, so no idea when it broke). Everything is updated.

➤ Begin Log for: Advanced Prompt Enhancer, Node #56:
✦ WARNING: Local LLM server is not running; aborting client setup.
✦ WARNING: Open Source LLM server is not running.  Aborting request.

Unable to process request. Make sure the local Open Source Server is running.

The LLM is up and running (confirmed with SillyTavern). The workflow was saved and no alterations were made to it.

Quick report as I'm off to work in a bit (it's OK, I know it won't be fixed today, nor do I expect it to be fixed fast).

glibsonoran commented 8 months ago

OK let me take a look.

glibsonoran commented 8 months ago

Try it now.
But I'll say I can't get it to run on Oobabooga at all anymore. They seem to have changed something about the request pattern. Maybe it'll work for you though.

GamingDaveUk commented 8 months ago

Try it now. But I'll say I can't get it to run on Oobabooga at all anymore. They seem to have changed something about the request pattern. Maybe it'll work for you though.

Just home from work; still no go, but it seems to be the error you mention now:

➤ Begin Log for: Advanced Prompt Enhancer, Node #56:
✦ INFO: Server returned response code: 404
✦ INFO: Setting Openai client with URL, no key.
✦ INFO: Setting client to OpenAI Open Source LLM object
✦ ERROR: Server STATUS error 500: <Response [500 Internal Server Error]>. File may be too large.
✦ ERROR: Server was unable to process this request.

On the Text Gen WebUI side:

Exception in ASGI application
Traceback (most recent call last):
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\uvicorn\protocols\http\httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\fastapi\applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\middleware\errors.py", line 186, in __call__
    raise exc
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\middleware\errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\middleware\cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\middleware\exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\starlette\routing.py", line 74, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\fastapi\routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\installer_files\env\Lib\site-packages\fastapi\routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\extensions\openai\script.py", line 137, in openai_chat_completions
    response = OAIcompletions.chat_completions(to_dict(request_data), is_legacy=is_legacy)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\extensions\openai\completions.py", line 536, in chat_completions
    return deque(generator, maxlen=1).pop()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\extensions\openai\completions.py", line 315, in chat_completions_common
    prompt = generate_chat_prompt(user_input, generate_params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\modules\chat.py", line 97, in generate_chat_prompt
    user_bio=replace_character_names(state['user_bio'], state['name1'], state['name2']),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\AiStuff\text-generation-webui\modules\chat.py", line 636, in replace_character_names
    text = text.replace('{{user}}', name1).replace('{{char}}', name2)
           ^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'replace'
(The identical traceback is printed a second time in the console output.)

I tried with a few random models; SillyTavern still works though, so maybe Comfy did something?

GamingDaveUk commented 8 months ago

I am very tired, so hopefully I copied the error out in full.

glibsonoran commented 8 months ago

Hmm, it seems to be complaining that a variable named user_bio is set to None. I don't pass any user information; it seems to be treating this as setting up a chat prompt, which is probably something more to do with a connection to something like SillyTavern.
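
As a quick way to confirm the failure is server-side rather than something the node sends, a bare request like this sketch should reproduce the 500 (the optional user_bio override is an assumption about extra fields the oobabooga OpenAI extension may accept; it is not something Plush sends):

# Minimal OpenAI-style chat request straight to the oobabooga endpoint, bypassing ComfyUI.
import requests

payload = {
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 50,
    # "user_bio": "",  # hedged workaround: only helps IF the extension accepts this override
}
r = requests.post("http://127.0.0.1:5000/v1/chat/completions", json=payload, timeout=60)
print(r.status_code, r.text[:500])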

glibsonoran commented 8 months ago

Just out of curiosity, you do have the OpenAI API checked on the extensions panel, and you aren't trying to pass in an image... yes?

GamingDaveUk commented 8 months ago

Just out of curiosity, you do have the OpenAI API checked on the extensions panel, and you aren't trying to pass in an image... yes?

Yep, OpenAI is checked (and is how SillyTavern communicates with it).

I'm not sending an image; this was a saved workflow that was working for me a few days ago, and with only the prompt changed it has stopped working (tried it again without the prompt change and still no go).

Just updating Text Gen as I'm about to do some LLM work (though the git pull said no change, so guessing that's not the issue).

Looking through the commits to SillyTavern, the only one I could see that mentions Text Gen is a refactoring of the settings... I haven't coded in a while, so I am not sure if this is even remotely related to why it works on SillyTavern but not the Plush node. https://github.com/SillyTavern/SillyTavern/commit/c8f84bd41367d2a4cf1c3f4299d9bafa19000c17

EDIT: it just looks like a set of else-ifs changed to switch statements (much neater); not relevant to the issue though.

glibsonoran commented 8 months ago

Just saw this on the Ooga reddit:

Valuable_Option7843: Just a note - the webtextgen API is not currently compatible with OpenAI libs, it has not kept up with drift.

Tuxedotux83 (OP): By not compatible you mean like a deal breaker? Or just requires some type of integration and/or adapter in order to be able to make things work?

Valuable_Option7843: Dealbreaker unless you want to write your own library.

Tuxedotux83 (OP): Technically I did not plan to use LangChain with text-generation-webui API as an API, more likely utilize LangChain for the PDF loading, chunking and somehow create the vector DB for the document with Instructor (or similar) and feed it to the ooba API? Let me know if any of what I wrote does not make sense.

Valuable_Option7843: That will work but you will not be able to leverage OpenAI libs for that last step as their advertised OpenAI-compatible API is not compatible for the moment. I'm sure it will be fixed, just a heads up for now.

Tuxedotux83 (OP): Thank you for the information, second thought - is there a way to provide the oobabooga API a context window? Then I could just extract the text from the PDF (as long as it's not longer than say 15-20 pages), set the context to this text and then start asking questions?

Valuable_Option7843: I haven't tried that, others may be able to advise.

glibsonoran commented 8 months ago

What model are you using with Comfy? If you're using a .gguf, I might recommend using Koboldcpp until either I or Ooba resolves this issue. It's lightweight, simple and seems to connect reliably. Either way I'll look into it further, but if the compatibility issue is the real problem behind this, it may take me a while to sus this out.

GamingDaveUk commented 8 months ago

What model are you using with Comfy? If you're using a .gguf, I might recommend using Koboldcpp until either I or Ooba resolves this issue. It's lightweight, simple and seems to connect reliably. Either way I'll look into it further, but if the compatibility issue is the real problem behind this, it may take me a while to sus this out.

EXL2s mostly; no set model, as I like to try a few. The recent exploit discovered with GGUF makes me reluctant to go back to those. I am trying to suss out TabbyAPI (https://github.com/theroyallab/tabbyAPI). It's all installed, though there is no UI that I can see; it seems to rely on SillyTavern to change the model... which is odd, and I think I must have that wrong... but it is an OpenAI API that is designed for EXL2. I noticed a few model cards mention using it instead of Text Gen due to load issues. If I get it to work I will update here.

The alternative is that I wait for Text Gen to fix the API; it looks like the issue is not with Plush (from reading that Reddit thread). Though I like the idea of Tabby, honestly I used Text Gen as a model loader and API, not as anything more than that. For generating songs, stories and D&D stuff there's SillyTavern, and for AI prompt assistance in image gen there are your nodes.

I just wish these things would not keep breaking lol (not your fault, more thinking of the OpenAI API maintainers who must have changed something). Right... nope... I am ranting lol.

I will update the thread if I get TabbyAPI to work; installing was easy.

GamingDaveUk commented 8 months ago

Well, TabbyAPI may have revealed another issue with "Other LLM", or it may be Tabby at fault, shrug. Tabby needs and uses an API key, and I followed the instructions to get that into my Windows environment variables. However it's coming back as unauthorized... though while constructing this reply I think I see why: setx OAI_KEY “(your key)" in the readme has an odd curly quote, and that has added itself to the API key. Resetting it does not seem to have changed it; I checked the environment variable and adjusted it, but I suspect I may need to reboot (doing that in a second and will edit). The error in the ComfyUI log is: HTTP Request: POST http://127.0.0.1:5000/v1/chat/completions "HTTP/1.1 401 Unauthorized" and the error in the node is:

➤ Begin Log for: Advanced Prompt Enhancer, Node #56:
✦ INFO: Setting client to OpenAI Open Source LLM object
✦ ERROR: Server STATUS error 401: <Response [401 Unauthorized]>. File may be too large.
✦ ERROR: Server was unable to process this request.

The LLM URL for TabbyAPI is set to http://127.0.0.1:5000/v1, so it's clearly adding the correct extension to that URL. From the TabbyAPI console:

INFO:     ExllamaV2 version: 0.0.16
INFO:     Your API key is: *SNIP... not that it matters as its local*
INFO:     Your admin key is: *SNIP... not that it matters as its local*
INFO:
INFO:     If these keys get compromised, make sure to delete api_tokens.yml and restart the server. Have fun!
INFO:     Generation logging is disabled
INFO:     Developer documentation: http://127.0.0.1:5000/redoc
INFO:     Completions: http://127.0.0.1:5000/v1/completions
INFO:     Chat completions: http://127.0.0.1:5000/v1/chat/completions
INFO:     Started server process [12824]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Right, going to reboot and see if that fixes the API variable.

glibsonoran commented 8 months ago

Yah, that's only used for OpenAI's ChatGPT. Other LLM doesn't fetch the key, since most open-source LLMs don't require one.

GamingDaveUk commented 8 months ago

[image] The reboot has the right key showing up for the echo command now, but still 401 Unauthorized.

Yah, that's only used for OpenAI's ChatGPT. Other LLM doesn't fetch the key, since most open-source LLMs don't require one.

Is it possible to get a tick box or true/false boolean for API key use? That would support TabbyAPI and any others that need an API key in the future (no rush, just an idea).

glibsonoran commented 8 months ago

That Tabby message seems to indicate that if you're using it locally it doesn't need a key, not a real one anyway.

GamingDaveUk commented 8 months ago

That Tabby message seems to indicate that if you're using it locally it doesn't need a key, not a real one anyway.

If I change the API key in SillyTavern (changing a 5 to a 4, for example) then the API won't connect and Tabby gives a 401 error. Putting the right key in, it connects. So I suspect it does need to be correct. I could likely disable the need for an API key in Tabby's config; however, I would prefer to have it enabled, as I do use Text Gen remotely at times, so Tabby being more secure appeals to me. Supporting an API key on local LLMs may also mean that non-local services would work with your node? Allowing people to enter the URL and have an API key saved could open access for Claude etc.? But that's speculation on my part.

GamingDaveUk commented 8 months ago

As a test I disabled authentication. That did disable both admin and API keys. However, when I ran ComfyUI I got:

➤ Begin Log for: Advanced Prompt Enhancer, Node #56:
✦ INFO: Setting client to OpenAI Open Source LLM object
✦ ERROR: Server STATUS error 422: <Response [422 Unprocessable Entity]>. File may be too large.
✦ ERROR: Server was unable to process this request.

In the ComfyUI log: HTTP Request: POST http://127.0.0.1:5000/v1/chat/completions "HTTP/1.1 422 Unprocessable Entity", and in TabbyAPI: INFO: 127.0.0.1:55724 - "POST /v1/chat/completions HTTP/1.1" 422. SillyTavern was able to use it without issue, so it's an interaction between the node and the API. So this may not be a viable alternative lol. I am, however, out of time for today (hoping that the logs and issues help); gonna link the Discord channel to this issue, as maybe the dev of TabbyAPI will know why it's a 422 error now; that's not one I have ever heard of... but then I am no expert.

glibsonoran commented 8 months ago

Well, I found a good thread on the Oobabooga problem. I got it running, but I had to build a whole alternate request function that uses an HTTP POST, so I've got to test it thoroughly before I can release it.
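
For the curious, the alternate-request approach amounts to something like this sketch (an illustration, not the released Plush code; model name and max_tokens are placeholders):

# Send an OpenAI-style chat completion as a plain HTTP POST, without the OpenAI client object.
import requests

def post_chat_completion(url: str, prompt: str, key: str | None = None, timeout: int = 120) -> str:
    headers = {"Content-Type": "application/json"}
    if key:  # only attach a bearer token when a key is configured
        headers["Authorization"] = f"Bearer {key}"
    payload = {
        "model": "local-model",  # placeholder; many local servers ignore this field
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 500,
    }
    resp = requests.post(url, json=payload, headers=headers, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Note the full path is required for this style of connection:
# post_chat_completion("http://127.0.0.1:5000/v1/chat/completions", "Describe a foggy harbor.")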

glibsonoran commented 8 months ago

I also added code to pick up a key from an LLM_KEY env variable. This would be applied to connections via the OpenAI API object used with open-source front ends (which is what's used with the "Other" selection) and to connections via HTTP POST (the new code I'm adding that will be under the 'Oobabooga API-URL' selection). This way you can have a key for 3rd-party LLM/open-source front ends and a key for ChatGPT simultaneously. I haven't tested this thoroughly yet, so I need to make sure it doesn't break anything. The way it would work is that it would use the key for connections all the time if the env variable (LLM_KEY) was populated, and wouldn't use it if it wasn't. There's no switch in the node. Front ends that don't require a key tend to just ignore whatever is in the key attribute.
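
A sketch of that key pickup (an assumption about the mechanism described above, not the actual source):

# Read an optional key from the LLM_KEY environment variable; only use it when populated.
import os

def get_llm_key() -> str | None:
    key = os.environ.get("LLM_KEY", "").strip()
    return key or None  # None means "connect without a key"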

glibsonoran commented 8 months ago

OK the new version is up. Try connecting using the Oobabooga API-URL connection in the LLM field. With this type of connection your url will need to include: /chat/completions, for example: http://127.0.0.1:5000/v1/chat/completions.

Let me know if it works. I also added the ability to create a key for Open source etc LLM front-ends, API. Just put your key in environment variable: LLM_KEY and it will automatically be applied to all connections except ChatGPT connections.

GamingDaveUk commented 8 months ago

Just seen the message but now heading to sleep; will test both out when I wake. Crossing fingers Tabby will work, as it seems to be dealing with models better: more consistent results in SillyTavern and less VRAM usage on the same models (not massively less, but more in line with what is reported in model cards).

GamingDaveUk commented 8 months ago

OK the new version is up. Try connecting using the Oobabooga API-URL connection in the LLM field. With this type of connection your url will need to include: /chat/completions, for example: http://127.0.0.1:5000/v1/chat/completions.

Let me know if it works. I also added the ability to create a key for Open source etc LLM front-ends, API. Just put your key in environment variable: LLM_KEY and it will automatically be applied to all connections except ChatGPT connections.

While setting up to test this, I noticed the GitHub readme still has that incorrect curly quote in "setx OAI_KEY “(your key)"". This “ gets added to the key on Windows if you copy the text, paste it into cmd, and replace (your key); it should be a straight quote, i.e. setx OAI_KEY "(your key)". Just a heads up.

GamingDaveUk commented 8 months ago

OK the new version is up. Try connecting using the Oobabooga API-URL connection in the LLM field. With this type of connection your url will need to include: /chat/completions, for example: http://127.0.0.1:5000/v1/chat/completions.

Let me know if it works. I also added the ability to create a key for Open source etc LLM front-ends, API. Just put your key in environment variable: LLM_KEY and it will automatically be applied to all connections except ChatGPT connections.

[image] Set up like this, it all works with TabbyAPI. Cheers for the quick fix, chap.

GamingDaveUk commented 8 months ago

I put this into the support channel of TabbyAPI's Discord so anyone searching should be able to make it work. [image]

glibsonoran commented 8 months ago

Good news that it works. Did you try using it with "Other LLM" and Tabby with a key (you'll have to set your URL back to /v1)? Are you using it to drive Ooba, or does Tabby handle models directly?

GamingDaveUk commented 8 months ago

If I try it as Other LLM to connect to TabbyAPI, I get:

➤ Begin Log for: Advanced Prompt Enhancer, Node #56:
✦ INFO: Server returned response code: 404
✦ INFO: Setting Openai client with URL and key.
✦ INFO: Setting client to OpenAI Open Source LLM object
✦ ERROR: Server STATUS error 422: <Response [422 Unprocessable Entity]>. File may be too large.
✦ ERROR: Server was unable to process this request.

In the ComfyUI log: HTTP Request: POST http://127.0.0.1:5000/v1/chat/completions "HTTP/1.1 422 Unprocessable Entity"

On the TabbyAPI side: INFO: 127.0.0.1:52521 - "POST /v1/chat/completions HTTP/1.1" 422

So Oobabooga API-URL is the way to go for TabbyAPI.

I am using https://github.com/theroyallab/tabbyAPI-gradio-loader to choose and configure models on TabbyAPI, so I'm currently not using Text Gen WebUI at all. Hoping that will mean fewer breaks in the future, as Tabby is just an API, with nothing unneeded extra that can break lol.

glibsonoran commented 8 months ago

Did you use the whole path: http://127.0.0.1:5000/v1/chat/completions with Other LLM? Because that won't work, you have to go back to: http://127.0.0.1:5000/v1. Maybe you did, I just want to make sure.

GamingDaveUk commented 8 months ago

Did you use the whole path: http://127.0.0.1:5000/v1/chat/completions with Other LLM? Because that won't work, you have to go back to: http://127.0.0.1:5000/v1. Maybe you did, I just want to make sure.

In Other LLM I didn't use the full path, just the one ending in /v1. In the Text Gen WebUI one I used the full path. :)

glibsonoran commented 8 months ago

OK... I was just surprised because Tabby lists itself as OpenAI compatible and "Other LLM" uses the OpenAI object. "Ooba API" is a web-based request that's formatted like an OpenAI request, but I'd think an OpenAI-compatible API would be able to use the object.

Anyway, I'm glad it's working. Maybe I'll download Tabby and see what I can find out.

GamingDaveUk commented 8 months ago

I used this model in my tests: https://huggingface.co/bartowski/mistral-orpo-capybara-7k-exl2, just in case other models have different results, as I can't test much today.

glibsonoran commented 8 months ago

closed