Closed Iceblade02 closed 1 month ago
Selecting it in the menu I get KeyError: 'OpenRouter', so something's not right. On further inspection, the OpenRouter file can't load properly: ModuleNotFoundError: No module named 'modeling.inference_models.openrouter.class'; 'modeling.inference_models.openrouter' is not a package
For now, since you made the files OpenRouter-specific, it's also better to put them all in the subfolder; we can alter this later when we add modern OpenAI support based on this.
Oh, hang on, a previous WIP file managed to get in. There should only be "openrouter_handler" and "openrouter/class.py"; "openrouter.py" needs to be removed.
If you update your branch this PR will update automatically.
There, should be updated now.
https://github.com/scott-ca/KoboldAI-united is the incorrect header, that is a very outdated fork not associated with us.
When trying to submit my key, it automatically removes the key and I end up with a blank field (the same key works in Lite).
Alright, I've done somewhat more thorough testing this time around, and it seems to be behaving properly now, at least on my end.
I've reworked the handler class in /modeling/inference_models to be more generic, and as a proof-of-concept added support for another API! Adding support for Neuro.Mancer took roughly an hour.
Moved GooseAI over to the new API handler.
GooseAI may be using the old implementation of OpenAI's API; this will have to be tested. The OpenAI API endpoint should be migrated too, with the ability to set custom URLs; this will allow all these newer hosting companies to work.
Currently, api_handler translates all calls to core_generate, raw_generate & _raw_generate into a call to the abstract function _raw_api_generate (unique to each specific endpoint handler), decoding the prompt if it has been tokenized and passing it as plaintext (along with the other params unchanged). As long as _raw_api_generate returns a list of strings (["result A", "result B", etc.]) it will work fine, tokenizing the outputs, standardizing the lengths and returning a GenerationResult.
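To make the flow concrete, here is a rough sketch of the pattern; only _raw_api_generate is named in the actual code, while the class names, toy tokenizer and dummy endpoint are illustrative assumptions:

```python
from abc import ABC, abstractmethod

class ToyTokenizer:
    # Stand-in tokenizer for the sketch: "token ids" are just character codes.
    def decode(self, ids):
        return "".join(chr(i) for i in ids)
    def encode(self, text):
        return [ord(c) for c in text]

class APIInferenceModel(ABC):
    """Generic handler sketch: decode a tokenized prompt to plaintext,
    delegate to the endpoint-specific generator, then re-tokenize the
    outputs so the rest of the pipeline sees token ids."""
    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    @abstractmethod
    def _raw_api_generate(self, prompt: str) -> list:
        """Endpoint-specific request; must return a list of result strings."""

    def raw_generate(self, prompt):
        if not isinstance(prompt, str):          # decode token ids if needed
            prompt = self.tokenizer.decode(prompt)
        results = self._raw_api_generate(prompt)
        return [self.tokenizer.encode(text) for text in results]

class EchoEndpoint(APIInferenceModel):
    # Dummy endpoint standing in for a real API call.
    def _raw_api_generate(self, prompt):
        return [prompt + "!", prompt + "?"]

model = EchoEndpoint(ToyTokenizer())
print(model.raw_generate([72, 105]))  # tokenized "Hi" -> [[72, 105, 33], [72, 105, 63]]
```

Each new endpoint then only has to implement _raw_api_generate, which is why adding another API can be quick.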
api_handler also provides some default functions for making API requests (batch_api_call -> [json, json, etc.], api_call -> json), each taking a call variable that should contain the url, json and headers (specified in _raw_api_generate for each endpoint handler).
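A minimal stdlib-only sketch of what those helpers could look like; the exact signatures in api_handler are assumptions here, and the example.com URL in the usage note is a placeholder:

```python
import json
import urllib.request

def build_request(call: dict) -> urllib.request.Request:
    """Turn a `call` dict (url, json, headers) into an HTTP POST request."""
    return urllib.request.Request(
        call["url"],
        data=json.dumps(call["json"]).encode("utf-8"),
        headers={**call["headers"], "Content-Type": "application/json"},
    )

def api_call(call: dict) -> dict:
    """Single request -> parsed JSON response."""
    with urllib.request.urlopen(build_request(call), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))

def batch_api_call(calls: list) -> list:
    """[call, call, ...] -> [json, json, ...], issued sequentially."""
    return [api_call(c) for c in calls]
```

Usage would look like api_call({"url": "https://example.com/v1/generate", "json": {"prompt": "..."}, "headers": {"Authorization": "Bearer KEY"}}), with each endpoint handler filling in its own url, payload and auth header.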
Things TODO in api_handler:
NOTE: I keep triggering the following error when tabbing into the web client, and sometimes when switching between tabs in the left menu.
I don't know if it is a bug I've introduced, or if it comes from elsewhere. It feels like the latter, given that it triggers even without a model chosen, but idk.
EDIT: This seems to have been resolved
GooseAI apparently revoked the testing credit, so I will no longer be able to test their backend and will have to assume it works. Without testing credit, it will move towards a best-effort basis (keep in mind their site is not compatible with the modern OpenAI standard, to my knowledge).
I tried OpenRouter and this worked, but the presence penalty / frequency penalty are shown on the loading screen, and that's not conformant with our standard. We should map the frequency penalty to the repetition penalty setting inside KoboldAI so that it can be modified without having to reload the model.
Repetition penalty is now properly tied to the setting inside the KAI front end.
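In practice, tying it to the front end amounts to reading the live settings object when building each request, rather than baking the value in at load time. A hypothetical sketch (build_payload and the payload key names are invented for illustration, not the verified OpenRouter schema):

```python
from types import SimpleNamespace

def build_payload(prompt: str, gen_settings) -> dict:
    """Build the request body from KAI's live settings object, so a slider
    change takes effect on the next request, with no model reload needed.
    (Key names here are illustrative assumptions.)"""
    return {
        "prompt": prompt,
        "temperature": gen_settings.temp,
        "repetition_penalty": gen_settings.rep_pen,  # tied to the KAI slider
    }

# Stand-in for KAI's settings object:
settings = SimpleNamespace(temp=0.7, rep_pen=1.1)
print(build_payload("Once upon a time", settings)["repetition_penalty"])  # 1.1
```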
Removing presence penalty & frequency penalty from the loading screen would require adding them as new items in gensettings.py (and maybe elsewhere?).
Do we want to add them there?
One potential downside is we'll have a whole bunch of settings with different names that aren't actually used in a lot of models. Ideally, we'd "ask" the backend which settings it actually uses/are implemented, and only present those to the user.
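One way the "ask the backend" idea could look, as a hypothetical sketch (the setting keys, handler classes and the supported_settings attribute are all invented for illustration):

```python
# Hypothetical master list of sampler settings the UI knows how to render.
ALL_SETTINGS = {
    "rep_pen": {"label": "Repetition Penalty", "default": 1.1},
    "presence_penalty": {"label": "Presence Penalty", "default": 0.0},
    "frequency_penalty": {"label": "Frequency Penalty", "default": 0.0},
}

class OpenRouterHandler:
    # Each backend declares which settings it actually implements.
    supported_settings = {"rep_pen", "presence_penalty", "frequency_penalty"}

class GooseAIHandler:
    supported_settings = {"rep_pen"}

def visible_settings(handler) -> dict:
    """Only present the settings this backend says it supports."""
    return {k: v for k, v in ALL_SETTINGS.items()
            if k in handler.supported_settings}

print(sorted(visible_settings(GooseAIHandler())))  # ['rep_pen']
```

This would keep the settings list per-backend instead of showing every model a pile of knobs that do nothing.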
Added them and it seems to work. Easy to undo if we want.
The code should not affect anything outside its scope, but should definitely be glanced at by someone more experienced than me.