fastfinge opened this issue 11 months ago
Indeed, this is still missing. The add-on is still in alpha, and I was planning to document this once NVDA 2024.1 compatibility is in place. For now, the build process is quite makeshift, and I hope to improve it. Feel free to suggest your modifications; they are welcome! :)
OK, I'll wait before submitting pull requests, because without an official build process, I'm not sure my code will build on any other system at the moment.
The good news is that the Python upgrade is quite easy: just upgrade the libraries and modify three lines in MainDialogue.py to cast a couple of values to int.
As for openrouter.ai, all you have to do is make the base URL configurable; https://openrouter.ai/api/v1 is a drop-in replacement for the OpenAI API. The advantage is that it offers far more models, including the LLaVA image model (slightly faster and cheaper than GPT-4, and uncensored), several text models that are free, and it's available in countries that can't yet access the OpenAI API. Also, you top up your account with credits via Stripe, and it never automatically withdraws funds from your credit card.
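To illustrate the configurable-base-URL idea, here's a minimal sketch; the function and setting names are hypothetical, not the add-on's actual code. Because OpenRouter mirrors the OpenAI REST surface, only the host portion of each endpoint changes:

```python
# Sketch of a configurable API base URL (names are hypothetical).
OPENAI_BASE_URL = "https://api.openai.com/v1"
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"

def api_url(path: str, use_openrouter: bool = False) -> str:
    """Build a full endpoint URL against whichever provider is configured."""
    base = OPENROUTER_BASE_URL if use_openrouter else OPENAI_BASE_URL
    return f"{base}/{path.lstrip('/')}"
```

With the official `openai` Python package, the same switch is just a constructor argument: `OpenAI(base_url=OPENROUTER_BASE_URL, api_key=key)`.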
Also, in the future, if you want to get really fancy, openrouter.ai supports OAuth. All a user has to do is sign up with openrouter.ai, authorize the NVDA extension, and go! No more copying API keys or anything like that. I've been using it for a couple of months for my projects, and I like it much better than using OpenAI directly.
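For reference, OpenRouter's OAuth flow is a simple code exchange: open an authorization page in the browser, receive a one-time code on a callback URL, and trade it for an API key. The endpoint URLs below reflect my understanding of their docs and should be verified before relying on them; everything else is a hedged sketch:

```python
import json
import urllib.parse
import urllib.request

AUTH_PAGE = "https://openrouter.ai/auth"                 # user signs in here
KEY_EXCHANGE = "https://openrouter.ai/api/v1/auth/keys"  # one-time code -> key

def build_auth_url(callback_url: str) -> str:
    """URL to open in the browser; OpenRouter redirects back with ?code=..."""
    return f"{AUTH_PAGE}?{urllib.parse.urlencode({'callback_url': callback_url})}"

def exchange_code(code: str) -> str:
    """Trade the one-time code for a permanent API key (makes a network call)."""
    req = urllib.request.Request(
        KEY_EXCHANGE,
        data=json.dumps({"code": code}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["key"]
```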
Thank you, that's interesting! Have you had a chance to test Claude 2?
Hello @fastfinge, I've started to integrate OpenRouter while adding compatibility for NVDA 2024.1. You can test by compiling the add-on from the dev branch; dependencies should download on the first launch.
Your comments are welcome! :)
It compiled and installed perfectly on the latest NVDA beta! I was able to ask questions without issue, and all of the OpenRouter models were listed correctly. I was also able to successfully describe images with GPT4 and llava-13b. However, I encountered the following issues:
OpenRouter doesn't seem to support the TTS API. If you try to vocalize the prompt, the add-on crashes, taking NVDA down with it. You might need to disable this function when the OpenRouter checkbox is checked.
When I try to describe an image with nous-hermes-2-vision, I get a 404 error, saying the model could not be found. I'm not sure what's going on here.
It looks like the list of models is out of date. Several models are missing from the list in NVDA. Supported models are here: https://openrouter.ai/models
However, the supported models change quite frequently. You need to call the OpenRouter endpoint to get the latest list, rather than hard-coding it, as it changes two or three times a month. It's at https://openrouter.ai/api/v1/models. If a model's ID contains the string "vision", it can describe images; otherwise it can't.
But this is amazing for something still in the dev branch!
1. OpenRouter doesn't seem to support the TTS API. If you try to vocalize the prompt, the add-on crashes, taking NVDA down with it. You might need to disable this function when the OpenRouter checkbox is checked.
In fact, we should be able to use OpenAI or whisper.cpp for speech even when an OpenRouter key has been provided.
Given that I also intend to integrate Ollama (with the possibility of having as many instances as desired), I will need to rethink a lot of things to make it as flexible as possible.
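One way to sketch that decoupling (all setting names here are hypothetical, not the add-on's real configuration keys): choose the speech backend independently of the chat backend, since OpenRouter exposes no audio endpoints:

```python
OPENAI_BASE_URL = "https://api.openai.com/v1"

def speech_base_url(conf: dict) -> str:
    """Route speech requests to OpenAI or a local whisper.cpp server,
    regardless of whether chat completions go through OpenRouter."""
    if conf.get("local_whisper_url"):      # e.g. a local whisper.cpp server
        return conf["local_whisper_url"]
    return OPENAI_BASE_URL                 # never the OpenRouter base URL
```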
2. When I try to describe an image with nous-hermes-2-vision, I get a 404 error, saying the model could not be found. I'm not sure what's going on here.
It is currently working for me, but the answers are very long and the descriptions rather poor, IMHO. That said, the model is in an alpha state. :)
3. It looks like the list of models is out of date. Several models are missing from the list in NVDA. Supported models are here: https://openrouter.ai/models
However, the supported models change quite frequently. You need to call the OpenRouter endpoint to get the latest list, rather than hard-coding it, as it changes two or three times a month. It's at https://openrouter.ai/api/v1/models. If a model's ID contains the string "vision", it can describe images; otherwise it can't.
You're right. I had started doing it this way because it allowed the model descriptions to be translated, but it adds a significant number of strings to maintain, and the models change too often; it's not sustainable. I've just changed that on the dev branch. You can now get a model's description simply by pressing the spacebar while focusing its item in the list. I'll add prices and the other available info.
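The dynamic model list can be sketched like this. The "vision"-in-ID heuristic comes straight from the comment above; the response shape (`{"data": [{"id": ...}, ...]}`) is assumed here and should be checked against the live endpoint:

```python
import json
import urllib.request

MODELS_URL = "https://openrouter.ai/api/v1/models"

def fetch_models() -> list:
    """Download the current model list instead of hard-coding it."""
    with urllib.request.urlopen(MODELS_URL) as resp:
        return json.load(resp)["data"]

def split_models(models: list) -> tuple:
    """Separate vision-capable models using the 'vision' ID heuristic."""
    vision = [m["id"] for m in models if "vision" in m["id"]]
    text_only = [m["id"] for m in models if "vision" not in m["id"]]
    return vision, text_only
```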
4. When I click to see my API usage, I'm still taken to the OpenAI website. There's a similar page at openrouter.ai that this should link to instead. But if you wanted to be really slick, you could display it directly in the add-on: https://openrouter.ai/api/v1/auth/key
Oh, that's great! I'll see if MistralAI and OpenAI allow it too.
Thanks!
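A minimal sketch of reading that key-status endpoint directly from the add-on. The Bearer-auth header is standard OpenAI-style auth; the response field names (`usage`, `limit`, a `data` wrapper) are assumptions to verify against the live endpoint:

```python
import json
import urllib.request

KEY_STATUS_URL = "https://openrouter.ai/api/v1/auth/key"

def fetch_key_status(api_key: str) -> dict:
    """Return the usage/limit info OpenRouter reports for this key (network call)."""
    req = urllib.request.Request(
        KEY_STATUS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("data", {})

def format_usage(status: dict) -> str:
    """Render a short message suitable for a dialog (field names assumed)."""
    return f"Usage: {status.get('usage', 0)} / limit: {status.get('limit', 'none')}"
```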
Hi:
I was interested in getting a version of this working with the latest NVDA alphas, to see how hard the Python upgrade would be. It turns out it's not that hard. However, I don't understand the build process. There's no requirements.txt, and when I run scons, it generates a non-working add-on because the libraries aren't included at build time. I got it working by having pip install the required libraries into /lib, but that can't be the intended process. I'd be happy to fork this add-on and release my version that works with the NVDA alphas and also supports openrouter.ai, but the only build process I can figure out is a terrible mess.
If you're curious, the only changes I needed to make were a couple in the main dialogue (because Python 3.11 doesn't cast things to int automatically), upgrading several dependencies, and adding a couple of libraries (numpy, secrets, etc.).
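The kind of fix involved can be sketched generically (I'm not reproducing the actual wx call sites from MainDialogue.py): division always produces a float in Python 3, and newer Python/wxPython builds raise TypeError instead of silently truncating when an API requires an int, so an explicit cast is needed.

```python
# "/" always yields a float in Python 3, even when the result is whole.
width = 800
column = width / 3        # 266.666... -> TypeError in int-only GUI APIs
column = int(width / 3)   # explicit cast fixes it
assert column == 266
```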