EQ-bench / EQ-Bench

A benchmark for emotional intelligence in large language models
MIT License

+install windows #15

Open CrispStrobe opened 6 months ago

CrispStrobe commented 6 months ago

openai api (llamafile, ollama, etc.): works well (supporting only that would need far fewer Python libraries, though)

ooba: pexpect/threading issue in ooba.py (there would be several ways to solve this, maybe with popen_spawn.PopenSpawn?)
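A minimal sketch of the PopenSpawn route (not the actual ooba.py fix): pexpect's default pty-based spawn is POSIX-only, so on Windows one option is pexpect.popen_spawn.PopenSpawn, which wraps subprocess.Popen instead and supports the same expect interface. The command and startup line below are placeholders, not ooba's real entry point.

```python
# Sketch: PopenSpawn works where pty-based pexpect.spawn does not (Windows).
# The spawned command here is a stand-in for launching the ooba server.
import sys
from pexpect import popen_spawn

child = popen_spawn.PopenSpawn(
    [sys.executable, "-c", "print('server ready')"], encoding="utf-8"
)
child.expect("server ready")  # block until the startup line appears, as with spawn()
```

The trade-off is that PopenSpawn has no terminal, so programs that only emit prompts when attached to a tty may behave differently.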

transformers: several issues atm, esp. FlashAttention-2; triton is only supported on Python 3.10 under Windows
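One way to encode such per-platform constraints declaratively is PEP 508 environment markers in a requirements file. This is only an illustrative sketch (package names and markers are assumptions, not the repo's actual requirements):

```
# illustrative requirements.txt fragment using PEP 508 environment markers
flash-attn; sys_platform != "win32"                            # no Windows wheels
triton; sys_platform != "win32" or python_version == "3.10"    # Windows: 3.10 only
```

pip evaluates the marker at install time and simply skips the requirement when it is false, which avoids maintaining a separate Windows requirements file.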

CrispStrobe commented 6 months ago

also added support for poe.com -- which works, but the async IO is tricky; I used a temporary workaround with waiting/logging (could certainly be improved, but I'm short of time atm)
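A waiting/logging workaround of the kind described might look like the sketch below: the async streaming response is collected inside asyncio.run() so the benchmark's synchronous loop can call it, with a logged wait-and-retry fallback. Everything here (the stream stand-in, retry counts) is hypothetical, not the actual poe.com integration.

```python
# Sketch: bridge an async streaming API into a synchronous caller,
# retrying with a short wait on failure. fake_stream() stands in for
# the real async poe.com response stream.
import asyncio
import time

async def fake_stream():
    for chunk in ("Hello", ", ", "world"):
        await asyncio.sleep(0)  # yield control, as a network stream would
        yield chunk

async def collect(stream):
    # Drain the async generator into one string.
    return "".join([chunk async for chunk in stream])

def query_sync(retries=3, delay=1.0):
    for attempt in range(retries):
        try:
            return asyncio.run(collect(fake_stream()))
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}; waiting before retry")
            time.sleep(delay)
    raise RuntimeError("all retries failed")
```

asyncio.run() spins up and tears down an event loop per call, which is wasteful but keeps the calling code entirely synchronous.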

sam-paech commented 6 months ago

Thanks for your work on this. I'd like to get windows support to a point where it all "just works". Maybe the transformers dependency issues can be solved by just having 3.10 as a requirement on windows.

I'm in the middle of attempting to replace pexpect with subprocess, because I think it's causing the issue with ooba hanging after ~30 queries. So that might help with windows compatibility.
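A common shape for that replacement (a sketch under the assumption that the hang comes from a blocking read, not the project's actual code) is a background thread that drains the child's stdout into a queue, so the main thread can poll with a timeout instead of blocking forever:

```python
# Sketch: subprocess-based expect() with a timeout. A daemon reader
# thread pumps stdout lines into a queue; the main thread polls the
# queue and can give up instead of hanging on a blocked read.
import queue
import subprocess
import sys
import threading
import time

def _pump(stream, q):
    for line in stream:
        q.put(line)

def spawn(cmd):
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()
    threading.Thread(target=_pump, args=(proc.stdout, lines), daemon=True).start()
    return proc, lines

def expect(lines, pattern, timeout=30):
    """Wait until a stdout line contains `pattern`, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            line = lines.get(timeout=0.5)
        except queue.Empty:
            continue
        if pattern in line:
            return line
    raise TimeoutError(pattern)

proc, lines = spawn([sys.executable, "-c", "print('READY')"])
print(expect(lines, "READY"))  # the matched startup line
```

Because subprocess ships with the standard library and has no pty dependency, this also sidesteps the Windows incompatibility of pexpect.spawn.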

Re: adding poe.com support -- I'm a bit wary of adding too many inference engines / API integrations beyond the most common ones, because each one incurs maintenance debt. I'm not really familiar with poe.com. Are there models there that you can't get elsewhere? Do a lot of people use it?

CrispStrobe commented 6 months ago

I quite understand this, yes. It was mostly for experimenting. The nice thing about poe.com, besides ease of use, is the cost factor ;) Poe offers Claude-2 100k, Mixtral via Groq, and Mistral-large, among others

sam-paech commented 6 months ago

Oh, it would be handy to have API access to Claude at least. OK, I'm convinced.

CrispStrobe commented 6 months ago

For Claude-2 you have to pay, though -- but atm I have it.

CrispStrobe commented 6 months ago

btw, llama.cpp also uses subprocess here: https://github.com/ggerganov/llama.cpp/blob/67be2ce1015d070b3b2cd488bcb041eefb61de72/examples/server/tests/features/steps/steps.py#L967