stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

Import models and scenarios from generative search engine verifiability paper #1514

Closed (yifanmai closed this 7 months ago)

yifanmai commented 1 year ago

We should import over the relevant models and scenarios from https://github.com/nelson-liu/evaluating-verifiability-in-generative-search-engines

cc @nelson-liu

JosselinSomervilleRoberts commented 1 year ago

I looked at the paper and it seems that the models are:

Do we have API keys for those models? EDIT: Never mind, those models don't seem to have APIs.

JosselinSomervilleRoberts commented 1 year ago

For Bing Chat, there seems to be no official repo or API. I found this repo, though: https://github.com/acheong08/EdgeGPT/tree/master/docs

JosselinSomervilleRoberts commented 1 year ago

For NeevaAI, I did not find anything, and they announced that they have no API: https://help.neeva.com/hc/en-us/articles/4404882342803-Developer-API

JosselinSomervilleRoberts commented 1 year ago

For YouChat, I found this: https://github.com/ettiebot/chatapi/tree/main

yifanmai commented 7 months ago

Not planned.