-
I am trying to run the code below from @tomaarsen's Hugging Face blog on Sentence Transformers v3.
### Code:
```python
from datasets import load_dataset
data = load_dataset('sentence-transformers…
-
Hi,
Are you planning to make textgrad LLM calls asynchronous?
I tried to start adding asynchronous methods to make at least the evaluation calls and inference (everything that is forward) asynchrono…
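A minimal sketch of what such an async evaluation path could look like, assuming a hypothetical `call_llm` coroutine (the real textgrad forward/evaluation API may differ):

```python
import asyncio

async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM API call; a real version would await
    # an async HTTP client instead of sleeping.
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def evaluate_batch(prompts: list[str]) -> list[str]:
    # Fire all forward/evaluation calls concurrently instead of serially,
    # preserving input order in the returned list.
    return await asyncio.gather(*(call_llm(p) for p in prompts))

results = asyncio.run(evaluate_batch(["a", "b", "c"]))
print(results)
```

With real network-bound calls, `asyncio.gather` lets the evaluation loop overlap waiting time across requests rather than paying each round-trip sequentially.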
-
I have some questions about the submission format. Do we need to include the conversation template of the LLM in the submission? For example, should the submission be formatted like
```
[INST] How t…
-
[X] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
ValueError: Unknown format code 'f' for object of type 'str'
…
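For context, this `ValueError` is the generic Python failure mode when an `'f'` format spec is applied to a string. A minimal reproduction, independent of ragas internals (the `score` variable is illustrative):

```python
score = "0.95"  # a metric value that came back as a string instead of a float

try:
    formatted = f"{score:.2f}"  # ':.2f' is only valid for numeric types
except ValueError as e:
    print(e)  # Unknown format code 'f' for object of type 'str'

# Casting to float before formatting avoids the error:
print(f"{float(score):.2f}")
```

So the bug usually means a metric or score is being produced as `str` somewhere upstream of the formatting call.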
-
I'm using promptfoo for a customized LLM. I would like to use the options in its web server (using the New Eval tab to run the evaluation).
Is there a way to add my customized provider to …
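For what it's worth, promptfoo can reference a custom provider from a local file in `promptfooconfig.yaml`; a hedged sketch (the file name and test values here are illustrative, and whether the web UI's New Eval tab picks up file-based providers may depend on the promptfoo version — check the current provider docs):

```yaml
# promptfooconfig.yaml (sketch)
providers:
  - file://my_provider.py   # custom provider file implementing the LLM call
prompts:
  - "Summarize: {{input}}"
tests:
  - vars:
      input: "hello world"
```

The referenced Python file would expose the provider entry point promptfoo expects (a `call_api`-style function returning the model output), so the same config drives both the CLI and the web viewer.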
-
To reproduce:
* Create a test set without a `correct_answer` column
* Run an evaluation
The evaluation will fail and the result will not be shown.
Expected behavior:
The evaluation will fail, opening t…
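A tiny pre-flight check along these lines could fail fast with a readable message instead of an opaque evaluation failure (the column names and function here are illustrative, not the tool's actual API):

```python
REQUIRED_COLUMNS = {"question", "correct_answer"}  # illustrative requirement

def validate_test_set(rows: list[dict]) -> None:
    # Fail fast with an explicit message if a required column is missing.
    present = set(rows[0].keys()) if rows else set()
    missing = REQUIRED_COLUMNS - present
    if missing:
        raise ValueError(f"test set is missing required columns: {sorted(missing)}")

# A well-formed row passes silently:
validate_test_set([{"question": "2+2?", "correct_answer": "4"}])
```

Running such a check before the evaluation starts would surface the missing `correct_answer` column up front rather than mid-run.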
-
### Project description
This project aims to collect a dataset of production COBOL and associated mainframe languages (JCL, REXX, PL/I) on which Large Language Models (LLMs) can be fine-tuned. It a…
-
Hi, this is a really good and useful codebase. I tried to reproduce the results reported in the paper but failed. I used the code in `README_ESE.md`:
```
WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0…
-
- [ ] [WisdomShell/kieval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models](https://github.com/WisdomShell/kieval)
# WisdomShell/kieval: A Knowledge-grounded Interacti…
-
# Proposed Feature
Add an efficient interface for computing generation probabilities on fixed prompt and completion pairs. For example:
```python
# ... load LLM or engine
prompt_completion_pairs = [
…
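# A hedged sketch of the scoring step itself, independent of any particular
# engine: under teacher forcing, the log-probability of a fixed completion is
# the sum of the per-token next-token log-probs. The helper and toy
# distributions below are illustrative, not an existing API; a real engine
# would read these values off the model's logits.
import math

def completion_logprob(token_logprobs: list[dict[str, float]],
                       completion_tokens: list[str]) -> float:
    # token_logprobs[i] maps candidate tokens to log P(token | prompt, tokens[:i]).
    return sum(step[tok] for step, tok in zip(token_logprobs, completion_tokens))

# Toy per-position distributions for a two-token completion:
steps = [{"Hello": math.log(0.5), "Hi": math.log(0.5)},
         {"!": math.log(0.25), ".": math.log(0.75)}]
print(completion_logprob(steps, ["Hello", "!"]))  # log(0.5) + log(0.25)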