-
### Question
I notice that eval code for ScienceQA only support single turn QA, but I want to evaluate on Multi-turn conversation task.
How can I get multi-turn response in evaluation stage?
-
Filing this ticket in support of the LLM-based 'publication filtering' tool presented by @Rosinaweber at the June 2024 Relay.
In the example below, four pubs are shown as providing direct support …
-
# URL
- https://arxiv.org/abs/2308.11432
# Affiliations
- Lei Wang, N/A
- Chen Ma, N/A
- Xueyang Feng, N/A
- Zeyu Zhang, N/A
- Hao Yang, N/A
- Jingsen Zhang, N/A
- Zhiyuan Chen, N/A
- …
-
[From rasbt post](https://x.com/rasbt/status/1754516687896887449?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ) - Flan T5 is a great go to model for text classification.
Tiny titans - Can smaller LL…
-
This issue is for the notification of papers which will be added to this repo in the future
-
Hi, I fine-tuned a model (yam-peleg/Experiment26-7B) using unsloth. Then during inference, model correctness drops when using unsloath FastLanguageModel. I see some modules are replaced. It looks a li…
-
Hi!
I am trying to reimplement the fine-tune stage of MolCA and running the code:
`python stage2.py --root 'data/PubChem324kV2/' --devices '0,1' --filename "ft_pubchem324k" --stage2_path "all_checkp…
-
Hello, I was reading this aper recently and have questions about the paper.
May I ask if medrag has selected the corpus + search+LLM method to evaluate mirrag’s data set?
-
in the article it says that gpt-3.5-turbo is used to measure the rejection rate. what explains this difference in results for chatGPT given that it is used as a reference?
![image](https://github.co…
-
I think there are a few issues being conflated here and it would be helpful to disentangle them:
We support:
- launching with `accelerate launch`, which is only meant to support Data-parallel …