-
## TODO
**1st iteration**
- [x] Dump the assessments into the `evaluation.csv` every time a task is executed
**2nd iteration**
- [x] Create the other CSVs from the `evaluation.csv`
- read…
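
A minimal sketch of the first-iteration item above, appending assessments to `evaluation.csv` after each task. The column names and the `append_assessments` helper are hypothetical; the real schema is not given here.

```python
import csv
from pathlib import Path

# Hypothetical columns; the actual layout of evaluation.csv is an assumption.
FIELDNAMES = ["model", "task", "metric", "value"]


def append_assessments(csv_path, rows):
    """Append assessment rows, writing the header only for a new or empty file."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Opening the file in append mode keeps earlier results intact, so an interrupted run loses at most the task that was in progress.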
-
### Model ID
ai21labs/Jamba-v0.1
### Model type
Decoder model (e.g., GPT)
### Model languages
- [x] Danish
- [x] Swedish
- [x] Norwegian (Bokmål or Nynorsk)
- [x] Icelandic
- [x] Faroese
- [x] Ge…
-
I sincerely want to evaluate your model. How can I run it more simply?
-
If we run them on multiple machines, we want to easily combine them.
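
One possible way to combine results from several machines, assuming each machine writes a CSV with the same header. The `combine_evaluations` helper is hypothetical and only illustrates the idea.

```python
import csv


def combine_evaluations(input_paths, output_path):
    """Concatenate CSVs that share a header, dropping exact duplicate rows.

    Returns the number of data rows written.
    """
    header = None
    seen = set()
    merged = []
    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"header mismatch in {path}")
            for row in reader:
                key = tuple(row)
                if key not in seen:
                    seen.add(key)
                    merged.append(row)
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(merged)
    return len(merged)
```

Dropping exact duplicates is a design choice made for this sketch: it makes the merge safe to repeat, but it would also collapse two genuinely identical runs into one row.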
### TODO
**1st iteration**
- [x] Remove the `models-summed.csv` and `-summed.csv` files
- [x] Remove all the evaluation CS…
-
I am running a llama2 model on the wikitext dataset. I just want to try some other metrics, so I modified the default YAML file (`lm-evaluation-harness/lm_eval/tasks/wikitext/wikitext.yaml`) to the following, just…
-
Can you add the code for reproducing the main results in the paper for various math and coding datasets, along with their prompts and the data splits used?
-
### Model ID
state-spaces/mamba-2.8b-hf
### Model type
State space model (e.g., Mamba)
### Model languages
- [x] Danish
- [x] Swedish
- [x] Norwegian (Bokmål or Nynorsk)
- [x] Icelan…
-
I am training PaddleOCR for the Tamil language, with 70 images for training and 24 images for evaluation.
After the 1st epoch, the model takes a long time (more than 5 hrs) on Evaluat…
-
### Latest Code:
[model evaluation](https://github.com/artc-dsc/AI-FusionCast-Analysis/blob/dev/scripts/subprocess_model_evaluation.py)
[model prediction](https://github.com/artc-dsc/AI-FusionCast-…
-
Hi there! Thanks for publicizing such an awesome project!
I would like to ask how I can get model evaluation results similar to those shown in the paper, because it seems like only th…