-
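Which tool was used to measure the TinyLlama performance reported in Table 7 and Table 8 of the paper? I evaluated with opencompass, using the model [TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T), and the measured MMLU, CEVAL, CMMLU…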
-
Hello! Thanks for making your work and models available, alongside steps to reproduce the results.
I am trying to **generate the predictions on the MathQA dataset**, using the trained **monolingual…
-
# How well does Formulasearch work?
- [ ] https://github.com/MaRDI4NFDI/portal-compose/issues/234
- [x] https://github.com/MaRDI4NFDI/portal-compose/issues/228
# How is the ranking of search re…
-
I'm using the processed data you provided to reproduce the results for MathQA.
Following the instructions, I replaced the vocab.txt in the bert-base-uncased folder and ran train_ft_monolingual-en.sh. Howe…
-
Hi,
I am currently stuck at the [retriever inference part](https://github.com/czyssrs/ConvFinQA#inference). I got some errors because it looks for the `gold_ind` key in the test_turn.json data file but …
-
as in the title.
-
# URL
- https://arxiv.org/abs/2401.08967
# Affiliations
- Trung Quoc Luong, N/A
- Xinbo Zhang, N/A
- Zhanming Jie, N/A
- Peng Sun, N/A
- Xiaoran Jin, N/A
- Hang Li, N/A
# Abstract
- One …
-
- https://arxiv.org/abs/2108.07732
- 2021
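This paper explores the limits of the current generation of large language models for program synthesis in general-purpose programming languages.
We evaluate a collection of models with between 244M and 137B parameters on two new benchmarks, MBPP and MathQA-Python, in both the few-shot and fine-tuning regimes.
These bench…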
-
Whenever I run your command from CMD, whether for MathQA or AlgoLisp, it always prints this line at the end and then nothing happens. I hoped it would continue after a couple of hours, but it didn'…