-
Hello, I created a test script for DistilBERT inference with the Wanda sparsifier, which I was testing on an AArch64 platform:
```
import torch
from transformers import BertForSequenceClassificatio…
-
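Hello, I followed the default hyperparameters in the script (learning rate), along with the parameter settings and preference data described in the paper, and ran CPO on ALMA-7B-Lora. However, the trained model's output frequently repeats the preceding text or does not translate at all, as shown in the image below (zh->en; raw_res is the result without applying the clean function from utils). Is there a hyperparameter I have set incorrectly? Thank you.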
![image](https://github.com/user-attachments/asse…
-
**Describe the Feature**
Could you provide the human assessment data collected for benchmarking RAGAS metrics against human evaluations in your [paper](https://arxiv.org/pdf/2309.15217)?
**Why …
-
Great work. Congrats.
In Fig. 7 of the BAMM paper, the blue meshes correspond to the blue text, and the red meshes to the red text.
In Table 5 of the paper "Table 5: Evaluation on tempora…
-
If any MathML in the Math is invalid then the Printer will fail to print out the model, instead producing an empty string. Is it possible for the Printer to include the content of the Math element wit…
-
**Describe the bug**
The IR benchmark for [DecoderMux](https://github.com/google/xls/blob/cd901120adfa5355b5ea40adcd97bfc61af2f3c7/xls/modules/zstd/dec_mux.x) proc is prone to a random failure after i…
-
Hi,
I trained the model in Keras but want to use it in C++ for evaluation. Do you have a C++ evaluation file for this model?
It would be very helpful for me if you can share any C++ code regarding…
-
**Description**
Implement a Random Forest model to predict sales using the cleaned sales dataset.
**Tasks**
**Data Preparation**
Load and preprocess the sales dataset.
Handle missing values, …
-
# URL
- https://arxiv.org/pdf/2408.02666
# Affiliations
- Tianlu Wang, N/A
- Ilia Kulikov, N/A
- Olga Golovneva, N/A
- Ping Yu, N/A
- Weizhe Yuan, N/A
- Jane Dwivedi-Yu, N/A
- Richard Yu…
-
### Software environment
```Markdown
- paddlepaddle-gpu: 0.0.0.post120
- paddlenlp: 3.0.0b2
```
### Duplicate issues
- [X] I have searched the existing issues
### Error description
```Markdown
When running prefix-tuning fine-tuning of llama3-8b, enabling tp (tensor parallelism) raises the error Valu…