hurun opened this issue 1 year ago
I tried to find out why the accuracy of the FT model is 0. Comparing the outputs of the HF and FT models, I found that the FT model always predicts token id 90610 as the target result, even though the other inputs, such as input_ids and infer_params, are identical.
I also ran the same steps on another NVIDIA GPU (GTX 2060), and there the results are correct.
{
"model_answer": "-\u00e0-vis",
"output_ids": [
90610
],
"metrics": {
"acc": 0.0
}
}
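To confirm what id 90610 maps to, here is a quick tokenizer sanity check (my own check, not part of the FT tooling; it assumes the bigscience/bloom-560m tokenizer from Hugging Face transformers):

```bash
# Decode token id 90610 with the bloom-560m tokenizer; the output should
# match the "model_answer" string ("-à-vis") in the JSON above.
python -c "from transformers import AutoTokenizer; \
tok = AutoTokenizer.from_pretrained('bigscience/bloom-560m'); \
print(repr(tok.decode([90610])))"
```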
FasterTransformer development has transitioned to TensorRT-LLM; please try that instead. TensorRT-LLM officially supports BLOOM.
Branch/Tag/Commit: main
Docker Image Version: nvcr.io/nvidia/pytorch:22.09-py3
GPU name: V100-32G
CUDA Driver: 11.0
Reproduced Steps
Step 1: pull the image with docker and start the container
Step 2: get the FasterTransformer project from GitHub and build it
Step 3: install the Python packages
Step 4: get the model and data (bloom-560m and the LAMBADA dataset)
Step 5: convert the model with huggingface_bloom_convert.py
Step 6: test the torch model and the FasterTransformer model converted in step 5
Step 7: results are shown below (a command-level sketch of steps 1-6 follows the results)
HF benchmark result:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 645/645 [12:57<00:00, 1.21s/it]
Accuracy: 39.6274% (2042/5153) (elapsed time: 771.5046 sec)
FT benchmark result:
[FT][INFO] Device Tesla V100-SXM2-32GB, Accuracy: 0.0000%
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5153/5153 [01:57<00:00, 43.92it/s]
Accuracy: 0.0000% (0/5153) (elapsed time: 109.4225 sec)
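For completeness, here is a rough command-level sketch of steps 1-6 above. The build flags follow the FasterTransformer README, but the exact script paths, script flags, and the LAMBADA download URL are assumptions and may need to be adjusted to match the BLOOM example in the repository:

```bash
# Step 1: pull the image and start the container
docker run --gpus all -it --rm -v "$PWD":/workspace nvcr.io/nvidia/pytorch:22.09-py3

# Step 2: clone and build FasterTransformer (SM=70 targets V100)
git clone https://github.com/NVIDIA/FasterTransformer.git
cd FasterTransformer && mkdir -p build && cd build
cmake -DSM=70 -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON ..
make -j"$(nproc)"

# Step 3: install the Python packages used by the GPT/BLOOM examples
# (path assumed; adjust to the requirement file in your checkout)
pip install -r ../examples/pytorch/gpt/requirement.txt

# Step 4: get the model and data (bloom-560m and the LAMBADA test set)
git lfs install
git clone https://huggingface.co/bigscience/bloom-560m
wget https://github.com/cybertronai/bflm/raw/master/lambada_test.jsonl

# Step 5: convert the Hugging Face checkpoint to FT format
# (flag names assumed: input dir, output dir, tensor parallel size, data type)
python ../examples/pytorch/gpt/utils/huggingface_bloom_convert.py \
    -i bloom-560m -o bloom-560m-ft -tp 1 -dt fp32

# Step 6: run the LAMBADA evaluation on the converted FT model
# (script name and flags assumed; see the BLOOM example under examples/pytorch/gpt)
python ../examples/pytorch/gpt/bloom_lambada.py \
    --checkpoint-path bloom-560m-ft/1-gpu \
    --tokenizer-path bloom-560m \
    --dataset-path lambada_test.jsonl
```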