James-QiuHaoran / LLM-serving-with-proxy-models

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny model can tell you the verbosity of an LLM (with low latency!)

task_type 0 seems not to be working? #2

Closed — saeid93 closed this issue 6 months ago

saeid93 commented 6 months ago

Running latency_prediction.py with task type 0 via the following command does not seem to work:

python latency_prediction.py --task_type 0

and results in the following error:

output-token-len-prediction git:(main) ✗ python latency_prediction.py --task_type 0
Loaded dataset from data/lmsys_first_round_data_vicuna_1000K
427407
Start training...
  0%|                                                                                                                          | 0/6 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/cc/LLM-serving-with-proxy-models/output-token-len-prediction/latency_prediction.py", line 519, in <module>
    train(model, 
    ^^^^^^^^^^^^
  File "/home/cc/LLM-serving-with-proxy-models/output-token-len-prediction/latency_prediction.py", line 182, in train
    labels = batch['num_tokens'].to(device)
             ~~~~~^^^^^^^^^^^^^^
  File "/home/cc/miniconda3/envs/proxy-model/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 254, in __getitem__
    return self.data[item]
           ~~~~~~~~~^^^^^^
KeyError: 'num_tokens'

Might this be due to the different way you handle num_tokens and labels across the token length prediction tasks?
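
For context, here is a minimal sketch (not the repository's actual code) of why this kind of KeyError appears: a transformers BatchEncoding only contains the keys that the tokenizer or collate step put into it, so reading batch['num_tokens'] fails unless the label tensor was attached under exactly that name. The model name, prompts, and label values below are hypothetical, chosen only to illustrate the failure mode.

```python
import torch
from transformers import AutoTokenizer

# Hypothetical setup: a small encoder tokenizer and two example prompts.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
prompts = ["How do I sort a list in Python?", "Explain transformers briefly."]
output_lengths = torch.tensor([57, 112])  # hypothetical ground-truth output token counts

batch = tokenizer(prompts, padding=True, return_tensors="pt")
# At this point batch only has 'input_ids', 'token_type_ids', and
# 'attention_mask', so batch['num_tokens'] would raise KeyError: 'num_tokens'.

# Attaching the labels under the key the training loop expects avoids the error.
batch["num_tokens"] = output_lengths
labels = batch["num_tokens"]  # now resolves without a KeyError
print(labels)
```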

James-QiuHaoran commented 6 months ago

Thanks! It should have been fixed in the recent commit: https://github.com/James-QiuHaoran/LLM-serving-with-proxy-models/commit/5692cbff352a7bd580f92576cc9cac7c9d95bd20

saeid93 commented 6 months ago

Awesome! Thank you for your quick response.