mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

tinychat.serve.model_worker_new.py AWQ model in training mode #186

Open NigelNelson opened 4 months ago

NigelNelson commented 4 months ago

When running `tinychat.serve.model_worker_new`, I get the following output:

```
2024-05-14 14:10:44 | WARNING | root | Caution: Your LLM is currently in training mode, ensuring accurate gradient computation. Please be vigilant, particularly regarding BatchNorm and Dropout operations.
2024-05-14 14:10:44 | WARNING | root | Caution: Your LLM is currently in training mode, ensuring accurate gradient computation. Please be vigilant, particularly regarding BatchNorm and Dropout operations.
...
```

(the same warning is repeated many more times)

Looks like `model.eval()` needs to be called somewhere after the model is loaded...
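
A minimal sketch of the workaround, assuming you can get at the model object after it is constructed. `load_awq_model` below is a hypothetical stand-in for whatever loading path `model_worker_new` actually uses, not the repo's real API:

```python
import torch

# Hypothetical loader standing in for however the worker builds the
# quantized AWQ model; the actual call in model_worker_new may differ.
model = load_awq_model("path/to/awq-checkpoint")

# Switch to inference mode: this silences the training-mode warning and
# makes Dropout/BatchNorm layers behave deterministically.
model.eval()

# Optionally also disable autograd bookkeeping while serving requests.
with torch.inference_mode():
    ...  # run generation here
```

Calling `.eval()` once, right after the model is built, should be enough; the warning is emitted because the model's modules default to training mode after construction.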