Incomplete Response from 4bit Version of PhoGPT

Hello, I ran some tests on the 4bit and 8bit versions of PhoGPT and ran into an issue with the 4bit version. Details are below:

Environment:
PhoGPT Version: 4bit
Execution Environment: Google Colab with T4 GPU

Issue Description: When using the 4bit version of PhoGPT with the provided initialization code from the documentation, the model returns an incomplete response. Specifically, it only returns a newline character \n, in contrast to the 8bit version, which functions correctly and returns a comprehensive output.

Steps to Reproduce:
1. Initialize the 4bit PhoGPT model using the sample code from the official documentation.
2. Use instruction = "Viết bài văn nghị luận xã hội về an toàn giao thông" (in English: "Write a social argumentative essay about traffic safety").
3. Observe that the response is only a newline character, indicating an incomplete or failed generation.

Expected Behavior: The 4bit version of PhoGPT should return a complete and coherent response similar to the 8bit version, which returns detailed and lengthy outputs.

Actual Behavior: The 4bit version outputs only a newline character \n, indicating an error or issue in processing the input prompt.
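To make the symptom easier to check side by side, here is a minimal sketch of a comparison helper. It is not part of PhoGPT or its documentation: `is_blank_generation` and `compare_variants` are hypothetical names, and each generator is assumed to be any callable that takes the instruction string and returns the decoded model output (for example, a thin wrapper around the 4bit or 8bit model loaded with the official sample code).

```python
from typing import Callable, Dict


def is_blank_generation(text: str) -> bool:
    # True when the output is empty or whitespace only, such as a lone newline.
    return text.strip() == ""


def compare_variants(
    generators: Dict[str, Callable[[str], str]],
    instruction: str,
) -> Dict[str, dict]:
    # Run the same instruction through each variant and summarize the output.
    # `generators` maps a label (e.g. "4bit") to a hypothetical callable that
    # wraps the real model; nothing here loads or calls PhoGPT itself.
    report = {}
    for name, generate in generators.items():
        output = generate(instruction)
        report[name] = {
            "blank": is_blank_generation(output),
            "chars": len(output),
            # repr() makes invisible characters like a newline visible.
            "preview": repr(output[:60]),
        }
    return report


if __name__ == "__main__":
    instruction = "Viết bài văn nghị luận xã hội về an toàn giao thông"
    # Stand-in generators that mimic the behavior reported above.
    stubs = {
        "4bit": lambda prompt: "\n",
        "8bit": lambda prompt: "An toàn giao thông là vấn đề ..." * 10,
    }
    for name, row in compare_variants(stubs, instruction).items():
        print(name, row)
```

Replacing the stubs with wrappers around the real 4bit and 8bit models would show whether the 4bit output is truly whitespace only or whether text is being lost later, for example when the prompt is stripped from the decoded output.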

[Screenshots: 8bit output; 4bit output]

VinAIResearch / PhoGPT, issue #27