showlab / videollm-online

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Apache License 2.0

Load in 4 bit? #4

Closed johnwick123f closed 2 days ago

johnwick123f commented 1 week ago

Is there a way to load this in 4 bit? That would help a lot for users with low vram! Btw, great project!

chenjoya commented 1 week ago

Thank you for your interest! I will take a look at bitsandbytes soon and will update by Wednesday.

chenjoya commented 5 days ago

Sorry, there are some bugs when I use the bitsandbytes quantization_config:

ValueError: weight is on the meta device, we need a `value` to put in on 0.

which may be due to the extra connector layer:

self.connector = torch.nn.Sequential(
    torch.nn.Linear(config.vision_hidden_size, config.hidden_size, bias=True),
    GELUActivation(config.hidden_size),
    torch.nn.Linear(config.hidden_size, config.hidden_size, bias=True),
)

The nn.Sequential may make it impossible to retrieve the weight (I guess?)... I still recommend using a GPU with more memory...

johnwick123f commented 2 days ago

@chenjoya oh ok, thanks anyway. I'll try to use it with a higher-memory GPU!