PB-LLM: Partially Binarized Large Language Models

Details about the code #4

Hon-Chen opened this issue 8 months ago

Hon-Chen commented 8 months ago

Really solid work! I noticed that the proportion of zero-valued parameters is counted after quantization. What is the motivation for this?

# After quantization, print the fraction of exactly-zero entries in each
# parameter tensor, stopping after the first parameter whose name contains "fc2".
quant_sequential(model, dataloader, device)
for n, p in model.named_parameters():
    print(n, torch.mean((p == 0).float()))
    if "fc2" in n:
        break
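
For reference, the check above reports, for each parameter tensor, the fraction of entries that are exactly zero. A minimal standalone illustration (the toy tensor below is made up for demonstration, not taken from the repository):

import torch

# Toy weight tensor with 2 exact zeros out of 6 entries.
w = torch.tensor([[0.0, -1.5, 0.7], [0.0, 2.1, -0.3]])
print(torch.mean((w == 0).float()))  # tensor(0.3333)
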
hahnyuan commented 8 months ago

Thank you for bringing up this observation. The check that counts zero-valued parameters after quantization was initially added for debugging purposes. Upon further evaluation, we have determined that this code segment is no longer necessary and can be safely removed.

Hon-Chen commented 8 months ago

Thanks for your reply; I also noticed that this code is unnecessary. But when I made some modifications to the mask-search method, I found that the proportion of zero values became higher, so I would like to understand the meaning behind this piece of code, i.e., what exactly the debugging purposes you mention are. As I understand it, the binarized weights should no longer contain exact zeros, so are the zeros introduced by the salient weights?
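
In case it helps to pin this down, here is a hedged sketch of how one might attribute the zeros, assuming access to the boolean salient mask chosen during mask search; the function name, the mask variable, and the toy tensors below are illustrative assumptions, not the repository's actual API. Since sign-based binarization maps weights to ±α, the binarized region should contain exact zeros only where the original weight was already exactly zero, so splitting the zero count by region would show whether the zeros sit in the salient (higher-precision) part.

import torch

def zero_fraction_by_region(weight: torch.Tensor, salient_mask: torch.Tensor):
    """Split the exact-zero fraction of a quantized weight into the salient
    (higher-precision) region and the binarized region."""
    zeros = (weight == 0).float()
    return {
        "overall": zeros.mean().item(),
        "salient": zeros[salient_mask].mean().item(),
        "binarized": zeros[~salient_mask].mean().item(),
    }

# Usage with made-up tensors; in practice `weight` would be a quantized layer
# weight and `salient_mask` the mask produced by the mask-search step.
weight = torch.randn(128, 128)
salient_mask = torch.rand_like(weight) < 0.1
print(zero_fraction_by_region(weight, salient_mask))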