williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

Issues with dynamic Quantization #64

Closed · ssskhan closed this 1 year ago

ssskhan commented 1 year ago

Hi. I tried dynamic quantization of a pretrained model, but the output images are of very bad quality. Is this the expected behavior, or did I do something wrong? I just want to reduce memory consumption without limiting the batch size.

Here's my quantization code:

```python
import torch
import torch.nn as nn
import huggingface_hub

from model.vtoonify import VToonify

MODEL_REPO = 'PKUWilliamYang/VToonify'

# Load the pretrained generator weights from the Hugging Face Hub
vtoonify = VToonify(backbone='dualstylegan')
vtoonify.load_state_dict(
    torch.load(
        huggingface_hub.hf_hub_download(
            MODEL_REPO, 'models/vtoonify_d_cartoon/vtoonify_s026_d0.5.pt'),
        map_location=lambda storage, loc: storage)["g_ema"],
    strict=False)
vtoonify.eval()

# Dynamically quantize all nn.Linear layers to int8
quantized_model = torch.quantization.quantize_dynamic(
    vtoonify, {nn.Linear}, dtype=torch.qint8)

# Note: assigning a qconfig after quantize_dynamic has no effect here;
# dynamic quantization takes its dtype directly from the call above.
quantized_model.qconfig = torch.ao.quantization.get_default_qconfig('x86')

# requires_grad() is the helper from the VToonify training code that
# toggles p.requires_grad on all parameters of a module
requires_grad(quantized_model.generator, False)
requires_grad(quantized_model.res, False)

torch.save(
    {
        # "g": g_module.state_dict(),
        # "d": d_module.state_dict(),
        "g_ema": quantized_model.state_dict(),
    },
    './converted_dynamic_vtoonify_s026_d0.5.pt'
)
```
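For reference, the `quantize_dynamic` API can be exercised on a small stand-alone model (a toy network, not VToonify); note that dynamic quantization needs no separate qconfig, since the dtype is passed directly to the call:

```python
# Minimal sketch of dynamic int8 quantization on a toy model.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
model.eval()

# Replace every nn.Linear with a dynamically quantized int8 version
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 16)
with torch.no_grad():
    y = quantized(x)
print(y.shape)  # output shape is unchanged by quantization
```

Weights are stored in int8 and activations are quantized on the fly, so outputs are approximate; whether that approximation is acceptable depends on how sensitive the network is to the quantization error.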

And here is the output image: result

williamyang1991 commented 1 year ago

I haven't tried quantization on VToonify before, so I have no idea about it. I'm afraid I cannot tell you whether it's the expected behavior or whether something is wrong.

ssskhan commented 1 year ago

I want to reduce memory consumption without reducing the batch size. I tried mixed precision, and to make all tensors match I standardized every picture to the same size (e.g. 320×320), so all tensors are the same size and allocate minimal memory, **but it is still taking too much memory**. Can you share some tips on how I can reduce memory consumption?

williamyang1991 commented 1 year ago

I'm not familiar with model compression. I only know there is a technique called model distillation, which trains a small student network to approximate the teacher network.
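The distillation idea mentioned above can be sketched in a few lines; the tiny teacher/student networks below are illustrative stand-ins, not VToonify modules:

```python
# Minimal knowledge-distillation sketch: a small student is trained to
# match a frozen teacher's outputs on the same inputs.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8)).eval()
student = nn.Sequential(nn.Linear(16, 8))  # much smaller than the teacher

for p in teacher.parameters():  # freeze the teacher
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    x = torch.randn(32, 16)
    with torch.no_grad():
        target = teacher(x)               # teacher outputs are the targets
    loss = loss_fn(student(x), target)    # student mimics the teacher
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, only the student is deployed, which is where the memory saving comes from; for an image-to-image model like VToonify the loss would typically also include perceptual terms.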

ssskhan commented 1 year ago

Ok. Thank you.

ssskhan commented 1 year ago

Is a gradual increase in memory the expected behavior? The memory keeps increasing. I started with a 4 GB GPU, then switched to 8 GB, and now 12 GB, but after every picture or two the allocated memory grows by a few hundred MB and keeps growing until all of it is used. How can I get rid of these memory leaks?
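This is not VToonify-specific, but a common cause of steadily growing GPU memory during per-image inference is retaining autograd history or GPU references to earlier outputs. A minimal sketch of the usual mitigations, using a toy model as a stand-in:

```python
# Sketch of a per-image inference loop that avoids the common causes of
# growing GPU memory: autograd graphs and lingering GPU tensors.
import torch
import torch.nn as nn

model = nn.Linear(16, 8).eval()  # stand-in for the real network

inputs = [torch.randn(1, 16) for _ in range(4)]
results = []

with torch.no_grad():              # don't build autograd graphs at inference
    for x in inputs:
        y = model(x)
        results.append(y.cpu())    # keep results on the CPU...
        del y                      # ...and drop the GPU reference

if torch.cuda.is_available():
    torch.cuda.empty_cache()       # return cached blocks to the driver
```

Appending raw GPU outputs (or loss tensors with their graphs) to a Python list is the classic way such "leaks" happen; `torch.cuda.memory_allocated()` is useful for checking whether usage actually plateaus.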

williamyang1991 commented 1 year ago

It's the expected behavior.