Fanghua-Yu / SUPIR

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
http://supir.xpixel.group/

How to mimic `--loading_half_params --use_tile_vae --load_8bit_llava` for `test.py`? #34

Open hadillivvy opened 6 months ago

hadillivvy commented 6 months ago

When running gradio_demo.py, we have the parameters --loading_half_params --use_tile_vae --load_8bit_llava to greatly decrease RAM usage. Are there such options for test.py? If not, can you please work on adding them?

zelenooki87 commented 6 months ago

Yes, I asked about this before. Please update test.py too.

include5636 commented 6 months ago

I encountered `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)` when adding `--loading_half_params --use_tile_vae --load_8bit_llava` directly to test.py. I then restricted the run to a single GPU, but it went OOM. Adding `--no_llava` made it run successfully, though I'm not sure how much quality is lost without LLaVA. Maybe it can serve as a temporary substitute:

```shell
CUDA_VISIBLE_DEVICES=0 python test.py --img_dir <INPUT> --save_dir <RESULT> --SUPIR_sign Q --upscale 4 --loading_half_params --use_tile_vae --load_8bit_llava --no_llava
```
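For anyone wanting to wire these flags into test.py themselves, here is a minimal sketch of the CLI side. The flag names mirror gradio_demo.py; the commented model calls at the bottom follow the demo's pattern but are not verified against the current test.py, so treat them as assumptions.

```python
import argparse

def build_parser():
    # Hypothetical test.py-style parser; flag names copied from gradio_demo.py.
    p = argparse.ArgumentParser(description="SUPIR test.py CLI (sketch)")
    p.add_argument("--img_dir", type=str, required=True)
    p.add_argument("--save_dir", type=str, required=True)
    p.add_argument("--SUPIR_sign", type=str, default="Q", choices=["F", "Q"])
    p.add_argument("--upscale", type=int, default=1)
    p.add_argument("--loading_half_params", action="store_true")
    p.add_argument("--use_tile_vae", action="store_true")
    p.add_argument("--load_8bit_llava", action="store_true")
    p.add_argument("--no_llava", action="store_true")
    return p

# Parse a sample command line matching the workaround above.
args = build_parser().parse_args(
    ["--img_dir", "in", "--save_dir", "out", "--upscale", "4",
     "--loading_half_params", "--use_tile_vae", "--no_llava"]
)
print(args.loading_half_params, args.use_tile_vae, args.no_llava)  # True True True

# After building the SUPIR model, the gradio demo applies the flags roughly like:
# if args.loading_half_params:
#     model = model.half()
# if args.use_tile_vae:
#     model.init_tile_vae(encoder_tile_size=512, decoder_tile_size=64)
```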
hnc01 commented 6 months ago

I believe the `--loading_half_params --use_tile_vae --load_8bit_llava` params were recently added to test.py, because I see them in the online repo but not in my local one. That may mean this request has been addressed; I'll test and report back.

YuenFuiLau commented 6 months ago

> I encountered `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)` when adding `--loading_half_params --use_tile_vae --load_8bit_llava` directly to test.py. I then restricted the run to a single GPU, but it went OOM. Adding `--no_llava` made it run successfully, though I'm not sure how much quality is lost without LLaVA. Maybe it can serve as a temporary substitute:
>
> ```shell
> CUDA_VISIBLE_DEVICES=0 python test.py --img_dir <INPUT> --save_dir <RESULT> --SUPIR_sign Q --upscale 4 --loading_half_params --use_tile_vae --load_8bit_llava --no_llava
> ```

A workaround is to patch these lines so the offending tensors are explicitly moved to the expected device.

In module.py, change line 492 to:

```python
tokens = batch_encoding["input_ids"].cpu().to(self.transformer.device)
```

and add at line 569:

```python
text = text.cpu().to(self.model.positional_embedding.device)
```

In sampling.py, change line 51 to:

```python
sigma = sigma.cpu().to(x.device)
```
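The fix is the same in all three spots: before two tensors meet in one op, move one onto the other's device. A minimal runnable sketch of that pattern (the `align_to` helper name is mine, for illustration; this runs on CPU, but on a multi-GPU box the same call resolves the cuda:0/cuda:1 mismatch):

```python
import torch

def align_to(tensor, reference):
    """Move `tensor` to the device of `reference` before combining them.

    Mirrors the patches above: the RuntimeError fires because two tensors
    in one op live on different devices; an explicit .to(other.device)
    makes them agree.
    """
    return tensor.to(reference.device)

x = torch.randn(4)          # on CPU here; on a GPU box this might be cuda:0
sigma = torch.tensor(0.5)   # imagine this ended up on cuda:1
sigma = align_to(sigma, x)  # now guaranteed to share x's device
out = x * sigma             # no cross-device RuntimeError
print(out.shape)            # torch.Size([4])
```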