huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft

Question about GPU Memory Usage with LoRA in VisionEncoderDecoderModel #1048

Closed · eclickECNU closed this issue 11 months ago

eclickECNU commented 1 year ago

System Info

Who can help?

@BenjaminBossan

Information

Tasks

Reproduction

I have observed that when using LoRA with VisionEncoderDecoderModel, there is no significant change in GPU memory compared to training without LoRA. The code I am running is as follows:

from tqdm import tqdm
from torch.optim import Adam
import torch
from transformers import VisionEncoderDecoderModel, VisionEncoderDecoderConfig, ViTConfig, GPT2Config
from peft import LoraConfig, get_peft_model

# Configuration setup
config_encoder = ViTConfig()
config_decoder = GPT2Config()
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(config_encoder, config_decoder)
config.decoder_start_token_id = 50256
config.pad_token_id = 0

# Model initialization
model = VisionEncoderDecoderModel._from_config(config)
model.cuda()

# LoRA configuration
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.5,
    target_modules=[
        'query',
        'key',
        'value',
        'intermediate.dense',
        'output.dense',
        'wte',
        'wpe',
        'c_attn',
        'c_proj',
        'q_attn',
        'c_fc'
    ],
)
model = get_peft_model(model, lora_config)

# Optimizer setup
params = [p for p in model.parameters() if p.requires_grad]
optimizer = Adam(params, lr=1e-4)

# Training loop
for _ in tqdm(range(200)):
    data = {
        'pixel_values': torch.ones((64, 3, 224, 224)).cuda(),
        'labels': torch.ones((64, 300)).cuda().long(),
        'decoder_attention_mask': torch.ones((64, 300)).cuda().long(),
    }

    output = model(**data)
    loss = output['loss']
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Display GPU memory usage
    allocated_memory = torch.cuda.memory_allocated()
    # torch.cuda.memory_cached() is deprecated in favor of memory_reserved()
    reserved_memory = torch.cuda.memory_reserved()

    print(f'Allocated Memory: {allocated_memory / (1024 ** 3):.4f} GB')
    print(f'Reserved Memory: {reserved_memory / (1024 ** 3):.4f} GB')
    print('=====================')

Expected behavior

In my experiments, after checking the GPU memory using nvidia-smi, I noticed that there is almost no change in the overall GPU memory. How can I address this issue?
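A quick way to confirm that the LoRA wrapping actually froze the base weights is to compare trainable and total parameter counts. A minimal sketch, assuming the `model` object from the script above (already wrapped by `get_peft_model`):

# Sanity check: only the LoRA adapter weights should be trainable.
model.print_trainable_parameters()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f'trainable: {trainable:,} / total: {total:,} ({100 * trainable / total:.2f}%)')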

BenjaminBossan commented 1 year ago

Hey, thanks for the report. I assume you're referring to the reserved memory, which is basically the same for both experiments. I'm not really sure how to explain this. Unfortunately, I could not run your script at the moment (GCP is out of GPUs in my region). I ran a similar script using a vision transformer model, and there the reserved memory was smaller with LoRA.

In general, I find it strange that the reserved memory is so high to begin with compared to allocated memory. Do you know if that is expected?
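One way to make the allocated/reserved comparison concrete is to log both, plus the peak allocation, at each step. A minimal sketch, assuming a CUDA device and the training loop above:

import torch

def log_cuda_memory(tag: str) -> None:
    # Memory currently occupied by live tensors.
    allocated = torch.cuda.memory_allocated() / 1024 ** 3
    # Memory held by PyTorch's caching allocator (roughly what nvidia-smi shows for the process).
    reserved = torch.cuda.memory_reserved() / 1024 ** 3
    # High-water mark of allocated memory; during training this is dominated by activations.
    peak = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f'[{tag}] allocated={allocated:.2f} GB, reserved={reserved:.2f} GB, peak={peak:.2f} GB')

torch.cuda.reset_peak_memory_stats()
# e.g. call log_cuda_memory('step') at the end of each training iteration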

eclickECNU commented 1 year ago

I'm not entirely sure if this is normal, but my program has exhibited this behavior from the beginning, and I'm very confused as well. After applying LoRA, the allocated memory does show a decrease (from 9.2 GB to 6.6 GB), but overall the GPU memory remains almost unchanged (nearly 70 GB).
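As background, nvidia-smi reports the memory held by PyTorch's caching allocator (plus the CUDA context), and the allocator does not hand freed blocks back to the driver on its own, so the nvidia-smi figure can stay flat even when allocated memory drops. A minimal sketch of how to observe this, assuming the training script above has already run:

import torch

print(f'reserved before: {torch.cuda.memory_reserved() / 1024 ** 3:.2f} GB')

# Release unoccupied cached blocks back to the driver; the nvidia-smi reading
# should drop accordingly (the CUDA context itself stays resident).
torch.cuda.empty_cache()

print(f'reserved after empty_cache: {torch.cuda.memory_reserved() / 1024 ** 3:.2f} GB')

Another likely factor: LoRA mainly shrinks gradient and optimizer-state memory, while the activation memory for a batch of 64 sequences of length 300 is the same with or without adapters, so the peak usage can look similar.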


BenjaminBossan commented 1 year ago

Do you have any special settings on the machine that could influence how much memory PyTorch reserves? Some custom settings for PYTORCH_CUDA_ALLOC_CONF? Otherwise, I'm at my wit's end. @younesbelkada Have you ever seen this behavior with huge amounts of reserved memory?
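A quick way to check for custom allocator settings and to inspect the allocator's own breakdown; a minimal sketch, assuming the same environment as the training run:

import os
import torch

# Any custom setting (e.g. max_split_size_mb) would show up here.
print('PYTORCH_CUDA_ALLOC_CONF =', os.environ.get('PYTORCH_CUDA_ALLOC_CONF', '<not set>'))

# Per-device summary of allocated vs. reserved memory from the caching allocator.
print(torch.cuda.memory_summary(abbreviated=True))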

github-actions[bot] commented 11 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.