alexriggio / BERT-LoRA-TensorRT

This repository contains a custom implementation of the BERT model, fine-tuned for specific tasks, along with an implementation of Low-Rank Adaptation (LoRA). The models are optimized for high performance using NVIDIA's TensorRT.
Apache License 2.0

CUDA out of memory #4

Open ZZZsleepyheadZZZ opened 8 months ago

ZZZsleepyheadZZZ commented 8 months ago

Thank you for the well-organized code and detailed notes, which make life much easier for a beginner like me.

However, during evaluation I ran into a 'CUDA out of memory' error (after training had finished, with no other processes running). I suspect it may be caused by a memory leak.

alexriggio commented 8 months ago

Yeah, it is strange that there is an issue with just evaluating, considering fine-tuning also includes an evaluation step for each epoch.
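One quick way to see whether memory from the training run is genuinely still held is to print PyTorch's allocator stats before evaluating. These are standard `torch.cuda` calls (nothing specific to this repo):

```python
import torch

if torch.cuda.is_available():
    # Memory held by live tensors vs. reserved by the caching allocator
    print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
    print(f"reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")
    # Releases cached blocks back to the driver (does not free live tensors)
    torch.cuda.empty_cache()
```

If `allocated` is already large before evaluation starts, some tensors (model copies, stored outputs) from the training run are still referenced.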

Some quick things to try:

Just to confirm: when you restart the kernel (clearing the memory) and go straight to evaluating (skipping over fine-tuning), do you still run into the 'CUDA out of memory' error?
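A common cause of OOM during evaluation alone is running the model without `torch.no_grad()` and accumulating loss tensors that keep the autograd graph alive. A minimal memory-safe evaluation loop might look like this (a sketch, not the repo's actual code; it assumes a Hugging Face-style model whose output has a `.loss` attribute):

```python
import torch

def evaluate(model, dataloader, device="cuda"):
    """Sketch of a memory-safe eval loop: no_grad() stops gradient buffers
    from being built, and .item() avoids holding tensors across batches."""
    model.eval()
    total_loss, n_batches = 0.0, 0
    with torch.no_grad():
        for batch in dataloader:
            batch = {k: v.to(device) for k, v in batch.items()}
            outputs = model(**batch)
            total_loss += outputs.loss.item()  # detach scalar from the graph
            n_batches += 1
    return total_loss / max(n_batches, 1)
```

If the existing evaluation code accumulates `outputs.loss` (the tensor) instead of `outputs.loss.item()`, memory will grow with every batch.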

Another workaround is to reduce the batch size. In STEP 1: Data Preprocessing, set `batch_size=1` for the `test_dataloader`.
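For reference, that change amounts to something like the following (the dummy `test_dataset` here is a hypothetical stand-in for the tokenized test split produced in STEP 1):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the tokenized test split from STEP 1
input_ids = torch.randint(0, 1000, (8, 128))
labels = torch.randint(0, 2, (8,))
test_dataset = TensorDataset(input_ids, labels)

# batch_size=1 minimizes peak GPU memory during evaluation
test_dataloader = DataLoader(test_dataset, batch_size=1, shuffle=False)
```

Evaluation will be slower at batch size 1, but it quickly confirms whether the OOM is a peak-memory problem or an accumulating leak.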