Open abhisheksgumadi opened 2 years ago
Do you have custom code which could have a memory leak?
We have a custom dataloader that loads images and text from a parquet file.
We have 1 million images stored on disk, and we have prepared the JSON file as described in the GitHub README. Our dataloader loads the JSON file into memory in the __init__ method, and then the __getitem__ method loads the image from the corresponding path in the JSON file and returns it along with the text.
Not sure why the RAM utilization is so high. Any ideas? Thanks.
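For a dataset shaped like the one described above (annotations loaded in __init__, images read in __getitem__), one commonly reported cause of steadily growing RAM is that a large Python list of dicts gets gradually copied into every DataLoader worker process, because reference-count updates touch memory pages that were shared after fork. Whether that is what happens in this issue is not confirmed. The sketch below shows one way to avoid holding a million small Python objects: pack the annotations into a single bytes buffer plus an offset array. The class name, the load_image parameter, and the {"image": ..., "caption": ...} record layout are assumptions made for illustration, not code from the BLIP repository.

```python
import json
from array import array


def _read_bytes(path):
    # Placeholder loader (assumption): returns the raw file bytes. A real
    # pipeline would decode the image here and apply its transforms.
    with open(path, "rb") as f:
        return f.read()


class PackedCaptionDataset:
    """Map-style dataset sketch that stores annotations in one bytes blob."""

    def __init__(self, annotations, load_image=None):
        # annotations: list of {"image": path, "caption": text} dicts
        # (assumed layout). Each record is serialized once and appended to a
        # single contiguous buffer; only the start offsets are kept per item.
        chunks = [json.dumps(a).encode("utf-8") for a in annotations]
        self._offsets = array("Q", [0])
        for chunk in chunks:
            self._offsets.append(self._offsets[-1] + len(chunk))
        self._blob = b"".join(chunks)
        self._load_image = load_image or _read_bytes

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start, end = self._offsets[index], self._offsets[index + 1]
        # Decode only the requested record; the rest of the blob is untouched.
        record = json.loads(self._blob[start:end].decode("utf-8"))
        return self._load_image(record["image"]), record["caption"]
```

With PyTorch, a class like this would subclass torch.utils.data.Dataset and be passed to a DataLoader as usual. Whether it removes the growth reported here is untested.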
Hi, it could be related to the dataloader.
We ended up using the pretrain_dataset.py file and formatted the data as a JSON file exactly as described in the README. Even then, RAM utilization goes to 100%. So we have now just formatted the dataset as required, with no changes to the code, and we don't even have our own custom code.
We are happy to follow any other debugging steps to make this work. Thanks.
I was wondering if there has been any update on this. We ran pretrain.py and saw the same issue: RAM usage increases while the JSON files are being read, and at some point RAM is exhausted. For pretraining, which Python version did you use, and how much RAM did the machine have?
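To check how much of the growth comes from reading the JSON files themselves, as opposed to the training loop, a small measurement like the following may help. It is only a sketch using the standard-library tracemalloc module; the function name and the idea of measuring one file at a time are assumptions, and tracemalloc reports Python-level allocations only, so the numbers will not match the process RSS exactly.

```python
import json
import tracemalloc


def measure_json_load(path):
    """Return (number_of_records, peak_bytes) for loading one JSON file."""
    tracemalloc.start()
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)  # expected to be a list of annotation dicts
    # get_traced_memory() returns (current, peak) in bytes since start().
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(records), peak
```

Running this on each annotation file gives a rough per-file cost, which can then be multiplied by the number of worker processes to estimate a worst case if every worker ends up with its own copy.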
@abhisheksgumadi @asgsaeid You may want to try out our new library which supports BLIP and see if the issue still remains: https://github.com/salesforce/LAVIS
Thanks, will take a look
Have you solved this problem? Could you kindly provide some suggestions?
Dear Team,
I am using the pre-training script to pre-train BLIP on a custom dataset (containing around 1M image/text pairs).
I see that the machine's RAM utilization increases continuously until it reaches 100%. The machine has 120 GB of RAM!
Any idea where the problem could be?