Open wyg-okk opened 1 year ago
Hi, the training is based on pytorch_lightning, which is supposed to manage resources correctly. You can see that the dataset class at https://github.com/liuyuan-pal/SyncDreamer/blob/eb41a0c73748cbb028ac9b007b11f8be70d09e48/ldm/data/sync_dreamer.py#L57 simply loads data and is not supposed to cause increasing memory usage. Maybe you can check whether the memory usage grows when running the dataset on its own.
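For reference, here is a minimal sketch (my own, not from the repo) of how one could run the dataset on its own and watch process memory: iterate it through a plain DataLoader and print the combined RSS of the main process and its worker children every few hundred batches. The dataset import below is a placeholder; substitute the class defined around ldm/data/sync_dreamer.py#L57 and its real constructor arguments.

```python
import os
import psutil
from torch.utils.data import DataLoader

# from ldm.data.sync_dreamer import SyncDreamerTrainData  # placeholder: use the
# actual dataset class defined around ldm/data/sync_dreamer.py#L57

def total_rss_gb(proc):
    """RSS of the main process plus all DataLoader worker children, in GB."""
    rss = proc.memory_info().rss
    for child in proc.children(recursive=True):
        try:
            rss += child.memory_info().rss
        except psutil.NoSuchProcess:
            pass
    return rss / 1024 ** 3

def measure_dataset_memory(dataset, num_workers=4, batch_size=8, max_batches=5000):
    """Iterate the dataset alone (no model) and report memory periodically."""
    loader = DataLoader(dataset, batch_size=batch_size,
                        num_workers=num_workers, shuffle=True)
    proc = psutil.Process(os.getpid())
    for i, _batch in enumerate(loader):
        if i % 200 == 0:
            print(f"batch {i}: total RSS {total_rss_gb(proc):.2f} GB")
        if i >= max_batches:
            break
```

If the RSS keeps climbing here (especially with num_workers > 0), the growth comes from data loading rather than from the pytorch_lightning training loop; if it stays flat, the leak is elsewhere.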
Thank you very much. We think this is a problem with our configured environment. I am checking with the Docker environment and will give feedback if there are any results.
I have the same problem. Have you found a solution yet? How quickly it runs out of memory is proportional to the number of workers (num_workers).
When I run the code in the Docker environment provided by the author, this problem is solved. You can try running with the author's Docker image.
Thank you for your information. I am also using a Docker environment, so this may indeed be a Docker environment problem.
When I train on a dataset of about 800k objects, the number circled in the attached screenshot keeps increasing as the number of training steps grows. My configs/syncdreamer-train.yaml is the same as the one provided by the author, except for the data path: https://github.com/liuyuan-pal/SyncDreamer/blob/main/configs/syncdreamer-train.yaml
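To correlate that circled number with actual host memory during training, here is a hedged sketch (my own addition, not part of the SyncDreamer code) of a pytorch_lightning Callback that prints the combined RSS of the trainer process and its DataLoader workers every N steps. `HostMemoryMonitor` and `every_n_steps` are names I made up, and the exact `on_train_batch_end` signature can differ slightly between pytorch_lightning versions.

```python
import os
import psutil
from pytorch_lightning.callbacks import Callback

class HostMemoryMonitor(Callback):
    """Prints host RSS (main process + DataLoader workers) every N training steps."""

    def __init__(self, every_n_steps=100):
        self.every_n_steps = every_n_steps

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        if trainer.global_step % self.every_n_steps != 0:
            return
        proc = psutil.Process(os.getpid())
        rss = proc.memory_info().rss
        for child in proc.children(recursive=True):
            try:
                rss += child.memory_info().rss
            except psutil.NoSuchProcess:
                pass
        print(f"step {trainer.global_step}: host RSS {rss / 1024 ** 3:.2f} GB")

# Hypothetical usage: add it to the Trainer's callback list,
# e.g. Trainer(callbacks=[HostMemoryMonitor(every_n_steps=100)], ...).
```

If this curve grows steadily while GPU memory stays constant, it points at host-side data loading rather than the model itself.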