lzzppp opened 8 months ago
It is indeed a serious problem, but unfortunately I cannot locate the cause for you. I have read your environment configuration and there should be no compatibility issues, or maybe:
Finally, if you have problems configuring the official mmrotate, you can also report the issue in their GitHub repository.
Yes, I am using a 4090 GPU. Yesterday I repeatedly modified the environment, including editing some of the internal code that enforces environment restrictions, and now it runs. Unfortunately, I don't know exactly why.
@lzzppp @yuhongtian17 Hi, I have a question: I only have two 4090 GPUs. Are they enough for training the STD model? I am looking forward to your reply.
@BiangBiangH It depends on the amount of data. If the dataset is not large, say fewer than one thousand images, two 4090s are enough.
@lzzppp ok thank you!
Hello Team Leader,
I've encountered an issue while using your project for distributed training: GPU utilization is very high, but video memory usage does not reach the expected amount, and the program gets stuck. I've followed the setup instructions in your documentation and am running the training script with the following configuration:
Environment:
OS: Ubuntu 22.04.2 LTS
Python version: 3.7.16
PyTorch version: 1.7.0
CUDA version: 11.0
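For completeness, here is the quick check I use to confirm that this PyTorch build sees both cards. It is just a sketch of my own using standard PyTorch calls, nothing project-specific; I only note that the cu110 wheels of PyTorch 1.7 predate the 4090's compute capability 8.9, which may be relevant to the compatibility question further below.

```python
# Quick sanity check (my own, not from the repo): confirm the PyTorch build,
# the CUDA version it was compiled against, and the visible GPUs.
import torch

print("PyTorch:", torch.__version__)              # e.g. 1.7.0
print("CUDA (torch build):", torch.version.cuda)  # e.g. 11.0
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    name = torch.cuda.get_device_name(i)
    cap = torch.cuda.get_device_capability(i)     # an RTX 4090 reports (8, 9)
    print(f"cuda:{i}: {name}, compute capability {cap}")
```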
Pip list:
Expected Behavior: I expected the training process to utilize GPU memory more efficiently and the training to proceed without getting stuck.
Actual Behavior: The training process gets stuck indefinitely with high GPU utilization and low video memory usage.
Additional Context:
I have verified that training on a single GPU works normally. I also tried adjusting the DataLoader's batch size and number of workers, but the problem persists. Can you provide any insights or suggestions on how to resolve this issue? I'm wondering if there are any configuration steps I've overlooked, or if there are known compatibility issues with this combination of PyTorch and CUDA versions.
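For reference, here is a minimal two-GPU NCCL check I plan to try next. It is a sketch of my own, not taken from this repo, and it assumes the nccl backend, two local GPUs, and an arbitrary free port (29501). If even this single all_reduce stalls, the hang would be in GPU-to-GPU communication rather than in the training code; NCCL_DEBUG and NCCL_P2P_DISABLE are standard NCCL environment variables, and disabling P2P is a workaround sometimes reported for multi-4090 machines.

```python
# Minimal sketch (not part of the repo): check whether the hang is in NCCL
# communication itself rather than in the training pipeline.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29501"        # arbitrary free port
    # os.environ["NCCL_DEBUG"] = "INFO"        # print NCCL setup/transport details
    # os.environ["NCCL_P2P_DISABLE"] = "1"     # workaround sometimes suggested for multi-4090
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    t = torch.ones(1, device=f"cuda:{rank}")
    dist.all_reduce(t)                         # should return almost instantly
    print(f"rank {rank}: all_reduce ok, value = {t.item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)      # one process per GPU
```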
Thanks for your time and help.
Best Regards, Zepeng Li