lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0
2.23k stars 249 forks

Does it work on Windows? #404

Open sdurmustalipoglu opened 1 month ago

sdurmustalipoglu commented 1 month ago

Hello,

I installed rtdetrv2_pytorch from the requirements file, but the training command below did not work: `CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=9909 --nproc_per_node=4 tools/train.py -c path/to/config --use-amp --seed=0 &> log.txt 2>&1 &`. It gave an error, so I changed it to `torchrun --master_port=9909 --nproc_per_node=4 tools/train.py -c path/to/config --use-amp --seed=0`, which produced the following output:

`(rdetr1) C:\Users\sdurmus>torchrun --master_port=9909 --nproc_per_node=4 tools/train.py -c path/to/config --use-amp --seed=0 > log.txt
W0803 15:25:11.983889 2408 torch\distributed\elastic\multiprocessing\redirects.py:27] NOTE: Redirects are currently not supported in Windows or MacOs.
W0803 15:25:12.013218 2408 torch\distributed\run.py:757]
W0803 15:25:12.013218 2408 torch\distributed\run.py:757]
W0803 15:25:12.013218 2408 torch\distributed\run.py:757] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0803 15:25:12.013218 2408 torch\distributed\run.py:757]
[W socket.cpp:697] [c10d] The client socket has failed to connect to [SDT]:9909 (system error: 10049 - The requested address is not valid in its context.).
C:\Users\sdurmus\anaconda3\envs\rdetr1\python.exe: can't open file 'C:\Users\sdurmus\tools\train.py': [Errno 2] No such file or directory
C:\Users\sdurmus\anaconda3\enC:\Users\sdurmus\anaconda3\envs\rdetr1\python.exe: can't open file 'C:vs\rdetr1\python.exe: can't open file 'C:\Users\sdurmus\tools\train.py': [Errno 2] No such file or directory
\Users\sdurmus\anaconda3\envs\rdetr1\python.exe: can't open file 'C:\Users\sdurmus\tooC:\Users\sdurmus\tools\train.py': [Errno 2] No such file or directory
ls\train.py': [Errno 2] No such file or directory
E0803 15:25:17.050683 2408 torch\distributed\elastic\multiprocessing\api.py:826] failed (exitcode: 2) local_rank: 0 (pid: 17064) of binary: C:\Users\sdurmus\anaconda3\envs\rdetr1\python.exe
Traceback (most recent call last):
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\Scripts\torchrun.exe\__main__.py", line 7, in <module>
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\site-packages\torch\distributed\elastic\multiprocessing\errors\__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\site-packages\torch\distributed\run.py", line 879, in main
    run(args)
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\site-packages\torch\distributed\run.py", line 870, in run
    elastic_launch(
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\site-packages\torch\distributed\launcher\api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "C:\Users\sdurmus\anaconda3\envs\rdetr1\lib\site-packages\torch\distributed\launcher\api.py", line 263, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

tools/train.py FAILED

Failures:
[1]:
  time : 2024-08-03_15:25:17
  host : SDT
  rank : 1 (local_rank: 1)
  exitcode : 2 (pid: 12492)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
  time : 2024-08-03_15:25:17
  host : SDT
  rank : 2 (local_rank: 2)
  exitcode : 2 (pid: 9616)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
  time : 2024-08-03_15:25:17
  host : SDT
  rank : 3 (local_rank: 3)
  exitcode : 2 (pid: 22120)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
  time : 2024-08-03_15:25:17
  host : SDT
  rank : 0 (local_rank: 0)
  exitcode : 2 (pid: 17064)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html`
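
The first command in the post relies on POSIX-shell syntax (the `CUDA_VISIBLE_DEVICES=...` prefix, `&>`, and the trailing `&`) that the Windows command prompt does not understand. One shell-independent way to express the same intent might look like the sketch below. It assumes that `python -m torch.distributed.run` is used in place of `torchrun` (the two are the same entry point), and the helper names `build_launch` and `launch`, as well as the `repo_dir` argument, are hypothetical and not part of RT-DETR. The thread does not confirm that multi-GPU training itself works on Windows, so this only addresses the launch syntax.

```python
import os
import subprocess
import sys


def build_launch(config, gpus="0,1,2,3", nproc=4, port=9909):
    """Build the command list and environment for a distributed launch.

    Hypothetical helper: mirrors the flags from the original post.
    """
    # Copy the current environment and add the GPU selection, instead of
    # relying on the POSIX-only `VAR=value command` prefix.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)
    cmd = [
        sys.executable, "-m", "torch.distributed.run",
        f"--master_port={port}",
        f"--nproc_per_node={nproc}",
        "tools/train.py", "-c", config, "--use-amp", "--seed=0",
    ]
    return cmd, env


def launch(config, repo_dir, log_name="log.txt"):
    """Run the launch from repo_dir, sending stdout and stderr to a log file."""
    cmd, env = build_launch(config)
    log_path = os.path.join(repo_dir, log_name)
    with open(log_path, "w") as log:
        # cwd=repo_dir makes the relative path tools/train.py resolve
        # inside the repository rather than the caller's directory.
        result = subprocess.run(
            cmd, cwd=repo_dir, env=env, stdout=log, stderr=subprocess.STDOUT
        )
    return result.returncode
```

Calling `launch("path/to/config", repo_dir)` with `repo_dir` set to the folder that contains `tools/train.py` would then behave the same in cmd, PowerShell, or bash, because no shell-specific redirection is involved.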

lyuwenyu commented 1 month ago

C:\Users\sdurmus\anaconda3\envs\rdetr1\python.exe: can't open file 'C:\Users\sdurmus\tools\train.py': [Errno 2] No such file or directory

Please check your file path according to the error message.
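
The error shows `tools\train.py` being looked up under `C:\Users\sdurmus`, the directory the command was run from, because a relative script path is resolved against the current working directory. A minimal sketch of that check follows; the function names `resolve_script` and `check_script` are hypothetical and only illustrate the lookup, they are not part of RT-DETR or torchrun.

```python
from pathlib import Path


def resolve_script(script: str, cwd: Path) -> Path:
    """Return the path a script argument resolves to when run from cwd."""
    path = Path(script)
    # Absolute paths are used as given; relative ones are joined to cwd.
    return path if path.is_absolute() else cwd / path


def check_script(script: str, cwd: Path) -> str:
    """Report whether the script exists where the launcher would look for it."""
    target = resolve_script(script, cwd)
    if target.is_file():
        return f"OK: {target}"
    return f"Missing: {target} (change into the directory that contains {script} first)"


if __name__ == "__main__":
    # Run this from the same prompt used for torchrun to see what it would find.
    print(check_script("tools/train.py", Path.cwd()))
```

If the check reports the file as missing, changing into the directory that actually contains `tools/train.py` before launching, or passing an absolute path to the script, should remove the `[Errno 2]` failure.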