A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.
When training SkeletonGait++, an AssertionError occurs #209
Dear Professor,
I am reproducing SkeletonGait++ on the Gait3D dataset and have run into the following problem.
I have already created symbolic links for the heatmap and silhouette data, but when I train SkeletonGait++, an AssertionError occurs:
[2024-05-05 21:09:07] [INFO]: -------- Train Pid List --------
[2024-05-05 21:09:07] [INFO]: [1234, 1512, ..., 1128]
[2024-05-05 21:09:09] [INFO]: {'lr': 0.1, 'momentum': 0.9, 'solver': 'SGD', 'weight_decay': 0.0005}
[2024-05-05 21:09:09] [INFO]: {'gamma': 0.1, 'milestones': [20000, 30000, 40000], 'scheduler': 'MultiStepLR'}
[2024-05-05 21:09:09] [INFO]: Parameters Count: 25.56861M
[2024-05-05 21:09:09] [INFO]: Model Initialization Finished!
Traceback (most recent call last):
File "opengait/main.py", line 73, in <module>
run_model(cfgs, training)
File "opengait/main.py", line 56, in run_model
Model.run_train(model)
File "/root/autodl-tmp/Project/OpenGait/opengait/modeling/base_model.py", line 408, in run_train
retval = model(ipts)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0])
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/autodl-tmp/Project/OpenGait/opengait/modeling/models/skeletongait++.py", line 90, in forward
assert pose.size(-1) in [44, 48, 88, 96]
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1227) of binary: /root/miniconda3/envs/AllinOne/bin/python
Traceback (most recent call last):
File "/root/miniconda3/envs/AllinOne/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/envs/AllinOne/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/launch.py", line 195, in <module>
main()
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/launch.py", line 191, in main
launch(args)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/launch.py", line 176, in launch
run(args)
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/root/miniconda3/envs/AllinOne/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
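The failing assertion on line 90 of `skeletongait++.py` only inspects the last dimension of `pose`, so the input reaching the model apparently has a width other than 44, 48, 88, or 96. A minimal sketch of the same check applied to a plain shape tuple, which may help narrow down which samples trigger it. The helper name `check_pose_width` and the example shapes are hypothetical and are not part of OpenGait or Gait3D:

```python
# Widths accepted by the assertion shown in the traceback above.
ALLOWED_WIDTHS = (44, 48, 88, 96)


def check_pose_width(shape):
    """Hypothetical helper: return (passes, width) for a tensor/array shape tuple."""
    width = shape[-1]  # same dimension that pose.size(-1) reads
    return width in ALLOWED_WIDTHS, width


# Example shapes are made up for illustration only.
examples = {
    "sample_a": (30, 2, 64, 44),
    "sample_b": (30, 2, 64, 64),
}
for name, shape in examples.items():
    passes, width = check_pose_width(shape)
    print(f"{name}: last dim = {width}, passes assertion = {passes}")
```

Passing `tuple(pose.shape)` to such a helper just before the assertion would show the width that is actually being fed to the model.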
Could you help me figure out what is wrong? Thank you very much; I look forward to your reply.