facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

RuntimeError: invalid unordered_map<K, T> key #692

Open visin109 opened 5 months ago

visin109 commented 5 months ago

Got the model architecture as an intermediate part of the output (some parts truncated):

MViT(
  (patch_embed): PatchEmbed(
    (proj): Conv3d(3, 96, kernel_size=(3, 7, 7), stride=(2, 4, 4), padding=(1, 3, 3))
  )
  (blocks): ModuleList(
    (0): MultiScaleBlock(
      (norm1): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
      (attn): MultiScaleAttention(
        (qkv): Linear(in_features=96, out_features=288, bias=True)
        (proj): Linear(in_features=96, out_features=96, bias=True)
        (pool_q): Conv3d(96, 96, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=96, bias=False)
        (norm_q): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
        (pool_k): Conv3d(96, 96, kernel_size=(3, 3, 3), stride=(1, 8, 8), padding=(1, 1, 1), groups=96, bias=False)
        (norm_k): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
        (pool_v): Conv3d(96, 96, kernel_size=(3, 3, 3), stride=(1, 8, 8), padding=(1, 1, 1), groups=96, bias=False)
        (norm_v): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
............................ .......
    (dropout): Dropout(p=0.5, inplace=False)
    (projection): Linear(in_features=768, out_features=400, bias=True)
    (act): Softmax(dim=1)
  )
)

Traceback (most recent call last):
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\pdb.py", line 1726, in main
    pdb._runscript(mainpyfile)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\pdb.py", line 1586, in _runscript
    self.run(statement)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\bdb.py", line 597, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "c:\users\p.vijay srinivasan\downloads\nyx wolves\shoplifting_2\slowfast\tools\run_net.py", line 55, in <module>
    main()
  File "c:\users\p.vijay srinivasan\downloads\nyx wolves\shoplifting_2\slowfast\tools\run_net.py", line 28, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=train)
  File "C:\Users\P.Vijay Srinivasan\Downloads\NYX Wolves\shoplifting_2\SlowFast\slowfast\utils\misc.py", line 430, in launch_job
    func(cfg=cfg)
  File "C:\Users\P.Vijay Srinivasan\Downloads\NYX Wolves\shoplifting_2\SlowFast\tools\train_net.py", line 543, in train
    flops, params = misc.log_model_info(model, cfg, use_train_input=True)
  File "C:\Users\P.Vijay Srinivasan\Downloads\NYX Wolves\shoplifting_2\SlowFast\slowfast\utils\misc.py", line 189, in log_model_info
    flops = get_model_stats(model, cfg, "flop", use_train_input)
  File "C:\Users\P.Vijay Srinivasan\Downloads\NYX Wolves\shoplifting_2\SlowFast\slowfast\utils\misc.py", line 168, in get_model_stats
    count_dict, *_ = model_stats_fun(model, inputs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\fvcore\nn\flop_count.py", line 147, in flop_count
    for op, flop in flop_counter.by_operator().items():
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\fvcore\nn\jit_analysis.py", line 265, in by_operator
    stats = self._analyze()
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\fvcore\nn\jit_analysis.py", line 551, in _analyze
    graph = _get_scoped_trace_graph(self._model, self._inputs, self._aliases)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\fvcore\nn\jit_analysis.py", line 176, in _get_scoped_trace_graph
    graph, _ = _get_trace_graph(module, inputs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\jit\_trace.py", line 1285, in _get_trace_graph
    outs = ONNXTracedModule(
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\jit\_trace.py", line 133, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\jit\_trace.py", line 124, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1508, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\Downloads\NYX Wolves\shoplifting_2\SlowFast\slowfast\models\video_model_builder.py", line 1215, in forward
    x, bcthw = self.patch_embed(x)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1508, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\fairscale\nn\checkpoint\checkpoint_activations.py", line 191, in _checkpointed_forward
    output = CheckpointFunction.apply(
  File "C:\Users\P.Vijay Srinivasan\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
RuntimeError: invalid unordered_map<K, T> key
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program

c:\users\p.vijay srinivasan\appdata\local\programs\python\python310\lib\site-packages\torch\autograd\function.py(539)apply() -> return super().apply(*args, **kwargs) # type: ignore[misc]

RuntimeError: invalid unordered_map<K, T> key
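
Reading the traceback, the exception is raised while fvcore traces the model to count FLOPs (misc.log_model_info -> get_model_stats -> flop_count), passing through the fairscale checkpoint wrapper around patch_embed. If the statistics are only needed for logging, one possible way to narrow the problem down is to treat that call as optional and see whether the rest of the run proceeds. The following is a minimal sketch of that idea, not a fix from the SlowFast maintainers; `run_optional`, `fake_log_model_info` and `working_stats` are hypothetical names introduced only for illustration.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def run_optional(step: Callable[..., Any], *args: Any, **kwargs: Any) -> Optional[Any]:
    """Run a non-essential step; log a warning and return None on RuntimeError."""
    try:
        return step(*args, **kwargs)
    except RuntimeError as err:
        name = getattr(step, "__name__", repr(step))
        logger.warning("Skipping optional step %s: %s", name, err)
        return None


def fake_log_model_info() -> tuple:
    # Hypothetical stand-in for the failing statistics call in the traceback.
    raise RuntimeError("invalid unordered_map<K, T> key")


def working_stats() -> tuple:
    # Hypothetical stand-in for a statistics call that succeeds: (flops, params).
    return (1.0, 2.0)


failed_stats = run_optional(fake_log_model_info)  # error is logged, not raised
ok_stats = run_optional(working_stats)            # result is passed through
```

In a SlowFast checkout, the call to wrap would be the `misc.log_model_info(model, cfg, use_train_input=True)` line shown in the traceback. Any code that later unpacks `flops, params` would need to handle the `None` case.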

Data for reference: config file

Weights: https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/mvitv2/pysf_video_models/MViTv2_S_16x4_k400_f302660347.pyth. The file run_net.py calls the test function from test_net.py, so have a look at those files too. The error occurs after build_model(cfg) and cu.load_test_checkpoint(cfg, model) have completed successfully.

Expected behavior: inference should run successfully. I'm trying to run inference with the MViTv2 model for the above configuration using run_net.py. The code depends on various modules from the repository. In the provided run_net.py script, the code works up to the model.eval() call, and after that I get the error shown above. I even tried calling torch.save() before model.eval(), but the error was reproduced.

alpargun commented 4 months ago

I'm afraid this may be a Windows-related problem, as I have seen in other similar issues, e.g. [1] and [2].

What is your PyTorch version? A different version might help. Also, did you change anything in the config file other than the checkpoint path? Could you share your own config?
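
To make the version question easier to answer, something like the following could be run in the same environment and its output pasted into the issue. It is only a sketch using the standard library; the package list is an assumption taken from the libraries that appear in the traceback, and `installed_version` and `environment_report` are hypothetical helper names.

```python
import platform
import sys
from importlib import metadata

# Packages that appear in the traceback; adjust the list as needed.
PACKAGES = ("torch", "fvcore", "fairscale")


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES) -> dict:
    """Collect the OS, Python version, and package versions into a dict."""
    report = {"os": platform.platform(), "python": sys.version.split()[0]}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Reading versions through `importlib.metadata` avoids importing the packages themselves, so the report still works if one of them fails to import.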