n2729648074 opened this issue 1 month ago
@n2729648074 Have you found the problem? I don't have an environment at hand to reproduce the bug :-(
Has it been resolved?
@cgz6498 Hi, can you reproduce the problem when running the example code in https://github.com/sustcsonglin/flash-linear-attention/tree/main/training
Describe the bug
Thank you very much for your excellent work! When I train with multiple GPUs, Triton's autotuner.py raises `TypeError: 'NoneType' object is not a mapping` at `full_nargs = {**self.nargs, **kwargs, **self.best_config.kwargs}`,
but when I train on a single GPU the error does not occur. So I'd like to ask how to train in parallel on multiple GPUs without this error.
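For illustration only (this is a sketch, not the actual training script): one multi-GPU setup that can hit shared Triton autotuner state is a thread-based wrapper such as `torch.nn.DataParallel`, where all GPU replicas run as threads of a single process. The `GatedSlotAttention` name and its constructor arguments below are assumptions based on `fla/layers/gsa.py` and may differ between fla versions.

```python
# Hypothetical minimal sketch, NOT the original training script.
# Assumptions: GatedSlotAttention is the layer defined in fla/layers/gsa.py,
# and its constructor accepts hidden_size / num_heads; both may differ by version.
import torch
import torch.nn as nn
from fla.layers.gsa import GatedSlotAttention


class TinyModel(nn.Module):
    def __init__(self, hidden_size: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = GatedSlotAttention(hidden_size=hidden_size, num_heads=num_heads)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        out = self.attn(x)
        o = out[0] if isinstance(out, tuple) else out  # fla layers typically return a tuple
        return self.proj(o)


if __name__ == "__main__":
    model = TinyModel().cuda()
    # Thread-based multi-GPU: every replica shares the same Triton autotuner objects,
    # which is one plausible way to end up with self.nargs == None inside _bench.
    model = nn.DataParallel(model)
    x = torch.randn(8, 128, 256, device="cuda")
    y = model(x)
    print(y.shape)
```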
Steps to reproduce the bug
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/layers/gsa.py", line 140, in forward hidden_states = self.norm(hidden_states) File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, kwargs) File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, *kwargs) File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 659, in forward return rms_norm_fn( File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 526, in rms_norm_fn return LayerNormFn.apply( File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/autograd/function.py", line 539, in apply return super().apply(args, kwargs) # type: ignore[misc] File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/utils.py", line 12, in wrapper return fn(ctx, File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 415, in forward y, mean, rstd, residual_out = _layer_norm_fwd( File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 172, in _layer_norm_fwd _layer_norm_fwd_1pass_kernel[(M,)]( File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in run timings = {config: self._bench(*args, config=config, *kwargs) for config in pruned_configs} File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in
timings = {config: self._bench( args, config=config, kwargs) for config in pruned_configs}
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 104, in _bench
full_nargs = {self.nargs, **current}
TypeError: 'NoneType' object is not a mapping
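For reference (not part of the original traceback): in Triton's autotuner, `run()` fills `self.nargs` and clears it to `None` on exit, while `_bench()` unpacks `self.nargs`. If the same autotuned kernel is entered concurrently from two threads of one process, which is how thread-based multi-GPU wrappers such as `nn.DataParallel` execute replicas, one thread can observe `None` and raise exactly this `TypeError`. The code below is an illustrative toy of that pattern, not Triton's real implementation.

```python
# Toy demonstration (NOT Triton's real code) of how shared autotuner state plus
# thread-based multi-GPU execution can yield "'NoneType' object is not a mapping".
import threading


class ToyAutotuner:
    def __init__(self, arg_names):
        self.arg_names = arg_names
        self.nargs = None                      # one shared dict per autotuned kernel

    def _bench(self, **kwargs):
        # Mirrors the failing line: unpacking self.nargs raises TypeError if it is None.
        full_nargs = {**self.nargs, **kwargs}
        return len(full_nargs)

    def run(self, *args, **kwargs):
        self.nargs = dict(zip(self.arg_names, args))
        out = self._bench(**kwargs)
        self.nargs = None                      # cleared on exit; races with other threads
        return out


tuner = ToyAutotuner(["x", "w"])


def worker():
    # Two threads standing in for two GPU replicas that share the same kernel object.
    for _ in range(100000):
        try:
            tuner.run(1, 2, eps=1e-6)
        except TypeError as e:
            print("reproduced:", e)            # 'NoneType' object is not a mapping
            return


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```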
Expected behavior
I'd like to know how to train in parallel on multiple GPUs without this error.
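Not an answer from the maintainers, just a hedged sketch of one common direction: if the failure comes from autotuner state shared across threads, launching one process per GPU with `torchrun` and `DistributedDataParallel` keeps each rank's Triton autotuners private to its own process. The model and data below are stand-ins, not the project's actual training code.

```python
# Hypothetical DDP sketch (not from this issue): one process per GPU, launched with e.g.
#   torchrun --nproc_per_node=NUM_GPUS train_ddp.py
# Each rank is a separate process, so Triton autotuner state is not shared across GPUs.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def build_model() -> nn.Module:
    # Stand-in for the real fla-based model from the report.
    return nn.Sequential(nn.Linear(256, 256), nn.GELU(), nn.Linear(256, 10))


def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(build_model().cuda(local_rank), device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                        # stand-in training loop with random data
        x = torch.randn(8, 256, device=f"cuda:{local_rank}")
        loss = model(x).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```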
Environment info