usamaa-saleem opened this issue 1 year ago
The GPU I am using is an A100.
I hit the same problem when using the conda cudatoolkit; when I installed torch with pip (`pip install torch+cu`), it worked. Maybe you can try this.
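A quick way to check whether the installed torch build actually has CUDA support (a minimal sketch, nothing kohya_ss-specific; a CUDA wheel can be installed with something like `pip install torch --index-url https://download.pytorch.org/whl/cu118`, but check pytorch.org for the command matching your setup):

```python
import torch

# A CUDA-enabled pip build reports a local version tag such as "2.0.1+cu118";
# a CPU-only build reports e.g. "2.0.1+cpu".
print(torch.__version__)

# Must print True, otherwise xformers' fused attention kernels (all CUDA-only)
# cannot dispatch and you get the NotImplementedError seen below.
print(torch.cuda.is_available())
```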
same
`pip install torch+cu`
What's the CLI exactly?
@kohya-ss can you guide me on how to solve this? I need it done ASAP.
```
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
    query     : shape=(200, 9126, 1, 64) (torch.float32)
    key       : shape=(200, 9126, 1, 64) (torch.float32)
    value     : shape=(200, 9126, 1, 64) (torch.float32)
    attn_bias : <class 'NoneType'>
    p         : 0.0
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64
steps:   0%| | 0/98800 [56:21<?, ?it/s]
```
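Reading the rejection reasons together: every fused kernel requires `device=cuda`, and `flshattF`/`tritonflashattF` additionally require fp16 or bf16, while the inputs in the trace are float32 tensors on the CPU. A minimal sketch of inputs that satisfy those constraints, assuming a working CUDA build of torch and xformers (batch size shrunk from the trace's 200 to keep the example small):

```python
import torch
import xformers.ops as xops

# All fused kernels in the trace reject device=cpu, and the flash kernels
# also reject float32; CUDA tensors in half precision satisfy both.
# Layout follows the trace: (batch, seq_len, heads, head_dim).
q = torch.randn(2, 9126, 1, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 9126, 1, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 9126, 1, 64, device="cuda", dtype=torch.float16)

out = xops.memory_efficient_attention(q, k, v)  # now dispatches to a fused kernel
print(out.shape)  # torch.Size([2, 9126, 1, 64])
```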
@kohya-ss Can you tell me how to solve this? It needs to be done as soon as possible.
same problem
Exactly the same problem, but I'm running kohya_ss CPU-only (no GPU). What setting did I miss?
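For the CPU-only case there is no setting that makes xformers work: all of its fused attention kernels are CUDA-only, as the trace above shows. The usual workaround is to not enable xformers at all (in kohya_ss, leaving out the `--xformers` option, assuming that is how it was turned on) so attention falls back to plain PyTorch. A minimal sketch of the CPU-capable fallback path, using `torch.nn.functional.scaled_dot_product_attention` (torch >= 2.0; shapes are illustrative):

```python
import torch
import torch.nn.functional as F

# Plain PyTorch attention runs fine on CPU, float32 included, unlike
# xformers' fused kernels. Layout here is (batch, heads, seq_len, head_dim).
q = torch.randn(1, 1, 128, 64)
k = torch.randn(1, 1, 128, 64)
v = torch.randn(1, 1, 128, 64)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 1, 128, 64])
```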