chintan-donda opened this issue 1 year ago
To train on a V100, we need to set enable_mem_efficient=True; otherwise the error below is raised. FlashAttention kernels require a newer GPU architecture (Ampere or later), so with both the math and mem-efficient backends disabled, scaled_dot_product_attention is left with no kernel it can run on a V100. The following patch fixes it:
```diff
--- a/falcontune/model/falcon/model.py
+++ b/falcontune/model/falcon/model.py
@@ -523,7 +523,7 @@ class Attention40B(nn.Module):
         key_layer_ = key_layer.reshape(batch_size, self.num_heads, -1, self.head_dim)
         value_layer_ = value_layer.reshape(batch_size, self.num_heads, -1, self.head_dim)
-        with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
+        with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=True):
             attn_output = F.scaled_dot_product_attention(
                 query_layer_, key_layer_, value_layer_, None, 0.0, is_causal=True
             )
```
I am getting the error below when trying to finetune the model.

Experimental setup details:

OS: Ubuntu 18.04.5 LTS
GPU: Tesla V100-SXM2-32GB
Libs:
Finetuning command:
Any help please?