Open suri-kunal opened 1 year ago
What if you try a custom injection policy?
For example, for GPT-NeoX it would look like:

```python
import deepspeed
from transformers.models.gpt_neox.modeling_gpt_neox import GPTNeoXLayer

pipe.model = deepspeed.init_inference(
    pipe.model,
    dtype=dtype,
    mp_size=args.world_size,
    replace_with_kernel_inject=False,
    enable_cuda_graph=args.graphs,
    injection_policy={GPTNeoXLayer: ('attention.dense', 'mlp.dense_4h_to_h')},
)
```
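The tuple values in `injection_policy` are dotted attribute paths, resolved against each layer instance, that name the output projections whose results DeepSpeed all-reduces across tensor-parallel ranks. A minimal sketch of that path resolution, with toy classes standing in for the real GPT-NeoX modules (everything here is illustrative, not DeepSpeed's actual internals):

```python
from functools import reduce

class Dense:
    """Stand-in for an output projection (e.g. nn.Linear)."""

class Attention:
    def __init__(self):
        self.dense = Dense()

class MLP:
    def __init__(self):
        self.dense_4h_to_h = Dense()

class GPTNeoXLayerToy:
    """Toy layer mirroring GPT-NeoX's submodule names."""
    def __init__(self):
        self.attention = Attention()
        self.mlp = MLP()

def resolve(module, dotted_path):
    # Walk a dotted attribute path such as 'attention.dense'
    # down to the submodule it names.
    return reduce(getattr, dotted_path.split('.'), module)

layer = GPTNeoXLayerToy()
policy = ('attention.dense', 'mlp.dense_4h_to_h')
targets = [resolve(layer, p) for p in policy]
```

Each entry in `targets` is the module DeepSpeed would wrap so its output is all-reduced after the sharded computation.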
Describe the bug
I am trying to fine-tune a Transformer EncoderDecoder model on a T4, using Longformer as the encoder and GPT2 as the decoder. I am able to train the model successfully, but at inference time I get the following error:

AssertionError: AutoTP not supported for model. Please use kernel injection since container policy for model exists.

This is happening because I have set replace_with_kernel_inject=False in my init_inference call. If I set replace_with_kernel_inject=True, I get the #1301 error instead, which might be because I am running my code on a T4.

To Reproduce
Steps to reproduce the behavior:
Simple inference script to reproduce
Training Loop -
validate_summarization -
Stacktrace -
Expected behavior
How do I get rid of this error? My target environment is a K80, so resolving this is extremely important to me.
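Following the suggestion in the comment above, an untested sketch of a custom policy for this model would target the GPT2 decoder blocks. Assumptions here: GPT2Block is GPT2's transformer layer class in transformers, attn.c_proj and mlp.c_proj are its attention and MLP output projections, and model, dtype, and world_size are already defined in the surrounding script; whether the Longformer encoder also needs a policy entry is not verified.

```python
import deepspeed
from transformers.models.gpt2.modeling_gpt2 import GPT2Block

# Untested sketch: shard only the GPT2 decoder blocks of the
# EncoderDecoder model. `attn.c_proj` and `mlp.c_proj` are GPT2Block's
# attention and MLP output projections. `model`, `dtype`, and
# `world_size` are assumed to exist in the surrounding script.
model = deepspeed.init_inference(
    model,
    dtype=dtype,
    mp_size=world_size,
    replace_with_kernel_inject=False,  # custom policy, no fused kernels
    injection_policy={GPT2Block: ('attn.c_proj', 'mlp.c_proj')},
)
```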
ds_report output -
System info -
Docker context Dockerfile -
Requirements.txt