Add support for DeepSpeed

DeepSpeed does not run on our GPU machine because the fused_adam op fails to compile, both in JIT mode and in pre-compiled mode. I tried several versions of DeepSpeed and several versions of PyTorch, with the same result each time. The only remaining variable I can think of at this point is the CUDA/NVVM version installed on our machine.
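To narrow down the CUDA/NVVM suspicion, one option is to compare the CUDA version PyTorch was built against with the version of the `nvcc` compiler found on the machine, since the fused_adam extension is compiled with the local toolkit. The snippet below is only a rough diagnostic sketch, not something that was run as part of this issue: the helper names (`parse_nvcc_release`, `parse_torch_cuda`, `describe_mismatch`) are made up for illustration, and it assumes `nvcc --version` prints a line containing `release X.Y`, which is the usual format.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_torch_cuda(version_string):
    """Turn a string such as "11.7" (as in torch.version.cuda) into (11, 7)."""
    if not version_string:
        return None  # CPU-only PyTorch builds report no CUDA version
    major, minor = version_string.split(".")[:2]
    return (int(major), int(minor))


def describe_mismatch(torch_cuda, nvcc_cuda):
    """Classify how the two (major, minor) versions relate (hypothetical helper)."""
    if torch_cuda is None or nvcc_cuda is None:
        return "unknown"
    if torch_cuda == nvcc_cuda:
        return "match"
    if torch_cuda[0] == nvcc_cuda[0]:
        return "minor mismatch"
    return "major mismatch"


if __name__ == "__main__":
    nvcc_cuda = None
    if shutil.which("nvcc"):
        # Ask the locally installed compiler for its version banner.
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        )
        nvcc_cuda = parse_nvcc_release(result.stdout)

    torch_cuda = None
    try:
        import torch

        torch_cuda = parse_torch_cuda(torch.version.cuda)
    except ImportError:
        pass  # PyTorch not installed in this environment

    print("torch CUDA:", torch_cuda)
    print("nvcc CUDA:", nvcc_cuda)
    print("result:", describe_mismatch(torch_cuda, nvcc_cuda))
```

A "major mismatch" or "unknown" result would support the toolkit theory; a "match" would point elsewhere. DeepSpeed's own `ds_report` command prints a similar environment summary, including which ops it considers compatible, and may be worth attaching to this issue.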

Since we can currently train on an A100 GPU without DeepSpeed, this issue is on hold.

jina-ai / jerboa

Add support for DeepSpeed #119