microsoft / onnxruntime-training-examples

Examples for using ONNX Runtime for model training.
MIT License
311 stars 62 forks

flip contiguous_gradients flag #65

Closed: jingyanwangms closed this 3 years ago

jingyanwangms commented 3 years ago

We found that setting the contiguous_gradients flag to false improves performance for all models. The DeepSpeed folks have confirmed that this flag is important for large models but should not make a difference for models that fit in base PyTorch.
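
For illustration, here is a minimal sketch of what flipping the flag could look like, assuming it is set under the `zero_optimization` section of a DeepSpeed config. The helper name `flip_contiguous_gradients` and the starting config values are hypothetical placeholders, not the settings used in this repo.

```python
import copy
import json


def flip_contiguous_gradients(ds_config, value=False):
    """Return a copy of a DeepSpeed-style config dict with contiguous_gradients set.

    Hypothetical helper for illustration; the input dict is left unmodified.
    """
    updated = copy.deepcopy(ds_config)
    # Create the zero_optimization section if it is missing, then set the flag.
    updated.setdefault("zero_optimization", {})["contiguous_gradients"] = value
    return updated


# Illustrative starting config; these values are placeholders, not the repo's settings.
base_config = {
    "train_micro_batch_size_per_gpu": 8,
    "zero_optimization": {"stage": 1, "contiguous_gradients": True},
}

new_config = flip_contiguous_gradients(base_config)
print(json.dumps(new_config, indent=2))
```

The resulting dict can be written out as the JSON config file passed to DeepSpeed. Whether the change helps a given model would still need to be measured.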