evanatyourservice / kron_torch

An implementation of PSGD Kron second-order optimizer for PyTorch
Creative Commons Attribution 4.0 International
16 stars 2 forks

Distributed Training? #1

Open skyshine102 opened 3 weeks ago

skyshine102 commented 3 weeks ago

Does the PSGD Kron optimizer work with FSDP or DeepSpeed?

opooladz commented 2 weeks ago

It certainly should, but not yet in torch; the JAX version is good to go. We're working on a distributed torch version right now.

skyshine102 commented 2 weeks ago

Looking forward to the feature! I'm a torch user and can test it once you have an initial release.