microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

[REQUEST] Any plan to refactor those huge classes? #1650

Open ghosthamlet opened 2 years ago

ghosthamlet commented 2 years ago

I don't know whether DeepSpeed is planning a large refactor; if not, below is my suggestion.

Is your feature request related to a problem? Please describe.
Thanks for this great project; DeepSpeed is wonderful and incredibly powerful. But I found that its Python code is full of large, even huge, classes. As DeepSpeed has become more powerful, the classes have grown larger and harder to read and maintain.

Describe the solution you'd like
I think that instead of adding more new features, it would be better to do more refactoring; otherwise, in the near future DeepSpeed will be hard to maintain, hard to debug, and hard to extend with even a small new feature. DeepSpeed mainly manipulates data and weights, so if it were refactored using DOP (data-oriented programming), the code should be much easier to read and maintain. There is a good book on data-oriented programming: https://livebook.manning.com/book/data-oriented-programming/chapter-1/v-13/

Describe alternatives you've considered
DOP can coexist with OOP, so the refactor can proceed in many small steps. Alternatively, just splitting the large classes into smaller, modular classes would also be great. I found that fairscale (https://github.com/facebookresearch/fairscale) has functionality similar to DeepSpeed's, but its code is much simpler.
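To make the DOP suggestion above more concrete, here is a minimal sketch of the style: state is kept as plain immutable data, and behavior lives in small pure functions that return new data. All names here (`OptimizerState`, `clip_grads`, `sgd_step`) are hypothetical and for illustration only; they are not DeepSpeed APIs and do not reflect how DeepSpeed is actually structured.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class OptimizerState:
    """Plain immutable data with no behavior attached (hypothetical example type)."""
    weights: Tuple[float, ...]
    step: int = 0


def clip_grads(grads: Tuple[float, ...], max_abs: float) -> Tuple[float, ...]:
    """Pure function: clamp each gradient to the range [-max_abs, max_abs]."""
    return tuple(max(-max_abs, min(max_abs, g)) for g in grads)


def sgd_step(state: OptimizerState, grads: Tuple[float, ...], lr: float) -> OptimizerState:
    """Pure function: return a new state instead of mutating the old one."""
    new_weights = tuple(w - lr * g for w, g in zip(state.weights, grads))
    return replace(state, weights=new_weights, step=state.step + 1)


# Each stage is a small function over data, so stages can be tested in isolation.
state = OptimizerState(weights=(1.0, 2.0))
grads = clip_grads((0.5, -4.0), max_abs=1.0)
new_state = sgd_step(state, grads, lr=0.1)
```

Because the original `state` is never modified, each function can be read, tested, and replaced on its own, which is the property that large classes with shared mutable attributes tend to lose.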

tjruwase commented 2 years ago

@ghosthamlet, thanks for this suggestion. Refactoring is very high on our wish list, but unfortunately we lack the bandwidth for it. If you are able to put some effort in this direction, we will gladly appreciate and support it.

ghosthamlet commented 2 years ago

Sorry, I still don't understand all the details of the DeepSpeed code; it's complex, and this year I am too busy, as I am the only ML researcher at my company (indeed, the company is tiny, with just two people: my boss and me). When I have more spare time, I hope to study more of the DeepSpeed code and then maybe try some refactoring.