apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.77k stars 6.79k forks

[RFC] Future of Apache MXNet #21206

Open szha opened 1 year ago

szha commented 1 year ago

Dear MXNet community,

I would like to start a conversation about the future of Apache MXNet and where we should head next.

What we built

Apache MXNet is an open-source deep learning framework, developed by contributors from multiple organizations, for training and deploying deep learning models. MXNet is known for its efficiency at scale. It supports multiple languages including Python, Scala, R, Julia, Perl, and more, which makes it accessible to a wide variety of developers and data scientists. Its core is written in C++ for performance, but it provides a flexible interface that allows users to write code in their preferred language.

Another important feature of MXNet is its support for both imperative and symbolic programming styles. Imperative programming (like PyTorch) is more intuitive and flexible, allowing users to write code as they would in standard Python, whereas symbolic programming (like Theano, TensorFlow) is more efficient in terms of runtime and memory usage and enables graph-level optimizations and auto-differentiation. Supporting both lets users choose the programming style that best suits their needs. The Gluon interface further attempts to unify these paradigms.
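To make the distinction concrete, here is a toy sketch in plain Python. It is only an illustration under stated assumptions: `Node`, `variable`, `evaluate`, and `count_ops` are hypothetical names made up for this example and are not MXNet APIs. The imperative function computes each step immediately, while the symbolic version first builds a graph of deferred operations that can be inspected before it is run.

```python
# Toy illustration only: none of these names are MXNet APIs.

# Imperative style: each operation runs immediately on concrete values.
def imperative_forward(x, w, b):
    h = x * w      # computed right away
    return h + b   # computed right away


# Symbolic style: build a graph of deferred operations first, run it later.
class Node:
    """A hypothetical deferred-expression node."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "var", "mul", or "add"
        self.inputs = inputs  # child nodes (empty for variables)
        self.name = name      # only set for variables

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __add__(self, other):
        return Node("add", (self, other))


def variable(name):
    """Create a placeholder that receives a value at evaluation time."""
    return Node("var", name=name)


def evaluate(node, bindings):
    """Walk the graph and compute its value for the given variable bindings."""
    if node.op == "var":
        return bindings[node.name]
    left, right = (evaluate(child, bindings) for child in node.inputs)
    return left * right if node.op == "mul" else left + right


def count_ops(node):
    """The whole graph exists before execution, so it can be analyzed."""
    if node.op == "var":
        return 0
    return 1 + sum(count_ops(child) for child in node.inputs)


# No arithmetic happens on this line; it only records the expression.
graph = variable("x") * variable("w") + variable("b")

imperative_result = imperative_forward(2.0, 3.0, 1.0)
symbolic_result = evaluate(graph, {"x": 2.0, "w": 3.0, "b": 1.0})
```

Both styles produce the same number here. The difference is that the symbolic graph is available as data before anything runs, which is what makes whole-graph analysis and optimization possible. In Gluon, calling `hybridize()` on a `HybridBlock` is the mechanism intended to move imperatively written code toward the symbolic style.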

In an attempt to address legacy issues, the community started developing MXNet 2.0 in 2020. Some of the updates include a new design for data loading in Gluon, a unified distributed data parallel interface, parameterizable probability distributions, a refactored MXNet np interface, and enhanced support for 3rd-party functionality. We've also improved the development process with a new CMake build system, a memory profiler, and more Pythonic exception handling.

Along the journey, many people who share our enthusiasm for deep learning joined our cause, and we managed to develop a community with 875 contributors, 87 committers and 51 PMC members. Many of the community members continue to play important roles outside of MXNet in the generative AI and deep learning system spaces. We graduated from the Apache Incubator in Sept 2022 to a top-level project. Our project is the culmination of the hard work of many people over the years.

Where we are

Since late 2022, code development has mostly halted and community engagement has slowed. Despite the boom in generative AI and related spaces such as deep learning frameworks and distributed training solutions, the project's current positioning is not enough to sustain its growth, especially in light of developments in the open source deep learning framework space.

What can we do

There are a few choices that we as a community can make: 1) we can continue on the current path by finding a critical mass to continue driving maintenance; 2) we can discuss alternative positions that MXNet can pivot to so that this community can bring value beyond existing offerings; 3) we can retire as an Apache TLP into the Attic.

justinmclean commented 1 year ago

It would be in the best interest of the community to discuss this. Given the low activity level and lack of discussion, I can't see how either 1 or 2 is possible. So from those options, it looks like the Attic is the only path, but I think there are two other options:

  1. Existing community members step up and are willing to contribute and provide project oversight.
  2. Retire but continue development outside of Apache.

If nothing further happens, I'll be recommending that the ASF Board votes for the project's retirement at the ASF Board September meeting.

EmilPi commented 1 year ago

I personally think that Apache MXNet is a very competitive framework in terms of performance and ease of use. I had lots of cases where I just could not run larger models on PyTorch and TensorFlow, but MXNet fit within the GPU constraints. It's a pity that major orgs do not use it extensively and do not support it. I would like to contribute more, but I lack the skills.

justinmclean commented 1 year ago

Unless the wider community is interested in reforming the PMC, the ASF board will vote on the retirement of MXNet at the next board meeting.

terrytangyuan commented 11 months ago

[image]

lgov commented 11 months ago

It would be great if - as a step in the procedure of moving to the Attic - this GitHub project would also be updated. For instance:

The goal is to make people not waste time on a project that's not maintained anymore.