apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.78k stars, 6.79k forks

[RFC] Documentation of MXNet #3504

Closed pluskid closed 7 years ago

pluskid commented 8 years ago

There have been some complaints about the documentation quality of MXNet. We actually have a lot of documents, but they are a bit scattered and can be difficult to locate. In this issue, we attempt to set up a framework so that we can better reorganize the existing documents and provide guidelines for people to contribute more documents.

Please feel free to add your comments and opinions below to help make the documentation better.

Main TODOs

The organization will be similar to the existing system.

The goal of Introduction is to get an overview of MXNet, and a portal to get started (download & installation).

Each tutorial should be a detailed, step-by-step document for something. A tutorial is an entry point for a new user without prior experience of MXNet. It should

IPython notebooks might be a good format for detailed runnable code with documentation. But we should add the scripts to our regression tests using tools (e.g. runipy) that can run notebooks from the command line.
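As a rough sketch of how such a regression check could be wired up: the helper names below (`find_notebooks`, `run_notebook`, `failing_notebooks`) are hypothetical, and the sketch assumes the chosen runner (runipy by default) is on the PATH, takes the notebook path as its last argument, and exits with a non-zero status when a cell fails.

```python
import subprocess
from pathlib import Path


def find_notebooks(root):
    """Return all .ipynb files under root, sorted, skipping checkpoint copies."""
    return sorted(
        p for p in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def run_notebook(path, runner=("runipy",)):
    """Run one notebook through a command-line runner.

    Assumption: the runner signals a failed cell with a non-zero exit code.
    """
    result = subprocess.run([*runner, str(path)], capture_output=True, text=True)
    return result.returncode == 0


def failing_notebooks(root, runner=("runipy",)):
    """Return the notebooks under root that did not run cleanly."""
    return [p for p in find_notebooks(root) if not run_notebook(p, runner)]
```

A CI job could then call `failing_notebooks("docs/tutorials")` (the path is only an illustration) and fail the build if the returned list is non-empty.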

HowTos target users who have a basic understanding of MXNet and would like to do a specific thing with it. They are organized by task. We have a rich examples repository that we could reuse.

For runnable examples, the documentation can be a README.md in that specific folder.

Organizations

Each component ideally should have

It would be very helpful if each API function had a brief example of how it is called, especially the automatically generated functions. The Concat operator is a good example. The document says it takes data as Symbol[] and num_args. For people not familiar with the calling convention, it is hard to figure out how to use this operator. Having the examples below greatly improves the document. It does not need to be a fully runnable example; a brief line like

cat = mx.sym.Concat(a, b, dim=dim)

is already useful. Currently the extra doc for Python operators can be attached via symbol_doc.py. Some examples of good API references with brief examples include: Python, numpy, Lasagne, Keras, etc.
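To illustrate the general idea of attaching a brief example to an automatically generated docstring, here is a minimal sketch. It is not how symbol_doc.py works; `attach_example` and `concat_stub` are hypothetical stand-ins, and the stub's docstring is made up for the illustration.

```python
def attach_example(func, example):
    """Append an 'Example' section to func's docstring and return func."""
    base = (func.__doc__ or "").rstrip()
    func.__doc__ = base + "\n\nExample\n-------\n" + example.strip() + "\n"
    return func


def concat_stub(*data, num_args=None, dim=1):
    """Stand-in for an auto-generated operator wrapper.

    Parameters
    ----------
    data : Symbol[]
    num_args : int
    """
    raise NotImplementedError("illustration only")


# The one-line example from the discussion, added after the generated text.
attach_example(concat_stub, "cat = mx.sym.Concat(a, b, dim=dim)")
```

After this runs, `help(concat_stub)` shows the generated parameter list followed by the calling example, which is the part that people unfamiliar with the calling convention are missing.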

FAQ

Existing system design doc, NNVM, etc.

We also need a doc for the instructions of making a release:

RogerBorras commented 8 years ago

It would be great if you can do the examples in R too!! :)

vchuravy commented 8 years ago

If possible the examples should be run as part of the test-suite so that we can make sure that they will always work.

tornadomeet commented 8 years ago

The MXNet docs should be put on a single website, and the site's table of contents should be organized so that users can search and query it conveniently. We can learn from the docs of other DL libraries.

The following docs look very well done to me:
tensorflow: https://www.tensorflow.org/
lasagne: https://lasagne.readthedocs.io/en/latest/
PaddlePaddle: http://www.paddlepaddle.org/
keras: https://keras.io/

For the API docs, we can take the torch nn convolution doc as an example: https://github.com/torch/nn/blob/master/doc/convolution.md

pluskid commented 8 years ago

@vchuravy Yes, doctest might be a good idea.
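A minimal sketch of what that could look like with Python's standard doctest module. The function `scale` and the helper `check_examples` are made up for illustration, and the numpy-style Parameters/Returns sections are just one possible layout.

```python
import doctest


def scale(values, factor=2):
    """Multiply every element of ``values`` by ``factor``.

    Parameters
    ----------
    values : list of numbers
    factor : number, optional

    Returns
    -------
    list
        A new list with each element multiplied by ``factor``.

    Examples
    --------
    >>> scale([1, 2, 3])
    [2, 4, 6]
    >>> scale([1.5], factor=2)
    [3.0]
    """
    return [v * factor for v in values]


def check_examples(func):
    """Run the docstring examples of func; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Expose only the function itself to the examples.
    globs = {func.__name__: func}
    test = parser.get_doctest(func.__doc__, globs, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

A test suite could call `check_examples` on each documented function and assert that the failed count is zero, so an example that drifts out of date breaks the build instead of misleading readers.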

VoVAllen commented 8 years ago

The current style of the mxnet doc website is not very friendly. There is no clear separation between input variables and output variables. I suggest using a default style such as the one Keras uses in the final docs.

piiswrong commented 8 years ago

There are Amazon folks working on this. Better to coordinate. @sandeep-krishnamurthy

sandeep-krishnamurthy commented 8 years ago

@pluskid: Great set of points. Doing this would greatly improve the experience for MXNet users. I agree with most of your suggestions on organizing the content and adding new content. Some other people at Amazon and I are working on a proposal for improving the docs page navigation and the organization of content.

Most of our current plan overlaps with your suggestions. I will post a follow-up here with more details (combining our ideas with people's suggestions in this thread) by end of day Friday.

At a high level, I would like to take this forward in multiple steps as below:

Can we create separate issues for each of the above tasks and track them?

Do let me know your thoughts and suggestions, team.

sandeep-krishnamurthy commented 8 years ago

Top Level

Get Started

Tutorials

Try to add more conceptual details.

How To

Needs to be organized as suggested by pluskid above.

FAQ

Remove build and install. It is redundant. A few things can move to the How To section. Remove the relation to CXXNet, Minerva, Purine2? Probably this section can then be renamed to "Troubleshooting"?

API

Deep Learning Concepts

Currently we have common deep learning concepts in the Architecture page. Move them here.

Architecture

Have MXNet system-specific details here. Also have sections on code walkthroughs etc. The objective is to enable readers to understand the system first, then the architecture of mxnet, and then link them to the appropriate code sections.

Community

I will create separate issues to track these tasks. Do let me know what you guys think about the proposal.

We can iteratively fix section by section.

sandeep-krishnamurthy commented 8 years ago

Submitted PR for

https://github.com/dmlc/mxnet/pull/3526

sandeep-krishnamurthy commented 8 years ago

We should also have a Glossary section. Each term would get a one-line description and probably a link to more detailed documentation for that term (if available).

piiswrong commented 8 years ago

Deep Learning concept seems too verbose.

BTW, can someone who knows CSS/HTML help adjust the font and line spacing for our docs? I think we need a different font and larger line spacing (especially between title and content). Reference: https://keras.io/layers/core/

sandeep-krishnamurthy commented 8 years ago

Then maybe "Deep Learning Basics"? Or "Basics"?

pluskid commented 8 years ago

I feel "Deep learning concepts" and "Architecture" can be merged, because they both describe the mxnet internals.

pluskid commented 8 years ago

@VoVAllen I share the same opinion with you. Currently it is quite confusing which part is parameters and which part is return values.

sandeep-krishnamurthy commented 8 years ago

Major re-organization of content in "get_started", "tutorials", "architecture", "community" pages submitted. PR - https://github.com/dmlc/mxnet/pull/3545

pluskid commented 8 years ago

We now have a HowTo tag for issues. We will tag issues that look like common practice with it, and periodically go through the tagged issues and summarize the solutions into the HowTo or FAQ section of the docs.

statist-bhfz commented 7 years ago

I would suggest the bookdown package for creating tutorials. It looks very nice to me, and you can find great examples like https://topepo.github.io/caret/index.html