Need Script/Example/API-specific Contribution Guidelines

szha commented 6 years ago

We currently have different implicit requirements for scripts, examples, and APIs, but our contribution guideline is still a generic how-to. https://github.com/dmlc/gluon-nlp/blob/master/docs/how_to/contribute.rst

Let's discuss the requirements for each of these types of contributions so that we can document them in our contribution guidelines.

@sxjscience @szhengac @leezu @cgraywang @astonzhang @eric-haibin-lin @piiswrong @mli

eric-haibin-lin commented 6 years ago

+1. Will be useful for accepting SoC code.

szha commented 6 years ago

@eric-haibin-lin made the following guidelines for notebooks:


Notebook guidelines:

- Less is better. Only show the code that needs people's attention.
- Try to have < 10 lines of code per block and < 100 lines of code per notebook.
- Hide uninteresting complex functions in a .py file and import them.
- Hide uninteresting model parameters. We can make some of them default parameters in the model definition. Maybe out of 30 we just show the 5 interesting ones and pass those to the model constructor.
- Have a block up front about the key takeaway of the notebook.
- Only import modules instead of classes and functions (i.e. `from gluonnlp import model` and use `model.get_model`, instead of `from gluonnlp.model import get_model`).
- Make tutorials more interactive and prepare practice questions for people to try out. For example, for embedding evaluation, we can ask the audience questions like "what's the most similar word to xxx?"
- Make sure the notebook can be zoomed in and renders well, so that people can see it clearly from the back of the room (this usually means a row limit of 80 characters or less).
- For low-level APIs such as BeamSearch and Scorer, explain the API with examples so people know how to play with it / hack it.
- Explain the motivation of the notebook to guide readers. Add figures if they help.

For example, the following code block can be reduced by specifying default parameters and showing only the interesting ones:

    num_units = 512
    hidden_size = 2048
    dropout = 0.1
    epsilon = 0.1
    num_layers = 6
    num_heads = 8
    scaled = True
    encoder, decoder = get_transformer_encoder_decoder(units=num_units,
                                                       hidden_size=hidden_size, dropout=dropout,
                                                       num_layers=num_layers, num_heads=num_heads,
                                                       max_src_length=530, max_tgt_length=549,
                                                       scaled=scaled)

vs.

    num_layers = 6
    num_heads = 8
    scaled = True
    encoder, decoder = get_transformer_encoder_decoder(num_layers=num_layers,
                                                       num_heads=num_heads,
                                                       scaled=scaled)
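
The default-parameter guideline above could be sketched roughly as follows. This is only an illustration: `transformer_kwargs` is a hypothetical helper, not a gluon-nlp function, and the default values are simply copied from the longer example in this comment.

```python
def transformer_kwargs(num_layers=6, num_heads=8, scaled=True,
                       units=512, hidden_size=2048, dropout=0.1,
                       max_src_length=530, max_tgt_length=549):
    """Hypothetical helper (not part of gluon-nlp).

    Collects constructor arguments in one place. The parameters a reader
    rarely needs to change are given defaults, so a notebook only has to
    mention the interesting ones.
    """
    return {
        "num_layers": num_layers,
        "num_heads": num_heads,
        "scaled": scaled,
        "units": units,
        "hidden_size": hidden_size,
        "dropout": dropout,
        "max_src_length": max_src_length,
        "max_tgt_length": max_tgt_length,
    }


# In the notebook, only the interesting parameters are spelled out;
# everything else falls back to the defaults defined above.
kwargs = transformer_kwargs(num_layers=6, num_heads=8)
print(sorted(kwargs))
```

The resulting dictionary could then be unpacked into a model constructor with `**kwargs`, assuming the constructor accepts those keyword names.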

astonzhang commented 6 years ago

> Try to have < 10 lines of code per block, < 100 lines of code per notebook.

Not sure if this is feasible.

eric-haibin-lin commented 6 years ago

@astonzhang That was added in the context of KDD notebook presentation. Otherwise the code block doesn't fit in one screen. We can relax it a bit when it comes to notebooks on the website.
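
Limits like these could also be checked mechanically, with the thresholds relaxed per use case. A minimal sketch, assuming notebooks are stored in the standard nbformat v4 JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`); the names `check_notebook` and `check_notebook_file` and their parameters are hypothetical and not part of gluon-nlp:

```python
import json


def check_notebook(nb, cell_limit=10, total_limit=100, max_width=80):
    """Return a list of warnings for a parsed .ipynb dictionary.

    cell_limit and total_limit are exclusive ("fewer than"), matching the
    "< 10 lines per block, < 100 lines per notebook" wording; max_width is
    inclusive ("80 or less"). Blank lines are not counted.
    """
    warnings = []
    total = 0
    for idx, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # nbformat stores 'source' either as one string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        lines = [ln for ln in text.splitlines() if ln.strip()]
        total += len(lines)
        if len(lines) >= cell_limit:
            warnings.append("cell %d: %d code lines (want fewer than %d)"
                            % (idx, len(lines), cell_limit))
        for ln in lines:
            if len(ln) > max_width:
                warnings.append("cell %d: line longer than %d characters"
                                % (idx, max_width))
                break
    if total >= total_limit:
        warnings.append("notebook: %d code lines (want fewer than %d)"
                        % (total, total_limit))
    return warnings


def check_notebook_file(path, **limits):
    """Load a notebook from disk and run check_notebook on it."""
    with open(path, encoding="utf-8") as f:
        return check_notebook(json.load(f), **limits)


# Small in-memory example: one markdown cell, one short code cell,
# and one code cell with 12 lines.
example = {"cells": [
    {"cell_type": "markdown", "source": ["# Title\n"]},
    {"cell_type": "code", "source": ["import gluonnlp\n", "x = 1\n"]},
    {"cell_type": "code",
     "source": "\n".join("y%d = %d" % (i, i) for i in range(12))},
]}
found = check_notebook(example)
for message in found:
    print(message)
```

Presentation notebooks could use the defaults, while website notebooks could pass looser values, e.g. `check_notebook_file(path, cell_limit=20)`.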

leezu commented 6 years ago

I suggest following the spirit of the guidelines on the website too. Overly long notebooks on the website will probably not be read anyway and may just scare off potential users.

szha commented 6 years ago

I'm incorporating it in #383

dmlc / gluon-nlp

Need Script/Example/API-specific Contribution Guidelines #269