standardize on ML-specific code conventions

The "coding style" document produced out of the committee stuff did a pretty good job of things (save for the brewing disagreement over module renaming) but in order that we don't end up with think we need to discuss and standardize on things like

naming conventions for method/function arguments, learnable parameters, hand-set hyperparameters
naming conventions for classes
naming conventions for other variables
use of acronyms, greek letters, etc.

There's a fair amount of disagreement about a number of these issues, but I wanted to assemble therm in one place, along with comments for/against. I've tried to keep this post neutral in town and not colour it with my own opinion, but please add comments both expressing your view on a certain issue as well as correcting anything you feel I've unfairly represented.

Method/function arguments

Should they be named simple one letter names (X, y), something longer and more descriptive (inputs, labels), something else entirely?
Arguments for X, y etc.: makes complicated mathematical expressions easier to read, might correspond with usage in paper describing algorithm
Arguments for inputs, labels, etc.: perhaps clearer to novices, complicated expressions could be broken up in terms of temporaries
Learnable parameters
Should these be distinguished in some particular way, other than their membership in the _params attribute? e.g., scikits-learn uses a convention of suffixing all learnable parameter object members with a single underscore, e.g. self.weights_. Of course pylearn2 parameters will be shared variables rather than numpy arrays, so perhaps this particular convention should be avoided to prevent confusion, but should we adopt something similar?
Hyperparameters
Should they be named by their putative function (sparsity_penalty), by the name assigned to them in a particular paper (eta, epsilon), by some other convention? One complicating factor is that lambda is a reserved word in Python, so you can't name an object member lambda (just like you can't name it 'for' or 'else').
Other variables
Should any particular conventions apply?
Class Names
Models are currently typically named their typical name in the literature (DenoisingAutoencoder, ContractiveAutoencoder, BinaryGaussianRBM). Preprocessing varies between verb-phrases (ExtractPatches), noun-phrases (LocalContrastNormalization), etc. Should the object be named as a ``doer'' of whatever it does (e.g. PatchExtractor, LocalContrastNormalizer), the noun phrase of the accomplished processing (PatchExtraction, LocalContrastNormalization), or what?
Typically underscores are left off in favour of camelcase for class names (as denoted in PEP8). Should we adhere to this convention?
Acronyms
Certain acronyms are in wide use and perhaps unambiguous enough (e.g. RBM). Others are overloaded and potentially more opaque (i.e. SMD for "Score Matching Denoising", "Stochastic Meta-Descent", "Stochastic Mirror Descent"). How should we deal with acronyms? Should there be a whitelist (certain acronyms are okay, others aren't)? Or should it be up to the user to dig through docstrings?

Having coding conventions that everyone follows is a gain of time in the long term, due to making it easier to understand and update the code base.

I think it's a bit fascist to make style rules beyond PEP8 / Google Python Style. Please remember that we are all busy scientists. Time spent arguing about underscores (or should we actually make these rules, time enforcing adherence of underscore guidelines) is time not spent doing research or improving our tools.

That entirely depends on our reasons for putting together a library. If the point is to be one or two people's personal playground, then yes, conventions be damned.

If the point is for the library to grow into a well-documented library of readable code that

more than two people are going to use,
will eventually grow beyond the lab,
will be used by people who are not deep learning experts or are new to the field,
will be widely adopted,
have parts spun off into subpackages and/or plugins,
be useful as examples in the classroom or in lecture notes,
have code flowing back and forth into other complementary ML packages,

then yes, all this stuff matters, and I'm annoyed that you try to trivialize it as "arguing about underscores". As far as I can tell, time spent on code quality is time spent improving our tools.

I don't care what anyone does in their own sandbox for their own experiments. They can program in obfuscated Fortran for all I care. I certainly have my share of code that I wouldn't be proud of sharing. If you're going to put it in a library that's intended to be written by more than a couple of people and have it be useful for more than a couple of people, then you should care that it's as accessible as possible at the source level and a "finished product".

NumPy made these mistakes and spent hundreds of man hours and a bunch of University of Florida grant money correcting them. SciPy made these mistakes and is still correcting them, subpackage by painful gruelling subpackage. Theano made these mistakes and is in the awkward situation of growing beyond the lab in its use with mountains of untested, often unreadable and thoroughly undocumented code. All three of these packages are and have been extremely useful to many people but were written by "busy scientists" who refused to see what they were doing as engineering, worth doing right the first time.

Good, library-worthy code is at least, in part, self documenting. That implies code style, that implies convention. That implies the use of intermediate variables so as not to cram too much on one line. That includes descriptive identifiers, or at least a clear convention that when we say X we are talking about a design matrix of dimension 2. That implies reviewed code, where reviews bring up everything from style to semantics to interface to documentation to testing. We haven't been using pull requests to review code for pylearn yet, but we should start, partly because it's important and leads to higher quality code, and partly because it will involve more people in the development process -- others will have to read code anyway in order to contribute in a sensible way, they might as well be reviewing in the process.

Many, many projects have made the decision to accumulate code and "features" at the expense of a high quality codebase, Theano included, and it makes things harder in the long run. I'm not saying that any one of my opinions on the above subjects is the right one but I do think they're all of some importance if we want pylearn2 to be anything more than a pet project full of partially complete, hard to read implementations that are hard to even find because one doesn't know what name to look for. @GaelVaroquaux has been a taskmaster about code style on scikits-learn, and it has paid off -- the code quality of the vast majority of the project is extremely high and contains many implementations I would happily point a colleague to as reference implementations of a particular algorithm, something to which pylearn2 should aspire.

Interesting thread. Hits at the heart of the challenge of developing reusable code, and doing productive research. It's hard to balance and I think they are often conflicting goals. Policies and recommendations are great to have around, but I think judgement on a case by case basis is required.

I think that the following things go together: shared code, code review, good quality, and slow movement.

Sometimes that's good, but not every project should be like that.

If you're trying to do an experiment. You want to be fast, just hack something and try it. You need fast turnaround, so you have your own forks & projects for that. They are comprehensible only to you, and you don't intend to keep them long term.

Then, for whatever reason, you want someone else to use that code... then you have to explain, document, and the other person has to read, and document. I think in practice, that this second stage of code life is the first one where you want to start 'caring about underscores'. It's not a mistake not to have the style right in the first phase... that's normal. In the second stage you should expect to rewrite all your phase 1 code. Don't feel bad about it. It's a hell of a lot faster to rewrite and document code than it is to find an algorithm & design worth sharing in the first place. The result of the phase 2 rewrite should be the sort of thing to get pull-requested to pylearn trunk. No one wants to be stuck doing code review on someone's unintelligible phase 1 hacking, and probably no one will ever need to.

I think it's a bit fascist to make style rules beyond PEP8 / Google Python Style.

Nice, Godwin's law

http://en.wikipedia.org/wiki/Godwin%27s_law

Please remember that we are all busy scientists. Time spent arguing about underscores (or should we actually make these rules, time enforcing adherence of underscore guidelines) is time not spent doing research or improving our tools.

In my opinion, this is a flawed argument: I am a busy scientist, the time I spent trying to memorize where you have put underscores and where you have not is time I spend not doing research.

IMHO, the real reasons for strict coding rules is because:

It forces everybody to use the same conventions: different people end up with similar code without even having to discuss things.
It makes a consistent API. I am to memorize that it is numpy.logaddexp, but numpy.logical_and. This gives an extra cognitive load that gives no added value to the package, and makes it harder for me to learn. And you should remember that the biggest barrier to sharing code is that people hate learning other's packages. That a source of a lot of time wasted and a lot of science not done.

Finally, the reason to choose to underscores is because it makes names more readable. This stroke me the day I was look at a function called mkufunc. I couldn't parse this name, until I realized that it was mk_ufunc, and then it started making sens.

Underscores are just an example. In the scikit, we have found out that the more conventions, the easier it is to have shared ownership of code.

lisa-lab / pylearn2

standardize on ML-specific code conventions #13

Method/function arguments

Learnable parameters

Hyperparameters

Other variables

Class Names

Acronyms