DynaSim / DynaSim

DynaSim toolbox for modeling and simulating dynamical systems
https://dynasim.github.io
MIT License

Create and enforce Development Style Guide #19

Closed asoplata closed 7 years ago

asoplata commented 8 years ago

In the, um, forthcoming documentation, or at least somewhere for the time being, we should agree to follow a Style Guide for both the code and activity on the Git repository.

Code

When I say code, I also mean documentation specifically. Once I implement the Sphinx-style documentation format for all the actual MATLAB code, I expect us to stick to that format. In any case, I'll eventually provide a bash script to build the documentation automatically; running it before you commit should report any doc gen errors if you've made a mistake.

In terms of actual MATLAB programming style, I think this is less important than documentation or Git style. Feel free to add to the discussion of what this should be. So far, I think following Jason's style as it exists is a good idea, and he's been programming with a few principles in mind: KISS (Keep It Simple...), and I'd add that a function should do one thing, and do it well.

Git

Commit style

Tim Pope has one of the most succinct explanations of what have become Git-commit-best-practices across much of the internet. The main things are

  1. Keep the main commit title/message at or under 50 characters! DO THIS or else GitHub will NOT interpret/display the commit message correctly! (E.g. here, where the commit title runs over into the description)
  2. Use "imperative verbs" in the commit titles like "Fix documentation". This way, when you apply a commit, its title reads as the thing it does to the code.
  3. Follow up a short commit title with longer, explanatory text after an empty line below, and preferably wrap this text to 72 characters so it displays nicely everywhere (and so people don't have to click on the horizontal slider thing to read your stuff).
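
As a rough illustration of rules 1 and 3 above, here is a minimal bash sketch of a checker for the length and blank-line conventions. The function name `check_commit_msg` is made up for this example; it is not part of Git or DynaSim, and it does not try to check the "imperative verb" rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DynaSim or Git): checks a commit message
# file for the 50-character title, empty second line, and 72-character body
# wrap, and reports every violation it finds on stderr.
check_commit_msg() {
  local file="$1" line n=0 status=0
  while IFS= read -r line || [ -n "$line" ]; do
    # Ignore lines starting with '#', which Git treats as comments by default.
    case "$line" in '#'*) continue ;; esac
    n=$((n + 1))
    if [ "$n" -eq 1 ] && [ "${#line}" -gt 50 ]; then
      echo "title is ${#line} characters (limit 50)" >&2; status=1
    elif [ "$n" -eq 2 ] && [ -n "$line" ]; then
      echo "second line must be empty" >&2; status=1
    elif [ "$n" -gt 2 ] && [ "${#line}" -gt 72 ]; then
      echo "body line $n is ${#line} characters (wrap at 72)" >&2; status=1
    fi
  done < "$file"
  if [ "$n" -eq 0 ]; then
    echo "commit message is empty" >&2; status=1
  fi
  return "$status"
}
```

To try it as a hook, a script that defines this function and ends with `check_commit_msg "$1"` could be saved as `.git/hooks/commit-msg` and made executable; Git passes the path of the message file as the first argument.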

Here is more best practice information than you could ever wish for.

Collaboration / Workflow

Jason and I discussed this at length back in the prehistoric days of DNSIM. Because this is a smaller project (<10 active developers), we initially decided that a simplified version of the full git-flow model, with just master, dev, and individual/issue-based feature branches, was the way to go, the only additional constraint being that you should submit a Pull Request before merging back into dev. Anyone can publicly make a Pull Request, of course, but keeping the work on branches inside a single repo makes it easier for a small team to make bigger changes themselves, e.g. Jason doesn't himself have to accept all Pull Requests. This method generates a lot of information about the development path, which we want, since otherwise it can be unclear at times why someone is doing something, and we're not dealing with so much development that understanding all the changes simultaneously is a problem. This is sort of a merge of the git-flow model and the confusingly similar GitHub-flow model.
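
For anyone who hasn't used this kind of branching before, here is a sketch of one master/dev/feature cycle, run in a throwaway repo so nothing real gets touched. Everything named here is an assumption for illustration: the function name, the feature branch name, the `STYLE.md` file, and the demo identity. The Pull Request itself happens on GitHub, so it only appears as a comment.

```shell
#!/usr/bin/env bash
# Sketch only: all names below are invented for illustration.
# Runs one simplified git-flow cycle in a new repo created at the path $1.
demo_simplified_git_flow() {
  local repo="$1"
  local feature="feature/issue-19-style-guide"   # hypothetical branch name
  git init -q "$repo" || return 1
  # Throwaway identity so commits work without any global Git config.
  local g=(git -C "$repo" -c user.name=Demo -c user.email=demo@example.com)

  # Make sure the first branch is called master, whatever the local default is.
  "${g[@]}" symbolic-ref HEAD refs/heads/master || return 1
  "${g[@]}" commit -q --allow-empty -m "Add initial commit" || return 1

  # dev branches off master; each feature branch branches off dev.
  "${g[@]}" checkout -q -b dev || return 1
  "${g[@]}" checkout -q -b "$feature" dev || return 1

  echo "Keep commit titles at or under 50 characters." > "$repo/STYLE.md"
  "${g[@]}" add STYLE.md || return 1
  "${g[@]}" commit -q -m "Add commit style notes" || return 1

  # On GitHub you would push the branch and open a Pull Request against dev
  # here. Once it is approved, the merge is roughly equivalent to:
  "${g[@]}" checkout -q dev || return 1
  "${g[@]}" merge -q --no-ff -m "Merge $feature into dev" "$feature" || return 1
}
```

The `--no-ff` merge is a choice made for this sketch: it keeps a merge commit on dev even when a fast-forward would be possible, which preserves the "information about the development path" mentioned above. master only moves when dev is merged into it at a milestone.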

That said, we aren't married to this model, and the branches in DNSIM kind of...spiraled out of control as people made user-specific and archaic branches that eventually had little hope of being merged back into master or dev, not to mention the fact that, because there were no clear milestones, it was never clear at what points work would be done to merge the active dev branch into the dead master. Learning from that experience, using the git-flow model as we are currently supposing places much emphasis on having few, large updates to the master branch that are supposed to be dependable, tested, production-ready updates. Jason's medium-term software plans for DynaSim more easily support this model.

If we were adding significant features all the time, and had a very good automated testing apparatus like Travis CI a la "continuous integration" (the closest thing for MATLAB seems to be something called Jenkins), then the more willy-nilly, constant-changes-to-master-everyday "GitHub-flow" model would be better, but this requires much more discipline about handling merge conflicts all the time, etc. We're open to ideas about any model.

One possible change to the "git-flow" model is to do it the "public GitHub way", which IIRC GitHub promotes: instead of being added as admin to a specific repo, you "fork" a copy of that repo, and then when you want to commit changes to the original repo "upstream", you send a public pull request. This would make it easier for people to make their own personalizing changes to their copy of the DynaSim code while still being able to selectively send back helpful update code. The problem with this...is the benefit from it: people may find it so easy to personalize their code but never send anything upstream that everyone ends up running fragmented, irreconcilable copies of the code. You can always pull changes from the original repo into your fork, but if the history of the DNSIM commits is any indication, it's very possible people will end up writing a whole bunch of nice code that works with their custom DynaSim...only to have it be so incompatible that it never makes it back to the core, "upstream". Doing things "branch"-style instead of "fork"-style helps with this, although it does require us to enforce coherence among the branches, i.e. no weird custom personalized version-branches. If something requires strong personalization, then that means we need to have an agreed-upon way to personalize it that EVERY install should follow. E.g., require a bash/CLI environment variable to be set when you install that determines where your mechanism files are, or whatever. One-off personalizations make it harder for other people to use that code, and any effort put into individual personalizations could instead be put into helping to make it work for everyone, a la communist utopia-style!
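
To make the environment-variable idea a bit more concrete, here is one way it might look on the bash side. Note that `DYNASIM_MECHANISM_PATH` and `resolve_mechanism_dir` are names invented for this sketch; nothing like them exists in DynaSim as far as this thread says.

```shell
#!/usr/bin/env bash
# Hypothetical: DYNASIM_MECHANISM_PATH is an invented variable name, not an
# existing DynaSim setting. Prints the mechanism directory, or fails loudly
# if the variable is unset or does not point at a directory.
resolve_mechanism_dir() {
  if [ -z "${DYNASIM_MECHANISM_PATH:-}" ]; then
    echo "DYNASIM_MECHANISM_PATH is not set" >&2
    return 1
  fi
  if [ ! -d "$DYNASIM_MECHANISM_PATH" ]; then
    echo "DYNASIM_MECHANISM_PATH is not a directory: $DYNASIM_MECHANISM_PATH" >&2
    return 1
  fi
  printf '%s\n' "$DYNASIM_MECHANISM_PATH"
}
```

The point of failing when the variable is missing, rather than silently falling back to some per-user default, is that every install is then personalized in the same agreed-upon way.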

One thing we haven't worked through is a clear separation between what is considered part of the canonical repo and what individual users do with private, user-specific code like custom private mechanisms or analysis code. You can't have two different git repos in the same directory IIRC, but we do want to provide a DynaSim install with some starter cell mechanisms, etc...and we also want to make it easy for people to share their own mechanisms without having to worry about conflicts/etc. with the DynaSim GitHub repo. How we manage this will probably impact our choice of Git development model.

asoplata commented 7 years ago

This has been created on the wiki here.