jamesdavidson closed 8 years ago
Great suggestion! Where do you think this info is best stored?
There are basically three options: as text files in the repo, as comments in the source, or as pages in the wiki. Let's start with text files in the repo. I've opened PR #9.
I've just started putting together some auto-generated documentation via Sphinx. For it to work properly, the docstrings for all functions will need to be updated. More progress, anyway!
The ReadTheDocs site, which is built with Sphinx, now documents the most relevant parts of corpkit fairly thoroughly. Closing!
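For anyone updating docstrings, here is a rough sketch of the reStructuredText field-list style that Sphinx's autodoc extension can pick up. The function `count_words` and its parameters are made up for illustration and are not part of corpkit.

```python
def count_words(text, lowercase=True):
    """Count how often each whitespace-separated word occurs in *text*.

    NOTE: hypothetical example function, not part of corpkit.

    :param text: the raw text to count words in
    :type text: str
    :param lowercase: fold words to lower case before counting
    :type lowercase: bool
    :returns: mapping from word to number of occurrences
    :rtype: dict
    """
    if lowercase:
        text = text.lower()
    counts = {}
    for word in text.split():
        # Start from 0 the first time a word is seen, then increment.
        counts[word] = counts.get(word, 0) + 1
    return counts
```

The `:param:`, `:type:`, `:returns:` and `:rtype:` fields are what end up as the parameter and return sections in the generated pages.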
Looking at this as a programmer, the first thing I want to understand is the data structures. What are the inputs, outputs and intermediaries? Usually all that's necessary for this kind of documentation is a sketch of how the various entities are mapped to basic constructs like sets, lists, maps, tuples, booleans, numbers, strings, symbols or nested variants of the same (i.e. trees and the like). And perhaps a note about how they get encoded in files (e.g. CSV, Penn Treebank or Python pickles).
For example, I jotted down some notes (https://github.com/jamesdavidson/corpkit/blob/hacking/DATA.md) whilst going through the code. If you'd like, I can help you write this kind of documentation.
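To give a feel for the kind of sketch I mean, here is a minimal illustration in Python. The type names (`Token`, `Sentence`, `Corpus`) and the function `count_tags` are assumptions invented for this example; they are not taken from corpkit's code and may not match how it actually represents things.

```python
from typing import Dict, List, Tuple

# Hypothetical shapes for illustration only; not corpkit's actual structures.
Token = Tuple[str, str]              # (word, part-of-speech tag)
Sentence = List[Token]               # a sentence is an ordered list of tokens
Corpus = Dict[str, List[Sentence]]   # subcorpus name -> its sentences


def count_tags(corpus: Corpus) -> Dict[str, Dict[str, int]]:
    """Map each subcorpus name to a tag -> frequency table (nested maps)."""
    result: Dict[str, Dict[str, int]] = {}
    for name, sentences in corpus.items():
        counts: Dict[str, int] = {}
        for sentence in sentences:
            for _word, tag in sentence:
                counts[tag] = counts.get(tag, 0) + 1
        result[name] = counts
    return result


# Input: a map of subcorpus names to lists of tagged sentences.
example: Corpus = {
    "2001": [[("the", "DT"), ("cat", "NN"), ("sat", "VBD")]],
    "2002": [[("dogs", "NNS"), ("bark", "VBP")], [("the", "DT"), ("dog", "NN")]],
}

# Output: a map of maps, one frequency table per subcorpus.
tag_counts = count_tags(example)
```

Even a few lines like the type aliases at the top tell a reader what goes in and what comes out, which is most of what I'm after.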