Use descriptive names for serialization schemas

david-ragazzi commented 9 years ago

I'm working on porting nupic serialization to capnproto and have some questions about field names. For the moment, I'd leave the names as they are in order to speed up the review of my PRs. But my intention (once the serialization work is done but not yet officially introduced) is to create a PR solely to standardize the fields of all schemas and make them more descriptive (and, in the future, to do the same for their respective class members), avoiding rework. For example, currently SpatialPoolerProto uses synPermMax while TemporalMemoryV2Proto (TemporalMemory) uses permanenceMax and TemporalMemoryV1Proto (TP) uses permMax. So maybe it's better to use something like synapsePermanenceMax in all schemas. Furthermore, names like numNewSynapsesPerSegment sound better than newSynapsesCount, and then there is the infamous "w" in ScalarEncoder. In the beginning, this refactoring will be limited to schema fields, but the idea is to extend it to class member names. I think this should be done as soon as possible (i.e. while the new serialization is not official).

david-ragazzi commented 9 years ago

Here is an initial dictionary for schemas, where the first column is the new standardized field name and the second column lists the current field names.

verbosity -> spVerbosity, tpVerbosity, etc (except when necessary to avoid conflicts)
random -> rng, rgen, etc
synapse -> syn
col -> columns (except when it represents a well-known computational term like the "col" of a table)
numColumns -> numberOfCols, numCols, etc
numCellsPerCol -> cellsPerCol
maxSegmentsPerCell -> maxSegsPerCell
segmentActivationThreshold -> activationThreshold
segmentMinThreshold -> minThreshold
segmentUpdate -> segUpdate, update, etc
sourceCellIdx -> srcCellIdx (in this case source doesn't represent a variable involved in copy or linkage operations, for which we usually use the src and dst abbreviations)
numNewSynapsesForSegment -> newSynapsesCount
maxSynapsesPerSegment -> maxSynsPerSeg
synapsePermanence -> perm
synapsePermanenceInitial -> symPermInit, initPerm, etc
synapsePermanenceConnected -> synPermConnected, permConnected, etc
synapsePermanenceMax -> synPermMax, permMax, etc
synapsePermanenceDecrement -> synPermDec, permDec
synapsePermanenceIncrement -> synPermInc, permInc
inferenceBacktrack -> infBacktrack
learnBacktrack -> lrnBacktrack
learnedSequenceLength -> lrnSeqLen
learnPredictedState -> lrnPredState
inferenceActiveState -> infActiveState
numIterations -> iterations, iterationNum, etc
numIterationsWithLearning -> iterationLearnNum, etc
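A dictionary like the one above could be applied mechanically when migrating data between old and new field names. The following is only a minimal sketch: `LEGACY_TO_STANDARD` and `standardize_fields` are hypothetical names that are not part of nupic, and only a handful of the pairs listed above are included.

```python
# Hypothetical helper, not part of nupic: maps some current schema field
# names to the standardized names proposed in the dictionary above.
LEGACY_TO_STANDARD = {
    "synPermMax": "synapsePermanenceMax",
    "permMax": "synapsePermanenceMax",
    "synPermConnected": "synapsePermanenceConnected",
    "permConnected": "synapsePermanenceConnected",
    "maxSegsPerCell": "maxSegmentsPerCell",
    "newSynapsesCount": "numNewSynapsesForSegment",
    "lrnSeqLen": "learnedSequenceLength",
}


def standardize_fields(fields):
    """Return a copy of `fields` with legacy keys renamed.

    Keys without a mapping are kept unchanged. A ValueError is raised if
    two keys end up with the same standardized name.
    """
    result = {}
    for name, value in fields.items():
        new_name = LEGACY_TO_STANDARD.get(name, name)
        if new_name in result:
            raise ValueError("duplicate field after renaming: " + new_name)
        result[new_name] = value
    return result


if __name__ == "__main__":
    print(standardize_fields({"synPermMax": 1.0, "numColumns": 2048}))
```

Because several current names collapse onto one standardized name, the sketch refuses to merge them silently; a real migration would need to decide per schema which value wins.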

Conventions adopted:

Use shortened names only for abbreviations that are well known in computing, such as num, min, max, pct, idx, stats, etc.
Avoid shortened names for terms specific to neuroscience and/or canonical in HTM: syn, seg, lrn (learn), inf (infer), seq (sequence), etc.
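These two conventions could be checked automatically. The sketch below is an assumption about how such a check might look, not an existing nupic tool: `DISCOURAGED`, `split_camel_case` and `discouraged_parts` are hypothetical names, and the set contains only the abbreviations named above.

```python
import re

# Abbreviations the conventions above ask to avoid (taken from the list;
# the "etc." in the original means this set is not exhaustive).
DISCOURAGED = {"syn", "seg", "lrn", "inf", "seq"}


def split_camel_case(name):
    """Split a camelCase identifier into lowercase words."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return [w.lower() for w in words]


def discouraged_parts(name):
    """Return the words in `name` that are discouraged abbreviations.

    A trailing "s" is ignored so that plurals such as "segs" are caught.
    """
    found = []
    for word in split_camel_case(name):
        singular = word[:-1] if word.endswith("s") else word
        if word in DISCOURAGED or singular in DISCOURAGED:
            found.append(word)
    return found


if __name__ == "__main__":
    for field in ["synPermMax", "maxSynsPerSeg", "synapsePermanenceMax"]:
        print(field, discouraged_parts(field))
```

Well-known abbreviations such as num, min, max or idx are deliberately absent from the set, so names like numColumns pass the check.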

fergalbyrne commented 9 years ago

Great start. Here's my iteration. Key: (short prefix), [collection], nPlural is a number.

verbosity -> spVerbosity, tpVerbosity, etc (except to avoid conflicts)
random (rand-) -> rng, rgen, etc
synapse (syn-) [synapses] -> syn
column (col-) [columns (cols)] -> columns
nColumns
nCellsPerColumn
maxSegments
activationThreshold
minThreshold
segmentUpdate // not clear, needs to be improved
preSynapticIndex
nNewSynapses -> newSynapsesCount
maxSynapses -> maxSynsPerSeg
permanence -> perm
initPermanence -> symPermInit, initPerm, etc
connectedPermanence -> synPermConnected, permConnected, etc
maxPermanence -> synPermMax, permMax, etc
permanenceDec -> synPermDec, permDec
permanenceInc -> synPermInc, permInc
inferenceBacktrack -> infBacktrack
learnBacktrack -> lrnBacktrack
learnedSequenceLen -> lrnSeqLen
learnPredictedState -> lrnPredState // not sure what this is (in either case)
inferenceActiveState -> infActiveState
nIterations -> iterations, iterationNum, etc
nLearningIterations -> iterationLearnNum, etc

I suggest we use iCell, iCol/iColumn, iSegment, etc. when looping over arrays by index, and cell, col/column, segment, etc. where the loop variable is an actual cell, column, and so on.
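As an illustration of this loop-variable suggestion, here is a small sketch with made-up data; `cell_indices` and `collect_cells` are hypothetical functions, not nupic code.

```python
# Made-up data for illustration: each column is a list of cell labels.
columns = [["a", "b"], ["c", "d", "e"]]


def cell_indices(columns):
    """Loop by index: the i-prefixed names hold positions, not objects."""
    indices = []
    for iColumn in range(len(columns)):
        for iCell in range(len(columns[iColumn])):
            indices.append((iColumn, iCell))
    return indices


def collect_cells(columns):
    """Loop by element: the plain names hold the actual column and cell."""
    cells = []
    for column in columns:
        for cell in column:
            cells.append(cell)
    return cells


if __name__ == "__main__":
    print(cell_indices(columns))
    print(collect_cells(columns))
```

The prefix makes it visible at a glance whether `columns[iColumn]` still needs to be dereferenced or whether `column` is already the object itself.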

numenta / nupic-legacy

Use descriptive names for serialization schemas #2170