Closed HDembinski closed 4 years ago
It's not too late, although it's getting close. Once the documentation is written, there will be a higher barrier to changing names.
These particular examples don't have Awkward1 equivalents, so there's nothing to do here.
I used to snake_case everything, and there was also a period when I camelCased everything. Most recently, I was encouraged to snake_case only when the words are "really long," which is a subjective assessment. I do have a rough cut-off, based on how hard it is for me to read it. If something is going to be typed often, like a command, I try to pick words that are legible when joined—following what I see in the Python standard library. (The fact that the Python standard library does it is probably why PEP8 doesn't put down a hard rule.) Some examples include ak.withfield
and ak.withname
, ak.fromnumpy
, ak.fromiter
, ak.fromjson
, ak.tonumpy
, ak.tolist
, and ak.tojson
. On the other hand, there's ak.regularize_numpyarray
, which clearly needs the underscore. The ones that I feel can be joined without an underscore are part of a pattern: withX
, fromX
, or toX
, so that we train our eyes to that pattern. If it can't be expressed that way, it becomes a candidate for snake_case.
That was my thought process, anyway.
FWIW pandas uses snake even for from_X and to_X, e.g. https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#serialization-io-conversion
FWIW Ivery much prefer to_X, from_X, etc. ;-).
It sounds like there's a strong consensus on this—I'll tighten my tolerance on acceptability of non-underscored names enough that words like "from", "to", and "with" require underscores and report back with a list of new names. This is about the latest that such a name-change can be done, since the code hasn't really reached users yet. I'll post the updated list of names today.
In https://github.com/scikit-hep/awkward-1.0/pull/194/commits/d125ad78a083af8a29c8ff32167eaeea77458aa8, I changed:
function ak.fromnumpy → ak.from_numpy function ak.fromiter → ak.from_iter function ak.fromjson → ak.from_json function ak.fromawkward0 → ak.from_awkward0 function ak.tonumpy → ak.to_numpy function ak.tolist → ak.to_list function ak.tojson → ak.to_json function ak.tolayout → ak.to_layout function ak.toawkward0 → ak.to_awkward0 function ak.validityerror → ak.validity_error function ak.isvalid → ak.is_valid function ak.withfield → ak.with_field function ak.withname → ak.with_name function ak.tomask → ak.mask function ak.typeof → ak.type parameter checkvalid → check_valid parameter withname → with_name parameter validwhen → valid_when method numbatype → numba_type method beginlist → begin_list method endlist → end_list method begintuple → begin_tuple method endtuple → end_tuple method beginrecord → begin_record method endrecord → end_record
But I didn't change:
function ak.isna function ak.notna function ak.argcross function ak.argchoose method bytestring
You can see the new names in the left-bar of https://awkward-array.readthedocs.io .
Did ak.withname
get missed? I'd also advocate for ak.linear_fit
instead of ak.linearfit
.
withname
got missed, but I noticed it in my current PR and fixed it: with_name
. I also found a few more instances of withfield → with_field
(as a parameter, rather than a function name).
I can do linear_fit
. I'll put it into my current PR.
It's now linear_fit
and mask_identity
: https://github.com/scikit-hep/awkward-1.0/pull/195/commits/43fe180508e6a4addfa596338184a235b95ec1ce (or it will be when scikit-hep/awkward-1.0#195 is done).
Python uses snake_case for most methods. awkward-array has some long all-lowercase public methods in some classes without spaces, which I find very hard to read. Examples:
I suppose this style originates from numpy, some examples can be found here: https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
To my surprise, PeP8 does not prohibit this style, but at least discourages it: https://www.python.org/dev/peps/pep-0008/#function-and-variable-names "Function names should be lowercase, with words separated by underscores as necessary to improve readability."
I personally can see no advantage in naming something "offsetsaliased" over "offsets_aliased", the latter is clearly more readable. Perhaps it is not too late for awkward1 to do this better?