scikit-hep / awkward-0.x

Manipulate arrays of complex data structures as easily as Numpy.
BSD 3-Clause "New" or "Revised" License
215 stars 39 forks source link

awkward method names #240

Closed HDembinski closed 4 years ago

HDembinski commented 4 years ago

Python uses snake_case for most methods. awkward-array has some long all-lowercase public methods in some classes without spaces, which I find very hard to read. Examples:

offsetsaliased
startsstops2parents

I suppose this style originates from numpy, some examples can be found here: https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html

To my surprise, PeP8 does not prohibit this style, but at least discourages it: https://www.python.org/dev/peps/pep-0008/#function-and-variable-names "Function names should be lowercase, with words separated by underscores as necessary to improve readability."

I personally can see no advantage in naming something "offsetsaliased" over "offsets_aliased", the latter is clearly more readable. Perhaps it is not too late for awkward1 to do this better?

jpivarski commented 4 years ago

It's not too late, although it's getting close. Once the documentation is written, there will be a higher barrier to changing names.

These particular examples don't have Awkward1 equivalents, so there's nothing to do here.

I used to snake_case everything, and there was also a period when I camelCased everything. Most recently, I was encouraged to snake_case only when the words are "really long," which is a subjective assessment. I do have a rough cut-off, based on how hard it is for me to read it. If something is going to be typed often, like a command, I try to pick words that are legible when joined—following what I see in the Python standard library. (The fact that the Python standard library does it is probably why PEP8 doesn't put down a hard rule.) Some examples include ak.withfield and ak.withname, ak.fromnumpy, ak.fromiter, ak.fromjson, ak.tonumpy, ak.tolist, and ak.tojson. On the other hand, there's ak.regularize_numpyarray, which clearly needs the underscore. The ones that I feel can be joined without an underscore are part of a pattern: withX, fromX, or toX, so that we train our eyes to that pattern. If it can't be expressed that way, it becomes a candidate for snake_case.

That was my thought process, anyway.

nsmith- commented 4 years ago

FWIW pandas uses snake even for from_X and to_X, e.g. https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#serialization-io-conversion

eduardo-rodrigues commented 4 years ago

FWIW Ivery much prefer to_X, from_X, etc. ;-).

jpivarski commented 4 years ago

It sounds like there's a strong consensus on this—I'll tighten my tolerance on acceptability of non-underscored names enough that words like "from", "to", and "with" require underscores and report back with a list of new names. This is about the latest that such a name-change can be done, since the code hasn't really reached users yet. I'll post the updated list of names today.

jpivarski commented 4 years ago

In https://github.com/scikit-hep/awkward-1.0/pull/194/commits/d125ad78a083af8a29c8ff32167eaeea77458aa8, I changed:

function ak.fromnumpy → ak.from_numpy function ak.fromiter → ak.from_iter function ak.fromjson → ak.from_json function ak.fromawkward0 → ak.from_awkward0 function ak.tonumpy → ak.to_numpy function ak.tolist → ak.to_list function ak.tojson → ak.to_json function ak.tolayout → ak.to_layout function ak.toawkward0 → ak.to_awkward0 function ak.validityerror → ak.validity_error function ak.isvalid → ak.is_valid function ak.withfield → ak.with_field function ak.withname → ak.with_name function ak.tomask → ak.mask function ak.typeof → ak.type parameter checkvalid → check_valid parameter withname → with_name parameter validwhen → valid_when method numbatype → numba_type method beginlist → begin_list method endlist → end_list method begintuple → begin_tuple method endtuple → end_tuple method beginrecord → begin_record method endrecord → end_record

But I didn't change:

function ak.isna function ak.notna function ak.argcross function ak.argchoose method bytestring

jpivarski commented 4 years ago

You can see the new names in the left-bar of https://awkward-array.readthedocs.io .

chrisburr commented 4 years ago

Did ak.withname get missed? I'd also advocate for ak.linear_fit instead of ak.linearfit.

jpivarski commented 4 years ago

withname got missed, but I noticed it in my current PR and fixed it: with_name. I also found a few more instances of withfield → with_field (as a parameter, rather than a function name).

I can do linear_fit. I'll put it into my current PR.

jpivarski commented 4 years ago

It's now linear_fit and mask_identity: https://github.com/scikit-hep/awkward-1.0/pull/195/commits/43fe180508e6a4addfa596338184a235b95ec1ce (or it will be when scikit-hep/awkward-1.0#195 is done).