techascent / tech.ml.dataset

A Clojure high performance data processing system
Eclipse Public License 1.0
659 stars 33 forks source link

`ds/concat*` arglist and docstring improvements #412

Open harold opened 4 weeks ago

harold commented 4 weeks ago

Autocomplete in my editor helpfully offers these suggestions: image

The docstring for ds/concat says this:

Concatenate datasets in place using a copying-concatenation.

Which I find sort of confusing given that the three functions concat, concat-inplace, and concat-copying seem to imply that in place and copying are different.

Improve arglists

Also, the arglist of all three functions is: [dataset & args], which sort of obscures the fact that you're supposed to pass in many arguments each a dataset. Would [& datasets] be better? ... clojure.core/concat has [x y & zs] which I also sort of don't like because it obscures the fact that the arguments are supposed to be sequences.

Improve docstrings

My proposal would be for re-written docstrings on all three of these functions with 1) mention of the input and output return types 2) a short working example of usage, so users don't have to wonder if they need to call apply 3) an indication in all three to which one is the one you probably want (is it ds/concat?) with a short explanation of why you occasionally might want the other two, with, if possible, a short example of their ideal use...

cnuernber commented 6 days ago

I did the absolute minimum possible to make the docstrings clearer - so this is partially addressed at this point - feel free to repoen if you want to have someone tackle this is more depth.

harold commented 6 days ago

https://github.com/techascent/tech.ml.dataset/commit/bd8dacb2b75241791b6695d763d8b0ebc91b3106

Agreed that's an improvement, I still think examples would go a long way. I'll put the good first issue tag on here and some enthusiast will come and do it. (: