matloff / partools

Tools to aid coding in the R 'parallel' package.
about naming #14

Open dracodoc opened 7 years ago

dracodoc commented 7 years ago

The points I raise here may be just personal taste, and changing names could be quite cumbersome, but I think this is better discussed sooner rather than later.

I found some names in the package a little confusing:

dracodoc commented 7 years ago

I just found that the vignettes already mention that sometimes you need more than averaging. This confirms my view that the name ca is not the best choice. I also found that "scatter" inherently carries a sense of random shuffling, so it's a good word for this case.

clarkfitzg commented 7 years ago

Agree that the names could be improved.

I'd suggest using underscores and a common prefix, as stringr does. Then all functions would look like file_xx, dis_xx, sa_xx, or even just f_xx, d_xx.

To be clear, does this mean ca, cabase, calm, caglm, caprcomp become ca_, ca_base, ca_lm, ca_glm, ca_prcomp, etc.?

dracodoc commented 7 years ago

Yes. I didn't add the 'ca' example because I think ca is not the best representation of software alchemy. "Software alchemy" is not easy to understand or relate to either.

matloff commented 7 years ago

Changing from 'ca' to 'sa' is a good idea. We can do that easily without breaking users' old partools code by simple assignments, e.g. salm <- calm.
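A minimal sketch of that aliasing idea, using a toy stand-in rather than the real function: the `calm` defined here is hypothetical and only exists so the example runs without partools installed. The point is just that a plain assignment gives the new name the same behavior while the old name keeps working.

```r
# Hypothetical stand-in for an existing exported function. The real
# partools calm() has a different signature and body; this toy version
# is only here to make the example self-contained.
calm <- function(x) mean(x)

# Introduce the new name as a plain alias. Existing code that calls
# calm() is unaffected, and new code can call salm().
salm <- calm

old_result <- calm(c(1, 2, 3))
new_result <- salm(c(1, 2, 3))

# Both names refer to the same function object.
same_function <- identical(salm, calm)
```

In a package, the new name would presumably also need to be exported alongside the old one; that step is not shown here.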

I agree that the lack of separators like '_' may be difficult for a non-native speaker of English at first, but I would be reluctant to break users' existing code.

Software alchemy is really for means, including proportions, and is not appropriate for something like fetching the top 10 values of a variable. However, one can use partools in other ways. Actually, I was just the other day thinking about writing a convenience function for that.
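To illustrate the distinction in base R only (this is a sketch, not the partools API; the chunking scheme and all variable names are assumptions made for the example): averaging per-chunk estimates works for a mean, whereas a top-10 query needs a different combine step, namely pooling each chunk's top 10 and selecting again.

```r
set.seed(1)
x <- rnorm(1000)

# Split the data into 4 equal-sized chunks, as if distributed to 4 workers.
chunks <- split(x, rep(1:4, length.out = length(x)))

# Chunk averaging: estimate on each chunk, then average the estimates.
# With equal chunk sizes this matches the overall mean up to rounding.
chunk_means <- sapply(chunks, mean)
sa_mean <- mean(chunk_means)

# Top 10 values: averaging the per-chunk results would be meaningless.
# Instead, take each chunk's top 10, pool them, and take the top 10 again.
chunk_top <- lapply(chunks, function(ch) head(sort(ch, decreasing = TRUE), 10))
top10 <- unname(head(sort(unlist(chunk_top), decreasing = TRUE), 10))

# Reference answer computed on the full data.
top10_full <- head(sort(x, decreasing = TRUE), 10)
```

The second half works because any value in the global top 10 must also be in the top 10 of its own chunk, so pooling per-chunk top 10s never loses a candidate.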

As to Divide and Combine, see my 2016 JSS paper, which is referenced both in the man page and the vignette.