klmr / box

Write reusable, composable and modular R code
https://klmr.me/box/
MIT License

New feature: unscoped module names #316

Open klmr opened 1 year ago

klmr commented 1 year ago

This PR is a WIP implementation of unscoped module names. For now, the syntax for specifying an unscoped module name xyz is box::use(mod(xyz)). mod(…) disambiguates between package and module names, and can be passed any module name, including scoped ones (both global and local, e.g. mod(foo/bar) and mod(./foo)).

The current placeholder syntax for disambiguation (i.e. mod(…)) is subject to change before the PR is merged. For suggestions, see discussion at #307.

For now, I’ve narrowed the list of suggestions from #307 down to the following candidates:

  1. box::use(mod:module_name)
  2. box::use(mod::module_name)
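
Both candidates, like the current mod(…) placeholder, are syntactically valid R, so they can be inspected as unevaluated expressions. The following is a minimal base-R sketch of how each spelling parses. It only looks at the syntax tree and does not call ‘box’; mod, xyz and module_name are the placeholder names used in this thread, and how ‘box’ would actually dispatch on them is an assumption.

```r
# Capture each candidate spelling as an unevaluated expression.
specs <- list(
    placeholder = quote(mod(xyz)),
    candidate_1 = quote(mod:module_name),
    candidate_2 = quote(mod::module_name)
)

# The head of each call is what code parsing `box::use` arguments
# could dispatch on (an assumption, not the PR's implementation).
heads <- vapply(specs, function(e) as.character(e[[1L]]), character(1L))
print(heads)

# A scoped name nested inside `mod(...)` is an ordinary call to `/`.
nested <- quote(mod(./foo))[[2L]]
print(as.character(nested[[1L]]))
```

The three spellings differ only in the head of the call (`mod`, `:` and `::` respectively), so any of them could in principle serve as the disambiguation marker.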

Feedback welcome.

king-of-poppk commented 11 months ago

Why not the other way around, force an explicit package(...) for R packages, and make modules first class?

klmr commented 11 months ago

@king-of-poppk Because that would be a breaking change, and because (for better or worse) packages remain the primary means by which people reuse code in R. Making them harder to use would create a barrier for the adoption of ‘box’.

Furthermore, modules are already first class in ‘box’. It’s just that module names are intended to be namespaced, i.e. prefixed with an organisation or user name, same as e.g. projects on GitHub. It’s only non-namespaced modules, which should be rare (and are actively discouraged), that require disambiguation.
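
To illustrate why only non-namespaced module names are ambiguous, here is a rough sketch that classifies a box::use argument by its syntactic shape alone. classify_spec is a hypothetical helper written for this illustration, not part of ‘box’, and it ignores other forms such as aliases and attach lists.

```r
# Hypothetical helper, not part of 'box': classify an unevaluated
# `box::use` argument by its syntactic shape alone.
classify_spec <- function(expr) {
    if (is.name(expr)) {
        # A bare name such as `stats` is read as a package.
        "package"
    } else if (is.call(expr) && identical(expr[[1L]], quote(`/`))) {
        # A name containing `/`, such as `klmr/fun` or `./foo`,
        # is read as a module.
        "module"
    } else {
        "other"
    }
}

classify_spec(quote(stats))     # bare name
classify_spec(quote(klmr/fun))  # namespaced (global) module
classify_spec(quote(./foo))     # local module
```

Under this reading, a bare name like xyz is indistinguishable from a package name, which is the gap that a marker such as mod(…) would fill.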

mschubert commented 2 months ago

> Feedback welcome.

In addition to the options listed here, git provides another option that is valid R syntax: :/file

Wouldn't that be a good solution?

king-of-poppk commented 2 months ago

> It’s only non-namespaced modules, which should be rare (and are actively discouraged), that require disambiguation.

Is it possible that this is post-hoc justification? Is it possible that this choice was driven mainly by the ambiguity that arises from the existence of packages (which are not namespaced)? One could have very well made the choice of making ALL modules first-class from the start, namespaced or not, and require additional syntax for packages.

That said, I would be very fond of being explicit in both cases, i.e.: box::use(module(x)) and box::use(package(y)).

klmr commented 2 months ago

@mschubert

> Wouldn't that be a good solution?

Well, but it’s not valid R:

› quote(:/file)
Error: unexpected ':' in "quote(:"

😉
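
The same check can be run programmatically against any proposed spelling. This is a small sketch using base R’s parse(); parses_as_r is a hypothetical helper name chosen for this illustration.

```r
# Hypothetical helper: TRUE if `code` parses as R, FALSE otherwise.
parses_as_r <- function(code) {
    tryCatch({
        parse(text = code)
        TRUE
    }, error = function(e) FALSE)
}

parses_as_r(":/file")            # the git-style spelling
parses_as_r("./file")            # the existing local module spelling
parses_as_r("mod::module_name")  # candidate 2 from this thread
```

Only the first call returns FALSE, consistent with the parse error shown above.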

klmr commented 2 months ago

@king-of-poppk

Good question. In reality both of these reasons played a role in the decision. But I do think that having non-namespaced module names is problematic for several reasons. It even has security implications (it makes typosquatting much easier — this is a real problem that is dogging PyPI and npm).

There was a very lively debate on this topic in the Rust ecosystem (e.g. here and here). Rust/Cargo ultimately decided to go for non-namespaced names. However, if you actually read the debate, you will be confused by that decision: the arguments in its favour are generally weak, and some of them are simply nonsense (“it encourages creativity in package naming” … uh?!). The arguments against it were much stronger, and many people came away convinced that the wrong decision had been made. This debate strongly influenced the ‘box’ decision.

If I could design ‘box’ from scratch I would still make the same decision: explicitly namespaced modules have tangible advantages and few real drawbacks. That’s why I’m also still not sure about the usefulness and/or potential for harm of this PR.

> That said, I would be very fond of being explicit in both cases, i.e.: box::use(module(x)) and box::use(package(y))

I find this way too verbose to be useful in practice: the signal-to-noise ratio of these declarations is too low.