Manishearth / namespacing-rfc

RFC for Packages as Optional Namespaces
47 stars 3 forks source link

Decision: Separator mapping #2

Closed Manishearth closed 2 years ago

Manishearth commented 4 years ago

Related: https://github.com/Manishearth/namespacing-rfc/issues/1

The current RFC maps foo/bar to foo_bar in Rust code.

There are alternate mappings that can be selected

The proposal suggests mapping foo/bar to foo_bar, but as mentioned in the typosquatting section, this has problems. There may be other mappings that work out better:

  • foo::bar (see section above)
  • foo::crate::bar
  • foo::/bar
  • ~foo::bar

and the like.

Manishearth commented 4 years ago

I am personally a huge fan of mapping to _ because it does not require any additional changes to rustc and makes this a purely crates/cargo-side feature

carols10cents commented 4 years ago

The typosquatting section referenced:

Dash typosquatting

This proposal does not prevent anyone from taking foo-bar after you publish foo/bar. Given that the Rust crate import syntax for foo/bar is foo_bar, same as foo-bar, it's totally possible for a user to accidentally type foo-bar in Cargo.toml instead of foo/bar, and pull in the wrong, squatted, crate.

We currently prevent foo-bar and foo_bar from existing at the same time. We could do this here as well, but it would only go in one direction: if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa. This limits the "damage" to cases where someone pre-squats foo-bar before you publish foo/bar, and the damage can be mitigated by checking to see if such a clashing crate exists when publishing, if you actually care about this attack vector. There are some tradeoffs there that we would have to explore.

One thing that could mitigate foo/bar mapping to the potentially ambiguous foo_bar is using something like foo::crate::bar or ~foo::bar or foo::/bar in the import syntax.

I was confused about the multiple negatives and slashes in this sentence, especially when I lost the markdown formatting the first time I copied it (I've fixed it now):

if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa.

To clarify, I would remove the / and spell out the "vice versa" and the reasoning so that this says:

- if `foo/bar` exists `foo-bar`/`foo_bar` cannot be published, but not vice versa.
+ if `foo/bar` exists, neither `foo-bar` nor `foo_bar` will be allowed to be published. However, 
+ if `foo-bar` or `foo_bar` exist, we would choose to allow `foo/bar` to be 
+ published, because we don't want to limit the use of names within a crate namespace 
+ due to crates outside the namespace existing.

Have I understood correctly?

To put this another way, this would mean crates.io could have the crates serde_gelf and serde/gelf (if serde_gelf exists first, which it would). If for some reason you wanted to depend on both, your Cargo.toml would contain:

[dependencies]
serde_gelf = "0.1.6"
"serde/gelf" = "1.0.0"

and to bring items from each of these into scope, your code would need:

use serde_gelf::stuff; // brings in from `serde_gelf`
use serde_gelf as serde_somethingelse_gelf; // need to choose a new name for one of them
use serde_somethingelse_gelf::other_stuff; // can then bring more names into scope with the alias

Seems... non-ambiguous to the compiler but potentially a little annoying for the programmer, but probably not common?

Manishearth commented 4 years ago

Have I understood correctly?

Correct! I've updated the RFC. The intent is to make sure that foo- does not become a de facto namespace.

use serde_gelf as serde_somethingelse_gelf; // need to choose a new name for one of them

This would have to be renamed in Cargo.toml, otherwise both would map as serde_gelf and Cargo would have to throw an error. Cargo lets you do this already.

(Worth noting, it's already possible to publish multiple crates that have the same lib name, so Cargo needs to deal with this already)

Seems... non-ambiguous to the compiler but potentially a little annoying for the programmer, but probably not common?

Correct, however someone could maliciously do this to confuse people, e.g. by publishing serde-gelf containing a bitcoin miner, whereas serde/gelf is the Actual Good crate.

carols10cents commented 4 years ago

This would have to be renamed in Cargo.toml

Ahhh right.

Correct, however someone could maliciously do this to confuse people, e.g. by publishing serde-gelf containing a bitcoin miner, whereas serde/gelf is the Actual Good crate.

LOL it's really difficult to consider this and #1 and #3 separately... hmmm....

Manishearth commented 4 years ago

Yeah to be clear discussion for that should probably belong on #3, but we might need to merge things.

pksunkara commented 4 years ago

Keeping in mind that we also solve the problem of typosquatting mentioned in #3, I have a proposal:

Let's make the mapping __ (double underscore). So, foo/bar would map to foo__bar in Rust code.

use foo::bar; // mod `bar` of crate `foo`
use foo_bar; // crate `foo_bar`/`foo-bar`
use foo__bar; // crate `foo/bar`

// We can even expand
use actix__web__middleware__cors; // crate `actix/web/middleware/cors`

// We can even combine
use foo__bar_baz; // crate `foo/bar-baz`

There is a relevant issue that proposes the banning of --/__ in crate name which we could implement first while we work on namespacing.

These are the only crates that currently use either of them:

$ rg --files -l -g "*__*" .
./te/ar/teardown_tree___treap
./ir/ef/iref__bbqsrc
./a4/2-/a42-__-
$ rg --files -l -g "*--*" .
./xn/--/xn--ls8h
./co/ns/const_env_impl--value
./co/ns/const_env--value
pitaj commented 2 years ago

I like the double underscore idea, since it maps to the same number of characters.

I think it provides a clear path forward:

This encapsulates the cargo side from the rustc side and allows for an easy upgrade path.

Manishearth commented 2 years ago

Fixed in https://github.com/Manishearth/rfcs/commit/1986c3dac3bfbe791aa6c0efdb4525a28ec08e6e, we have switched to ::