Open JJ opened 4 years ago
I don't understand what the problem is that needs solving. I take it multiple modules in the ecosystem share a short-name. That's something we explicitly allow and tout as a feature. It's also why you should qualify your dependencies. So what's to solve?
El dom., 6 sept. 2020 a las 11:53, niner (notifications@github.com) escribió:
I don't understand what the problem is that needs solving. I take it multiple modules in the ecosystem share a short-name. That's something we explicitly allow and tout as a feature. It's also why you should qualify your dependencies. So what's to solve?
Can you please take a look at the links I've provided? In short, let's suppose you have something like Pod::To::HTML. Somebody else creates another module which "provides" this same module. If there's no name qualification, the new module might be installed (along with the rest of the distribution) instead of the "old" Pod::To::HTML. I get what you say about the need to qualify dependencies. But if that's not the case (as has happened above) there should be some kind of warning, or even outright error, because if there isn't, it might imply anything from a waste of time (as happened above) to an outright big gaping security hole.
As a matter of fact, I remember something similar happened with
NativeHelpers? NativeLibs? You can run an experiment if you want with any
other module out there, pick one, Foo::Bar, create another one, Bar::Foo
which provides Foo::Bar:auth
This is a feature, and a good feature at that. But unless we a) warn
everyone to qualify their modules, b) discourage explicitly to use
modules without an auth field, or even c) try to solve that ambiguity in some way, via a warning saying something like
"You want to install Foo::Bar, of which there are two copies. which resolves to Foo::Bar:auth
Additionally, please check also this issue. By simply providing a module with the same name as an existing distribution, and apparently a higher version, you can "hide" a module from the ecosystem. So if you want to install Pod::To::HTML, which appears in the ecosystem with no qualification at all, you get Raku::Pod::Render instead, just because one of the modules provided by that one has the same name and a higher version.
I'm not totally sure provides
allows qualification by auth or other fields. My proposal in this case would be to check for existing names and disallow modules inside a distribution with the same name (and no qualification) as an existing one. Of course, and again, we can't do that in CPAN, so that would have to be solved at the zef
or any module installation level (which might want to check anyway in case of ambiguity).
I have of course looked at all the links in the original post and was still not sure what the aim of this ticket is.
provides
doesn't have to allow qualification as the whole dist is qualified by those extra fields.
Disallowing uploads of modules with the same short name (that's what it is - just the short name) as a module in a different dist would negate the whole point of having fully qualified names in the first place.
If you require a certain implementation of a module, simply tell Raku so by using a fully qualified name, instead of relying on the runtime to guess what you really mean. If you don't want that and leave the choice to the user and some other author uploads an incompatible module with the same short name - communicate. Tell that author that there's a conflict and kindly ask them to pick a different name.
If you don't want that and leave the choice to the user and some other author uploads an incompatible module with the same short name - communicate. Tell that author that there's a conflict and kindly ask them to pick a different name.
I'm concerned that you're focusing too much on the potential for accidental name collisions between people acting in good faith and not enough on the security risks from malicious actors.
If I understand the situations @JJ described, it is currently possible for a malicious actor to cause a module from their distribution to be installed instead of the one the user intended to install for any user who specifies dependencies via a shortname. In the case of a non-intentional name clash, that would likely cause the build to fail (as it did here), but if the replacement module was maliciously crafted, it could expose the same API but also execute malicious code.
This problem seems especially acute because the vast majority of packages upload both their META6.json file and their CI config files – so an attacker would not need to guess at shortnames that packages might be using but could search explicitly for packages their target used without a fully qualified name. And, given the prevalence of CI and clean builds, it is very common for users to install packages from the network multiple times rather than rely on previously-installed local versions.
Unless I am misunderstanding something pretty significantly, this strikes me as an extremely large security issue and would basically mean that packages in production (including all transitive dependencies) should never depend on a package without specifying the fully qualified name.
Even if distributions are qualified by name, there's no additional authentication/authorization procedure. All it would take, far as I can see it, is providing an auth<whoever>
with a high enough version to "hide" the "legitimate" one.
And modules.raku.org does not provide auth fields, or instructs to use them. In this case, anyone trying to install pod::to::Html according to instructions would have received a different module altogether. (As the issue I have repeatedly linked indicates)
All it would take, far as I can see it, is providing an auth with a high enough version to "hide" the "legitimate" one.
Well, but the auth
field is guaranteed to be globally unique, right? So, if I install Pod::To::HTML:auth<github:Raku>
, I don't have any guarantee about what version I'll get, but I at least know that I'll get on from someone who controls the Raku
GitHub account – which is a lot more than I have without that field.
All it would take, far as I can see it, is providing an auth with a high enough version to "hide" the "legitimate" one.
Well, but the
auth
field is guaranteed to be globally unique, right? So, if I installPod::To::HTML:auth<github:Raku>
, I don't have any guarantee about what version I'll get, but I at least know that I'll get on from someone who controls theRaku
GitHub account – which is a lot more than I have without that field.
That might be the case if zef does check for that. I have no guarantee it's done at the ecosystem level, and I think that all zef does anyway is to check the auth field in META6.json. IARC, there are some distributions adopted as Raku community modules that keep their old auth, at least for some time. The fact that the auth field is used, and read, seems to imply that's what's used, and not distribution metadata (github or CPAN Nick)
This seems to be the name resolution logic in zef:
https://github.com/ugexe/zef/blob/9db010a4f32c4af866832c3979d0fcf5211a64a6/lib/Zef/Install.pm6#L12
As far as I can tell, there is no logic that prevents me from uploading any distribution with, say, :auth<github:jnthn>
, so either that needs to change, or we need to be really clear that :auth
is not security related, and we need a different solution to prevent name squatting.
Since there seems to be a (rough) agreement that this is indeed a problem, I'll move on (if no one does it earlier) to propose a solution for it as a document that might be eventually merged here. Let me also draw attention to the fact that whatever is decided could also be a solution to #72 , which wasn't actually solved.
@JJ I recommend doing and publishing some quick research (or more like data) on the topic. I suggested something on IRC. Basically, how about having a similarly-looking table like here that lists all other languages, links their packaging systems, and maybe describes the most important differences or features in a sentence or two. I think it'd be a wonderful start to get some inspiration (and maybe steal some great ideas!).
This issue has stalled, I guess it's time to either propose a solution or discuss it in the RSC.
Unless I am misunderstanding something pretty significantly, this strikes me as an extremely large security issue and would basically mean that packages in production (including all transitive dependencies) should never depend on a package without specifying the fully qualified name.
That's the thing. Both negative sides have been successfully collected:
By the way, this is another problem that would be much more straightforward to address if the system didn't pretend that dependencies were modules. Dependencies are distributions, in reality. That's the actual bundle of code and metadata that gets shared, and that has one-to-one correspondence with the metadata that dependency resolution can take into account - in other words, those fields used in the "fully qualified name" belong to the distribution, not the module.
After two years, it would be good to see what the actual strategy has become, besides blaming the user.
This isue was raised because, all of a sudden, documentation dependencies included GTK::Simple, a problem raised in this issue. The issue was solved by simply eliminating Pod::To::HTML from the dependencies. But then, investigating this issue, I arrived to this in Pod::to::HTML which states that it no longer can be installed properly unless you qualify it with an auth meta. Again, by issuing a
zef search
over this module, I found:Which includes Raku::Pod::Render (there's also this issue requesting elimination of that dependency). Anyway, the module definition is properly qualified with a meta, although not in
META6.json
The straightforward solution is obviously to try and qualify
provides
or dependencies in both ends, but in that case this would not be a problem-solving kind of problem. The fact is that, before finding out the solution, it was relatively easy to "grab" a particular name just by putting it in the "provides" section of a module. So the question is should we maybe add a check for this kidn of thing when new distributions are added to the ecosystem?