Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0
70 stars 16 forks source link

Ecosystem name resolution security problem #229

Open JJ opened 4 years ago

JJ commented 4 years ago

This isue was raised because, all of a sudden, documentation dependencies included GTK::Simple, a problem raised in this issue. The issue was solved by simply eliminating Pod::To::HTML from the dependencies. But then, investigating this issue, I arrived to this in Pod::to::HTML which states that it no longer can be installed properly unless you qualify it with an auth meta. Again, by issuing a zef search over this module, I found:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ID|From                             |Package                                             |Description                                                                                    
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0 |Zef::Repository::Ecosystems<p6c> |PodCache::Module:ver<0.3.1>:auth<Richard Hainsworth>|Render pod files from a cache.                                                                 
1 |Zef::Repository::Ecosystems<p6c> |Raku::Pod::Render:ver<3.1.0>:auth<github:finanalyst>|A generic Pod Renderer for single or multiple files using templates, provides HTML and MarkDown
2 |Zef::Repository::Ecosystems<p6c> |Pod::To::HTML:ver<0.7.1>:auth<github:Raku>          |Convert Raku Pod to HTML                                                                       
3 |Zef::Repository::LocalCache      |Raku::Pod::Render:ver<3.1.0>:auth<github:finanalyst>|A generic Pod Renderer for single or multiple files using templates, provides HTML and MarkDown
4 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.16>                           |Convert Perl 6 Pod to HTML                                                                     
5 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.17>                           |Convert Perl 6 Pod to HTML                                                                     
6 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.6.0>:auth<Perl 6>               |Convert Perl 6 Pod to HTML                                                                     
7 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.6.1>:auth<Perl 6>               |Convert Perl 6 Pod to HTML                                                                     
8 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.22>                           |Convert Perl 6 Pod to HTML                                                                     
9 |Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.7>                            |Convert Perl 6 Pod to HTML                                                                     
10|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.15>                           |Convert Perl 6 Pod to HTML                                                                     
11|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.13>                           |Convert Perl 6 Pod to HTML                                                                     
12|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.7.0>:auth<github:Raku>          |Convert Raku Pod to HTML                                                                       
13|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.4.0>:auth<Perl 6>               |Convert Perl 6 Pod to HTML                                                                     
14|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.6.2>:auth<Perl 6>               |Convert Perl 6 Pod to HTML                                                                     
15|Zef::Repository::LocalCache      |Pod::To::HTML:ver<0.3.15>                           |Convert Perl 6 Pod to HTML                                                                     
16|Zef::Repository::Ecosystems<cpan>|Pod::To::HTMLBody:ver<0.0.1>:auth<github:drforr>    |HTML body                                                                                      
17|Zef::Repository::Ecosystems<cpan>|Pod::To::HTML::Section:ver<0.1.0>:api<0>            |Convert a Pod6 document to HTML            

Which includes Raku::Pod::Render (there's also this issue requesting elimination of that dependency). Anyway, the module definition is properly qualified with a meta, although not in META6.json

The straightforward solution is obviously to try and qualify provides or dependencies in both ends, but in that case this would not be a problem-solving kind of problem. The fact is that, before finding out the solution, it was relatively easy to "grab" a particular name just by putting it in the "provides" section of a module. So the question is should we maybe add a check for this kidn of thing when new distributions are added to the ecosystem?

At least in the ones we can control, that is, not those uploaded to CPAN.

niner commented 4 years ago

I don't understand what the problem is that needs solving. I take it multiple modules in the ecosystem share a short-name. That's something we explicitly allow and tout as a feature. It's also why you should qualify your dependencies. So what's to solve?

JJ commented 4 years ago

El dom., 6 sept. 2020 a las 11:53, niner (notifications@github.com) escribió:

I don't understand what the problem is that needs solving. I take it multiple modules in the ecosystem share a short-name. That's something we explicitly allow and tout as a feature. It's also why you should qualify your dependencies. So what's to solve?

Can you please take a look at the links I've provided? In short, let's suppose you have something like Pod::To::HTML. Somebody else creates another module which "provides" this same module. If there's no name qualification, the new module might be installed (along with the rest of the distribution) instead of the "old" Pod::To::HTML. I get what you say about the need to qualify dependencies. But if that's not the case (as has happened above) there should be some kind of warning, or even outright error, because if there isn't, it might imply anything from a waste of time (as happened above) to an outright big gaping security hole.

As a matter of fact, I remember something similar happened with NativeHelpers? NativeLibs? You can run an experiment if you want with any other module out there, pick one, Foo::Bar, create another one, Bar::Foo which provides Foo::Bar:auth, there's no guarantee the first Foo::Bar will have any kind of precedence unless you explicitly pick it up via its auth field (or whatever).

This is a feature, and a good feature at that. But unless we a) warn everyone to qualify their modules, b) discourage explicitly to use modules without an auth field, or even c) try to solve that ambiguity in some way, via a warning saying something like "You want to install Foo::Bar, of which there are two copies. which resolves to Foo::Bar:auth. If that's not your intention, please qualify Foo::Bar", or an outright error, that might be a problem, and blaming the user for not qualifying a dependency is not going to solve it.

JJ commented 4 years ago

Additionally, please check also this issue. By simply providing a module with the same name as an existing distribution, and apparently a higher version, you can "hide" a module from the ecosystem. So if you want to install Pod::To::HTML, which appears in the ecosystem with no qualification at all, you get Raku::Pod::Render instead, just because one of the modules provided by that one has the same name and a higher version. I'm not totally sure provides allows qualification by auth or other fields. My proposal in this case would be to check for existing names and disallow modules inside a distribution with the same name (and no qualification) as an existing one. Of course, and again, we can't do that in CPAN, so that would have to be solved at the zef or any module installation level (which might want to check anyway in case of ambiguity).

niner commented 4 years ago

I have of course looked at all the links in the original post and was still not sure what the aim of this ticket is.

provides doesn't have to allow qualification as the whole dist is qualified by those extra fields.

Disallowing uploads of modules with the same short name (that's what it is - just the short name) as a module in a different dist would negate the whole point of having fully qualified names in the first place.

If you require a certain implementation of a module, simply tell Raku so by using a fully qualified name, instead of relying on the runtime to guess what you really mean. If you don't want that and leave the choice to the user and some other author uploads an incompatible module with the same short name - communicate. Tell that author that there's a conflict and kindly ask them to pick a different name.

codesections commented 4 years ago

If you don't want that and leave the choice to the user and some other author uploads an incompatible module with the same short name - communicate. Tell that author that there's a conflict and kindly ask them to pick a different name.

I'm concerned that you're focusing too much on the potential for accidental name collisions between people acting in good faith and not enough on the security risks from malicious actors.

If I understand the situations @JJ described, it is currently possible for a malicious actor to cause a module from their distribution to be installed instead of the one the user intended to install for any user who specifies dependencies via a shortname. In the case of a non-intentional name clash, that would likely cause the build to fail (as it did here), but if the replacement module was maliciously crafted, it could expose the same API but also execute malicious code.

This problem seems especially acute because the vast majority of packages upload both their META6.json file and their CI config files – so an attacker would not need to guess at shortnames that packages might be using but could search explicitly for packages their target used without a fully qualified name. And, given the prevalence of CI and clean builds, it is very common for users to install packages from the network multiple times rather than rely on previously-installed local versions.

Unless I am misunderstanding something pretty significantly, this strikes me as an extremely large security issue and would basically mean that packages in production (including all transitive dependencies) should never depend on a package without specifying the fully qualified name.

JJ commented 4 years ago

Even if distributions are qualified by name, there's no additional authentication/authorization procedure. All it would take, far as I can see it, is providing an auth<whoever> with a high enough version to "hide" the "legitimate" one. And modules.raku.org does not provide auth fields, or instructs to use them. In this case, anyone trying to install pod::to::Html according to instructions would have received a different module altogether. (As the issue I have repeatedly linked indicates)

codesections commented 4 years ago

All it would take, far as I can see it, is providing an auth with a high enough version to "hide" the "legitimate" one.

Well, but the auth field is guaranteed to be globally unique, right? So, if I install Pod::To::HTML:auth<github:Raku>, I don't have any guarantee about what version I'll get, but I at least know that I'll get on from someone who controls the Raku GitHub account – which is a lot more than I have without that field.

JJ commented 4 years ago

All it would take, far as I can see it, is providing an auth with a high enough version to "hide" the "legitimate" one.

Well, but the auth field is guaranteed to be globally unique, right? So, if I install Pod::To::HTML:auth<github:Raku>, I don't have any guarantee about what version I'll get, but I at least know that I'll get on from someone who controls the Raku GitHub account – which is a lot more than I have without that field.

That might be the case if zef does check for that. I have no guarantee it's done at the ecosystem level, and I think that all zef does anyway is to check the auth field in META6.json. IARC, there are some distributions adopted as Raku community modules that keep their old auth, at least for some time. The fact that the auth field is used, and read, seems to imply that's what's used, and not distribution metadata (github or CPAN Nick)

JJ commented 4 years ago

This seems to be the name resolution logic in zef:

https://github.com/ugexe/zef/blob/9db010a4f32c4af866832c3979d0fcf5211a64a6/lib/Zef/Install.pm6#L12

moritz commented 4 years ago

As far as I can tell, there is no logic that prevents me from uploading any distribution with, say, :auth<github:jnthn>, so either that needs to change, or we need to be really clear that :auth is not security related, and we need a different solution to prevent name squatting.

JJ commented 4 years ago

Since there seems to be a (rough) agreement that this is indeed a problem, I'll move on (if no one does it earlier) to propose a solution for it as a document that might be eventually merged here. Let me also draw attention to the fact that whatever is decided could also be a solution to #72 , which wasn't actually solved.

AlexDaniel commented 4 years ago

@JJ I recommend doing and publishing some quick research (or more like data) on the topic. I suggested something on IRC. Basically, how about having a similarly-looking table like here that lists all other languages, links their packaging systems, and maybe describes the most important differences or features in a sentence or two. I think it'd be a wonderful start to get some inspiration (and maybe steal some great ideas!).

JJ commented 3 years ago

This issue has stalled, I guess it's time to either propose a solution or discuss it in the RSC.

2colours commented 1 year ago

Unless I am misunderstanding something pretty significantly, this strikes me as an extremely large security issue and would basically mean that packages in production (including all transitive dependencies) should never depend on a package without specifying the fully qualified name.

That's the thing. Both negative sides have been successfully collected:

By the way, this is another problem that would be much more straightforward to address if the system didn't pretend that dependencies were modules. Dependencies are distributions, in reality. That's the actual bundle of code and metadata that gets shared, and that has one-to-one correspondence with the metadata that dependency resolution can take into account - in other words, those fields used in the "fully qualified name" belong to the distribution, not the module.

After two years, it would be good to see what the actual strategy has become, besides blaming the user.