Closed stakach closed 3 years ago
And I noticed the PR after mi comment in the issue
Would it be enough to normalize when github:
is used, or should it also be lowercased when git: https://...
is used?
If so, we should not transform user/password to avoid breakinig https://<token>@github.com/owner/repo.git
or https://<token>:x-oauth-basic@github.com/owner/repo.git
private repos.
Could you add some tests git_resolver_spec.cr
? 🙈
@bcardiff added code to deal with generic git: urls
when they match github, bitbucket or gitlab
Just my two cents, but maybe it would have been better to fix this issue just for github. The original PR was a one line change.
@asterite I would have agreed with you before implementing @bcardiff suggestion it feels more complete / robust this way
@Sija we're only normalising github, bitbucket and gitlab paths, these services and their behaviour are well known and the path is being normalised to the one they recommend when you click the clone button on their webpage.
Not only will this improve performance, as a HTTP to HTTPS redirect will occur if you are cloning from the service it also improves security, preventing man in the middle in case someone entered HTTP instead of HTTPS (thinking of the recent SolarWinds attack where a build service was hijacked)
[...] it also improves security, preventing man in the middle in case someone entered HTTP instead of HTTPS (thinking of the recent SolarWinds attack where a build service was hijacked)
git
resolver, not the known provider specific ones.I personally think it's fine as we're only fixing up well known services. That said, I don't particularly care either way as I assume most people will always use the specific service resolver, at least based on what I've seen in the community
@stakach I'm also confused about the SolarWinds remarks - what does that have to do with any of this?
https://www.zdnet.com/article/third-malware-strain-discovered-in-solarwinds-supply-chain-attack/
For example, lets say you have a popular app that might be a target of hackers
was using the solarwinds example as a reference to show that this kind of attack does happen, getting around code signing
- I add a HTTP github URL in my shard.yml for an application I'm building
How would you do it, if the copy-pasta from every website you mentioned uses https
? Would you remove the s
by hand? How else would you end up with http
protocol?
- they edit the servers hosts file so they can inject whatever code they want without me knowing, as long as the builds succeed
What's the difference here between http
and https
? Why wouldn't the attacker use a https
endpoint with Let's Encrypt certificate for instance?
I think treating git
URLs with well-known hosts the same way as the respective named resolver is good. That automatically removes any potential issues with non-matching sources. If some shard adds a dependency with git:
and a github URL we don't want that to conflict with some other shard referencing the same dependdency with github:
plus path. They should just be equivalent. That means if github.com resolves different request URLs to the same repository, shards should normalize them. I don't think there's a downside to that.
shardbox.org normalizes resolvers that way, too: https://github.com/shardbox/shardbox-core/blob/19fb582433bbd914ba31295c76cf411d322c2a71/src/repo/ref.cr#L22 (there's no downcasing the path, but it still works because the DB field is case-insensitive; should probably normalize that explicitly, though).
To address security concerns with unencrypted connections, we could consider to make http
protocol an error for services known to only serve on HTTPS. But that's really a different topic.
@Sija
What's the difference here between http and https? Why wouldn't the attacker use a https endpoint with Let's Encrypt certificate for instance?
Let's Encrypt will only generate a certificate for a domain if you can prove you control it, they just make the proving process super simple. Because of this proof requirement HTTPS prevents man in the middle attacks and is the main reason the HSTS header exists and is present on all the major repository hosting platforms
I think I would leave out git://
from the normalization. There are some private repo that use git: git@github.owner/repo.git
sources to use the SSH identity directly. Using https will force them to do workarounds like https://stackoverflow.com/a/64587959/30948 which seems more indirect than using the SSH identity.
I guess that using github: or git: with http[s] is the most common thing. @straight-shoota can you confirm from shardbox that? If we leave out the git://
from the normalization the case sensitivity issues on those pattern will still be solved and it will not break existing workflows with private shards.
Other than that the PR looks good to me. I think it's fine to rewrite http to https for these known hosts.
@stakach don't you have private shards that would be affected by this change for example?
cc: @bararchy
@bcardiff the git
protocol itself doesn't support authentication at all.
So no private repos will work using git
and git protocol requires the user to verify the server identity, unlike HTTPS where this is automated
The accepted answer in that stackoverflow post mentions:
git:// URLs are read only, and it looks like private repos do not allow this form of access.
to use ssh with authentication you need to specify the URL in the form of:
git@github.com
or ssh://git@github.com/
both of which will continue to work
I guess that using github: or git: with http[s] is the most common thing. @straight-shoota can you confirm from shardbox that?
Yes. All dependencies known to shardbox with git:
source actually use https scheme. Not even a single http URL.
fixes #470 #275