Closed kalafut closed 4 years ago
Instead of hard coding the normalization for individual repository providers, it would probably be better to have a more generic solution which would work with any provider. I think it should suffice to extract hostname and path from both ssh+git and https URLs.
The same normalization should also be used for local_path
(which seems broken for ssh+git URLs btw.). But this can be left for another PR.
I started that way but was a little concerned about the git@...
portion, and whether that should be a distinguishing characteristic. i.e. whether git@github.com:foo/bar
should equal fred@github.com:foo/bar
. It sounds like that's not a concern, in which case I can definitely simplify this.
@straight-shoota I've generalized this comparison. As it turns out there are actually a lot of allowed git URL schemes. The regexes should take care of the common ones, and exactly matching origins will always work (i.e. current behavior).
Could we use URI.parse
instead of the regex?
@Sija @RX14 Thanks for the review! URI.parse
does not handle the SSH URIs that are in the scp style (e.g. git@github.com:example/project.git
), so something additional was needed. The pair of regexes was compact, but I don't really like adding regexes either if I can avoid it.
I'd also missed @straight-shoota's comment about local_path
being broken too for these patterns. As a result, I've redone things a bit to make a small git URI parser that uses URI.parse
but can also deal with some other formats, while still returning a URI object for consistency. It still feels a little fragile, but the code is simple and there are no regex's, so it's probably better overall. It can be used for origin_changed?
and local_path
.
I need to add some other tests but will then push it up for you to look at.
I've substantially updated the PR based on feedback, and realizing that local_path
is also sort of broken. No more ugly regexes either.
The implementation of parse_git_uri
could be much simpler:
private def parse_git_uri(raw_uri)
uri = URI.parse(raw_uri)
return uri if uri.absolute?
URI.parse("ssh:#{raw_uri}")
end
Using .absolute
is a good change. I don't see how the recommendation handles the main case of git+ssh urls, however. That would just leave everything in the path. e.g. https://play.crystal-lang.org/#/r/8grl
Yeah, you're right.
Looks like parse + absolute
can get confused too:
uri = URI.parse("github.com:foo/bar") # scheme => github.com, host => nil, path => foo/bar
p uri.absolute? # => true
But that's only If the user (git@...
) isn't included. While that style is accepted by git, it would seem to be a pretty unlikely real world case. I'd still be OK with absolute
and not handling those user-less ssh URLs.
I suppose shards should allow the exact same formats as accepted by git, so user-less scp-like format example.com:foo/bar
should be valid. It seems like git treats opaque URLs as scp-like format, so expanding the condition to uri.absolute? && !uri.opaque?
should be sufficient.
Yep that seems to work OK.
All comments addressed.
Thanks!
I need an
insteadOf
git configuration and this issue has been really slowing things down, so I decided to make an attempt at fixing it. This is literally my first pushed Crystal code so any feedback is welcome!Fixes #288