crystal-lang / shards

Dependency manager for the Crystal language
Other
765 stars 102 forks source link

Update origin_changed? to ignore protocol differences #315

Closed kalafut closed 4 years ago

kalafut commented 4 years ago

I need an insteadOf git configuration and this issue has been really slowing things down, so I decided to make an attempt at fixing it. This is literally my first pushed Crystal code so any feedback is welcome!

Fixes #288

straight-shoota commented 4 years ago

Instead of hard coding the normalization for individual repository providers, it would probably be better to have a more generic solution which would work with any provider. I think it should suffice to extract hostname and path from both ssh+git and https URLs.

The same normalization should also be used for local_path (which seems broken for ssh+git URLs btw.). But this can be left for another PR.

kalafut commented 4 years ago

I started that way but was a little concerned about the git@... portion, and whether that should be a distinguishing characteristic. i.e. whether git@github.com:foo/bar should equal fred@github.com:foo/bar. It sounds like that's not a concern, in which case I can definitely simplify this.

kalafut commented 4 years ago

@straight-shoota I've generalized this comparison. As it turns out there are actually a lot of allowed git URL schemes. The regexes should take care of the common ones, and exactly matching origins will always work (i.e. current behavior).

RX14 commented 4 years ago

Could we use URI.parse instead of the regex?

kalafut commented 4 years ago

@Sija @RX14 Thanks for the review! URI.parse does not handle the SSH URIs that are in the scp style (e.g. git@github.com:example/project.git), so something additional was needed. The pair of regexes was compact, but I don't really like adding regexes either if I can avoid it.

I'd also missed @straight-shoota's comment about local_path being broken too for these patterns. As a result, I've redone things a bit to make a small git URI parser that uses URI.parse but can also deal with some other formats, while still returning a URI object for consistency. It still feels a little fragile, but the code is simple and there are no regex's, so it's probably better overall. It can be used for origin_changed? and local_path.

I need to add some other tests but will then push it up for you to look at.

kalafut commented 4 years ago

I've substantially updated the PR based on feedback, and realizing that local_path is also sort of broken. No more ugly regexes either.

straight-shoota commented 4 years ago

The implementation of parse_git_uri could be much simpler:

private def parse_git_uri(raw_uri)
  uri = URI.parse(raw_uri)
  return uri if uri.absolute?

  URI.parse("ssh:#{raw_uri}")
end
kalafut commented 4 years ago

Using .absolute is a good change. I don't see how the recommendation handles the main case of git+ssh urls, however. That would just leave everything in the path. e.g. https://play.crystal-lang.org/#/r/8grl

straight-shoota commented 4 years ago

Yeah, you're right.

kalafut commented 4 years ago

Looks like parse + absolute can get confused too:

uri = URI.parse("github.com:foo/bar") # scheme => github.com, host => nil, path => foo/bar
p uri.absolute? # => true

But that's only If the user (git@...) isn't included. While that style is accepted by git, it would seem to be a pretty unlikely real world case. I'd still be OK with absolute and not handling those user-less ssh URLs.

straight-shoota commented 4 years ago

I suppose shards should allow the exact same formats as accepted by git, so user-less scp-like format example.com:foo/bar should be valid. It seems like git treats opaque URLs as scp-like format, so expanding the condition to uri.absolute? && !uri.opaque? should be sufficient.

kalafut commented 4 years ago

Yep that seems to work OK.

kalafut commented 4 years ago

All comments addressed.

ysbaddaden commented 4 years ago

Thanks!