crystal-lang / shards

Dependency manager for the Crystal language
Other
765 stars 102 forks source link

Rework of dependency and requirements #354

Closed waj closed 4 years ago

waj commented 4 years ago

This is a refactor of dependency requirements (specifications of which versions are suitable for each dependency).

Currently, each dependency has different fields (version, branch, tag, commit) that declares the requirement. But this model is imprecise (allows invalid states), difficult to reason in code (version is sometimes a pattern, others an exact version) and requires constant parsing of the version field (for example to differentiate patterns, or exact version).

With this change, each Dependency now has only three fields: name, resolver and requirement. Also note that the relationship between Dependency and Resolver is now inverted. (In a future refactor I also plan to remove the name from Resolver making it totally independent from the dependency that it pretends to install).

Requirement is actually a union of these types (structs):

Currently we only have only one resolver (git) that defines subclasses of Ref:

The parsing of Ref is delegated to the resolver. Also, printing of exact versions is delegated to resolvers, thus producing better output without introducing ad-hoc code.

straight-shoota commented 4 years ago

The internal refactoring looks nice.

But I'm hesitant about the changes to parsing dependencies. They violate quite a few properties that have been established in #306 and previous discussions. For example, it was agreed that extraneaous attributes in dependency mappings should be ignored to enable forward compatibility. And that YAML parsing should be isolated and not depend on resolvers. Requiring the resolver url to be the first entry in the dependency mapping is also a huge breaking change. I can't see how we could possibly accept that. The SPEC does not define a specific order of keys, so the implementation must not assume any. Now this breaks with all of that, and I don't see any immediate benefit wich would justify deriving from the previous decisions. Maybe you can shed some light on that? Is it necessary to break behaviour and why?

I've looked at the dependency mappings available on shardbox.org to get some stats: There are 720 dependencies (about 11%) where the resolver is not the first k/v pair. These would be broken with this change. When taking only dependencies defined by latest releases into account, the number is still 122 broken dependencies (but that's even 20%).

Additional, undefined properties on the other hand are less frequent. There are only 35 instances (about 0,5 %) and they all seem to be misspellings of version. So in this regard, a breaking change would be acceptable. But it would break forward compatibility in case we intend to introduce additional properties at some point.

The stats only take dependencies into account. I didn't look at dev_dependencies as I don't think additional data is necessary.

waj commented 4 years ago

Big oops! Thank you @straight-shoota for being right there. I thought it could be acceptable to break compatibility because fixing the yml would be an easy thing, but that would also break the parsing of previous versions which is not acceptable. I don't know what I was thinking... sorry and thanks.

I'll change the parsing to keep it compatible. Probably by parsing each dependency attributes to a hash, figure out the resolver type and pass that hash to the resolver to define the requirement.

Now, what do we win by allowing the resolver key to not be the first entry in each map? Maybe it's late to change that, but I think it improves readability of the spec.yml.

RX14 commented 4 years ago

I just want to say having shardbox to give statistics about these kinds of changes has been amazing and affected real data-driven change to shards. Thank you so much @straight-shoota!

waj commented 4 years ago

Absolutely, shardbox is awesome!

I just pushed the changes that revert the (in)compatibility of the parsing.

I also replaced the Dependency.new(pull) with Dependency.from_yaml because I didn't like that little game with tuples to workaround issues with nilables.

bcardiff commented 4 years ago

I've just noticed the lock file generated after this refactor as follows. Which is semantically fine, but just wondering if the change is intended or not.

version: 1.0
shards:
  molinillo:
---    github: crystal-lang/crystal-molinillo
+++    git: https://github.com/crystal-lang/crystal-molinillo.git
    version: 0.1.0
straight-shoota commented 4 years ago

@bcardiff I guess it doesn't matter much either way. Using explicit git is a bit more verbose, but might have a slight advantage towards forward compatibility. If we add more shortcut keys, older shards version would still be able to install from the lock file.

bcardiff commented 4 years ago

Yes, I agree it can still work and it might even be a good choice. But since it wasn't stated as an intended change, I wanted to double-check.

waj commented 4 years ago

Actually that changed since #306, not my fault 😇

I think it's ok, if we think the lock file is saving a normalized definition of the dependencies

straight-shoota commented 4 years ago

I would have appreciated to get a chance at properly reviewing the updated changes :/