Closed waj closed 4 years ago
The internal refactoring looks nice.
But I'm hesitant about the changes to parsing dependencies. They violate quite a few properties that have been established in #306 and previous discussions. For example, it was agreed that extraneaous attributes in dependency mappings should be ignored to enable forward compatibility. And that YAML parsing should be isolated and not depend on resolvers. Requiring the resolver url to be the first entry in the dependency mapping is also a huge breaking change. I can't see how we could possibly accept that. The SPEC does not define a specific order of keys, so the implementation must not assume any. Now this breaks with all of that, and I don't see any immediate benefit wich would justify deriving from the previous decisions. Maybe you can shed some light on that? Is it necessary to break behaviour and why?
I've looked at the dependency mappings available on shardbox.org to get some stats: There are 720 dependencies (about 11%) where the resolver is not the first k/v pair. These would be broken with this change. When taking only dependencies defined by latest releases into account, the number is still 122 broken dependencies (but that's even 20%).
Additional, undefined properties on the other hand are less frequent. There are only 35 instances (about 0,5 %) and they all seem to be misspellings of version
. So in this regard, a breaking change would be acceptable. But it would break forward compatibility in case we intend to introduce additional properties at some point.
The stats only take dependencies
into account. I didn't look at dev_dependencies
as I don't think additional data is necessary.
Big oops! Thank you @straight-shoota for being right there. I thought it could be acceptable to break compatibility because fixing the yml would be an easy thing, but that would also break the parsing of previous versions which is not acceptable. I don't know what I was thinking... sorry and thanks.
I'll change the parsing to keep it compatible. Probably by parsing each dependency attributes to a hash, figure out the resolver type and pass that hash to the resolver to define the requirement.
Now, what do we win by allowing the resolver key to not be the first entry in each map? Maybe it's late to change that, but I think it improves readability of the spec.yml.
I just want to say having shardbox to give statistics about these kinds of changes has been amazing and affected real data-driven change to shards. Thank you so much @straight-shoota!
Absolutely, shardbox is awesome!
I just pushed the changes that revert the (in)compatibility of the parsing.
I also replaced the Dependency.new(pull)
with Dependency.from_yaml
because I didn't like that little game with tuples to workaround issues with nilables.
I've just noticed the lock file generated after this refactor as follows. Which is semantically fine, but just wondering if the change is intended or not.
version: 1.0
shards:
molinillo:
--- github: crystal-lang/crystal-molinillo
+++ git: https://github.com/crystal-lang/crystal-molinillo.git
version: 0.1.0
@bcardiff I guess it doesn't matter much either way. Using explicit git
is a bit more verbose, but might have a slight advantage towards forward compatibility. If we add more shortcut keys, older shards version would still be able to install from the lock file.
Yes, I agree it can still work and it might even be a good choice. But since it wasn't stated as an intended change, I wanted to double-check.
Actually that changed since #306, not my fault 😇
I think it's ok, if we think the lock file is saving a normalized definition of the dependencies
I would have appreciated to get a chance at properly reviewing the updated changes :/
This is a refactor of dependency requirements (specifications of which versions are suitable for each dependency).
Currently, each dependency has different fields (
version
,branch
,tag
,commit
) that declares the requirement. But this model is imprecise (allows invalid states), difficult to reason in code (version
is sometimes a pattern, others an exact version) and requires constant parsing of theversion
field (for example to differentiate patterns, or exact version).With this change, each
Dependency
now has only three fields:name
,resolver
andrequirement
. Also note that the relationship betweenDependency
andResolver
is now inverted. (In a future refactor I also plan to remove thename
fromResolver
making it totally independent from the dependency that it pretends to install).Requirement
is actually a union of these types (structs):Version
: represents an exact version already resolved. Used in dependencies read from lock files and internally during dependency resolution.VersionReq
: a version pattern. This corresponds to theversion
field entered in theshard.yml
Ref
: an abstract struct representing any reference specific for the resolverAny
: a wildcard, specifying that any version is suitableCurrently we only have only one resolver (git) that defines subclasses of
Ref
:GitBranchRef
: corresponds to thebranch
attribute inshard.yml
GitTagRef
: corresponds to thetag
attribute inshard.yml
GitCommitRef
: corresponds to thecommit
attribute inshard.yml
GitHeadRef
: used internally when there is no version specified and no tagsThe parsing of
Ref
is delegated to the resolver. Also, printing of exact versions is delegated to resolvers, thus producing better output without introducing ad-hoc code.