renovatebot / renovate

Home of the Renovate CLI: Cross-platform Dependency Automation by Mend.io
https://mend.io/renovate
GNU Affero General Public License v3.0

Initial Haskell support #8187

Open pbrisbin opened 3 years ago

pbrisbin commented 3 years ago

What would you like Renovate to be able to do?

I would like renovate-bot to support Haskell, specifically Stackage-based projects using an LTS resolver and extra dependencies from Hackage:

Resolver

An overall configuration point in a Stackage-based Haskell project is the resolver. For example, resolver: lts-major.minor. I would like renovate-bot to find this and suggest an update to (for example) resolver: lts-major.(minor + 1) when it becomes available. There are more resolvers than these LTS versions, but this is our initial use-case.
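As a rough illustration of the resolver bump described above, here is a minimal sketch. It assumes the list of available resolver names has already been fetched from somewhere; `parseLtsResolver` and `newestMinorUpdate` are hypothetical names, not existing Renovate functions, and only the `lts-major.minor` form is handled.

```typescript
// Hypothetical helper types and names; not part of Renovate.
interface LtsResolver {
  major: number;
  minor: number;
}

// Parse "lts-16.25" into its numeric parts. Anything else (other snapshot
// kinds, custom snapshot paths) yields null.
function parseLtsResolver(value: string): LtsResolver | null {
  const match = /^lts-(\d+)\.(\d+)$/.exec(value.trim());
  if (match === null) {
    return null;
  }
  return { major: Number(match[1]), minor: Number(match[2]) };
}

// Given the current resolver and the available resolver names, return the
// newest LTS within the same major series, or null if nothing is newer.
function newestMinorUpdate(current: string, available: string[]): string | null {
  const cur = parseLtsResolver(current);
  if (cur === null) {
    return null;
  }
  let best: LtsResolver = cur;
  for (const candidate of available) {
    const parsed = parseLtsResolver(candidate);
    if (parsed !== null && parsed.major === cur.major && parsed.minor > best.minor) {
      best = parsed;
    }
  }
  return best.minor > cur.minor ? `lts-${best.major}.${best.minor}` : null;
}
```

With `current` set to `lts-16.25` and `lts-16.27` among the available names, this would suggest `lts-16.27`; a jump to a new major series is deliberately left out of this sketch.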

Hackage

While choosing a resolver defines a strict set of exact versions available, you can make additional dependencies available via extra-deps:

extra-deps:
  - some-package-x.y.z@sha256:sha,size

I would like renovate-bot to compare this to what's available on Hackage and suggest updates accordingly. There are other kinds of dependencies that can be specified here, and the sha256 verification is optional, but this is our initial use-case.
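To make the `extra-deps` entry format concrete, here is a minimal sketch of splitting a Hackage-style entry into its parts. The function and interface names are hypothetical, and the regular expression is an assumption based only on the `name-x.y.z@sha256:sha,size` shape shown above (with the hash and size treated as optional); other entry kinds are not handled.

```typescript
// Hypothetical shape for a parsed Hackage-style extra-deps entry.
interface HackageExtraDep {
  packageName: string;
  version: string;
  sha256?: string;
  size?: number;
}

// Split "name-1.2.3@sha256:<hex>,<size>" into its parts. The version is taken
// to be the trailing dotted-number segment after the last matching hyphen.
function parseHackageExtraDep(entry: string): HackageExtraDep | null {
  const pattern = /^(.+)-(\d+(?:\.\d+)*)(?:@sha256:([0-9a-f]{64})(?:,(\d+))?)?$/;
  const match = pattern.exec(entry.trim());
  if (match === null) {
    return null;
  }
  const [, packageName, version, sha256, size] = match;
  if (packageName === undefined || version === undefined) {
    return null;
  }
  const dep: HackageExtraDep = { packageName, version };
  if (sha256 !== undefined) {
    dep.sha256 = sha256;
  }
  if (size !== undefined) {
    dep.size = Number(size);
  }
  return dep;
}
```

Because package names can themselves contain hyphens, the greedy name group is what lets `bugsnag-haskell-0.0.4.0` split into `bugsnag-haskell` and `0.0.4.0`.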

Did you already have any implementation ideas?

YAML configuration can be declared directly in a file, named stack.yaml by default:

resolver: lts-16.25
extra-deps:
  - bugsnag-haskell-0.0.4.0@sha256:44a48e25058e45c8cc351355f6a601d9b4bc6c9895e61e84b4dcd3366302a26b,7395

(This path can be changed by configuration, so renovate-bot would have to support that)

Or stack.yaml can point to a different file, with slightly different keys:

# stack.yaml
resolver: ./snapshot.yaml
# snapshot.yaml
resolver: lts-16.25
packages:
  - bugsnag-haskell-0.0.4.0@sha256:44a48e25058e45c8cc351355f6a601d9b4bc6c9895e61e84b4dcd3366302a26b,7395

We would need renovate-bot to propose a change like this:

  # snapshot.yaml
- resolver: lts-16.25
+ resolver: lts-16.27
  packages:
-   - bugsnag-haskell-0.0.4.0@sha256:44a48e25058e45c8cc351355f6a601d9b4bc6c9895e61e84b4dcd3366302a26b,7395
+   - bugsnag-haskell-0.0.4.1@sha256:...,...

A changelog can also be scraped from https://hackage.haskell.org/package/{name}-{version}/changelog, and the https://www.stackage.org/lts?_accept=application/json response includes any changed dependencies, for which changelogs can also be scraped. It would be amazing to include these changelog details in the renovate PR. We have a project that does some of this work here.


I'm willing to try and contribute this support, but I'm having trouble navigating the project enough to know where to add all the moving parts. Would you be able to offer some guidance, given the information above?

rarkins commented 3 years ago

Thanks @pbrisbin for the proposal.

Logically here's how we'd approach an implementation:

Datasources

Is there a need to access new APIs to retrieve versions of packages? In this case it seems like Hackage is needed. It would go into lib/datasource/hackage/*. There are a lot of existing datasources that could be used as inspiration. It looks like it would need to support digests too.

Versioning

Does Haskell use its own syntax to define versions or constraints/ranges? I'm seeing "pinned" versions with digests in your examples so maybe our "loose" versioning would work fine for things like 0.0.4.1.

Manager

The minimum you need to implement is a manager (e.g. lib/manager/stackage?) with an extractPackageFile() function. With a YAML parser, it should hopefully be simple.
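To give a feel for the extraction step, here is a minimal sketch. It is not Renovate's actual manager interface: the `StackDependency` and `StackPackageFile` shapes and the `extractStackYaml` name are hypothetical, and a naive line scan stands in for a real YAML parser purely to keep the example self-contained.

```typescript
// Hypothetical shapes, loosely modelled on "a list of deps per package file".
interface StackDependency {
  depType: "resolver" | "extra-deps";
  rawValue: string;
}

interface StackPackageFile {
  deps: StackDependency[];
}

// Naive line-based scan of stack.yaml content. A real implementation would
// use a YAML parser; this only understands the simple layout shown earlier.
function extractStackYaml(content: string): StackPackageFile | null {
  const deps: StackDependency[] = [];
  let inExtraDeps = false;
  for (const line of content.split(/\r?\n/)) {
    if (/^\s*#/.test(line)) {
      continue; // skip comment lines
    }
    const resolverMatch = /^resolver:\s*(\S+)\s*$/.exec(line);
    if (resolverMatch !== null && resolverMatch[1] !== undefined) {
      deps.push({ depType: "resolver", rawValue: resolverMatch[1] });
      inExtraDeps = false;
      continue;
    }
    if (/^extra-deps:\s*$/.test(line)) {
      inExtraDeps = true;
      continue;
    }
    // Plain string list items only; map-style items such as "- git: ..." are skipped.
    const itemMatch = /^\s*-\s+(\S+)\s*$/.exec(line);
    if (inExtraDeps && itemMatch !== null && itemMatch[1] !== undefined) {
      deps.push({ depType: "extra-deps", rawValue: itemMatch[1] });
      continue;
    }
    // Any other top-level key ends the extra-deps list.
    if (/^\S/.test(line)) {
      inExtraDeps = false;
    }
  }
  return deps.length > 0 ? { deps } : null;
}
```

Returning null when nothing is found mirrors the common "no dependencies in this file" convention, which is an assumption here rather than something confirmed in this thread.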

Are there any separate "artifacts" files that need updating, e.g. checksum or lock files?

Changelog

Currently our changelog only supports github and gitlab sources, but it sounds like they are published to the registry itself for Hackage? Do any packages also publish to a changelog file in their github source?

pbrisbin commented 3 years ago

Does Haskell use its own syntax to define versions or constraints/ranges?

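It does, called the PVP (Package Versioning Policy). In short, it defines a 4-part version A.B.C.D where A.B is the major version, C is the minor version, and D is the patch level.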

In case it's helpful, in my own packages, I set A to something like 0 or 1 and forget about it, then I just treat B.C.D as the typical Major.Minor.Patch of SemVer. This simplifies things for me, and the result still follows the PVP rules.

I'm seeing "pinned" versions with digests

Yes, a stack.yaml is effectively a "freeze file". Even the resolver bit is just an alias to a big set of pinned versions of available packages. So you are meant to put complete and explicit versions in it when specifying your own in extra-deps. It doesn't support bounds such as ~0.4 or anything like that. The digests are optional, but good practice.

(Note that this is just one way to specify dependencies in a Haskell project, so building this support in the way I describe would only support Haskell projects that choose this tooling. I'm happy to describe other approaches, if you're interested in understanding the broader space.)

Is there any separate "artifacts" files that need updating, e.g. checksum or lock files?

Ah yes, in fact. There is a stack.yaml.lock. I'm not sure how to update it after changing stack.yaml other than having a Haskell (stack) tool-chain available.

published to the registry itself for Hackage? Do any packages also publish to a changelog file in their github source?

Sort of both:

(Project READMEs are handled similarly.)

rarkins commented 3 years ago

I think we'll need a new versioning to handle A.B as major. Otherwise, if you use our loose versioning then it would consider changes to C to only be patch. It makes it easier though if you only need to deal with exact versions and not ranges.
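To illustrate the difference, here is a minimal sketch of classifying a change between two exact PVP versions, with A.B treated as major, C as minor, and anything after that as patch. The names `parsePvp` and `pvpChangeType` are hypothetical, and treating missing trailing components as 0 is a simplifying assumption of this sketch.

```typescript
type PvpChangeType = "major" | "minor" | "patch" | "none";

// Parse "0.0.4.1" into [0, 0, 4, 1]; return null for anything that is not
// a dot-separated list of non-negative integers.
function parsePvp(version: string): number[] | null {
  if (!/^\d+(\.\d+)*$/.test(version)) {
    return null;
  }
  return version.split(".").map(Number);
}

// Classify the difference between two exact versions:
// A.B differs -> major, C differs -> minor, any later part differs -> patch.
function pvpChangeType(from: string, to: string): PvpChangeType | null {
  const a = parsePvp(from);
  const b = parsePvp(to);
  if (a === null || b === null) {
    return null;
  }
  // Assumption: a missing component is compared as if it were 0.
  const part = (v: number[], i: number): number => v[i] ?? 0;
  if (part(a, 0) !== part(b, 0) || part(a, 1) !== part(b, 1)) {
    return "major";
  }
  if (part(a, 2) !== part(b, 2)) {
    return "minor";
  }
  const len = Math.max(a.length, b.length);
  for (let i = 3; i < len; i++) {
    if (part(a, i) !== part(b, i)) {
      return "patch";
    }
  }
  return "none";
}
```

Under this scheme a move from 0.0.4.0 to 0.0.5.0 counts as minor, where a three-part reading of the trailing components would call it patch.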

If there's a lock file then it means we'd want to support "artifacts" updating like quite a few other managers.

For changelogs, they might "just work" if the majority of packages (a) define their source repo in metadata, and (b) include a changelog file. It doesn't look like we'll need to use the registry's changelogs.

I'd recommend doing this in this order, in separate PRs:

  1. Versioning
  2. Datasource
  3. Extract + Artifacts

Steps 1 & 2 won't really achieve much on their own, but it's easier to do this incrementally than all in one go.

pbrisbin commented 3 years ago

Awesome, thanks. Granular steps make total sense, even if the first few don't actually get us there.

github-actions[bot] commented 2 years ago

Hi there,

You're requesting support for a new package manager. We need to know some basic information about this package manager first. Please copy/paste the new package manager questionnaire, and fill it out in full.

Once the questionnaire is filled out we will evaluate if adding support for this manager is something we want to do.

Good luck,

The Renovate team

pbrisbin commented 2 years ago

NOTE: I'm answering this questionnaire for the feature of specifically stack-related dependencies. If we want to automatically manage version bounds in a non-Stack project, or library making use of such bounds in its *.cabal file, things are more complicated.

New package manager questionnaire

Did you read our documentation on adding a package manager?

Basics

Name of package manager

Stack.

What language does this support?

Haskell.

How popular is this package manager?

Probably >50% of Haskell projects use it.

Does this language have other (competing?) package managers?


Package File Detection

What type of package files and names does it use?

What fileMatch pattern(s) should be used?

Can ${STACK_YAML:-stack.yaml} be used? If not, hard-coded stack.yaml should be sufficient -- let the user configure otherwise.

Is it likely that many users would need to extend this pattern for custom file names?

Possible, but not likely, IMO.

Is the fileMatch pattern likely to get many "false hits" for files that have nothing to do with package management?

No.


Parsing and Extraction

Can package files have "local" links to each other that need to be resolved?

Yes, stack.yaml can optionally use a "custom snapshot" by resolver: ./<other-yaml>.

Is there a reason why package files need to be parsed together (in serial) instead of independently?

I can't think of one.

What format/syntax is the package file in?

How do you suggest parsing the file?

Does the package file structure distinguish between different "types" of dependencies? e.g. production dependencies, dev dependencies, etc?

List all the sources/syntaxes of dependencies that can be extracted

In stack.yaml, resolver: x is either a named (remote) snapshot or path to a custom snapshot. For the case of a named (remote) snapshot, Renovate would not need to dig into specific dependencies but look at x itself and see if it's out of date and update it accordingly. The snapshot is itself the dependency in a way.

In the case of a path to a custom snapshot, you will find in that file a packages: key of the following syntax:

- {name}[-{version}[@sha256:{sha}]]

- git: {repo}
  commit: {sha}

- ./<path>

stack.yaml (not the custom snapshot) can also contain a key extra-deps, which is a list of the same syntax to overlay over what came from above. These individual items would need to be inspected to be updated based on what they are.
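Since each item needs different handling, a small dispatch step could sort them by kind first. The following is a minimal sketch that assumes the YAML has already been parsed into plain values; the type and function names are hypothetical, and recognising local paths by a leading `./` or `../` is an assumption based on the `./<path>` form above.

```typescript
// An item as it might look after YAML parsing: either a plain string or a
// git mapping. These shapes are assumptions for this sketch.
type SnapshotEntry = string | { git: string; commit: string };

type ClassifiedEntry =
  | { kind: "hackage"; spec: string }
  | { kind: "git"; repo: string; commit: string }
  | { kind: "local-path"; path: string };

// Decide which of the three syntaxes an item uses.
function classifySnapshotEntry(entry: SnapshotEntry): ClassifiedEntry {
  if (typeof entry !== "string") {
    return { kind: "git", repo: entry.git, commit: entry.commit };
  }
  // Assumption: local paths are written relative, starting with ./ or ../
  if (entry.startsWith("./") || entry.startsWith("../")) {
    return { kind: "local-path", path: entry };
  }
  return { kind: "hackage", spec: entry };
}
```

Only the `hackage` kind would then go through a Hackage lookup; `git` items would need a commit-based lookup and `local-path` items have nothing to update.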

Describe which types of dependencies above are supported and which will be implemented in future


Versioning

What versioning scheme does the package file(s) use?

Snapshots use two formats:

Non-git packages/extra-deps items follow PVP, which is A.B.C.D where A.B is the major version, C is the minor version, and D is the patch level.

Does this versioning scheme support range constraints, e.g. ^1.0.0 or 1.x?

Not in this file. Items are specified fully, even down to checksum!

Is this package manager used for applications, libraries, or both? If both, is there a way to tell which is which?

A library commonly uses version bounds in its project files, for when it's built by someone other than the developers of the project itself, and perhaps across a wide range of dependencies. This feature request is not concerned with such bounds.

An app commonly uses no version bounds in its project file, relying on the fact that it would only be built by the project maintainers themselves, in a Stack-managed context, where the set of available dependencies is fixed by the stack.yaml.

I think either of these scenarios can be handled the same by Renovate.

If ranges are supported, are there any cases when Renovate should pin ranges to exact versions if rangeStrategy=auto?

N/A


Lookup

Is a new datasource required? Provide details

Available resolvers can be found on https://www.stackage.org. There are some JSON endpoints, but parsing HTML may be required at points.

Non-git packages/extra-deps would come from https://hackage.haskell.org. Again, there are some JSON endpoints, parsing HTML may be required, or you could download "the index", which is a gzipped tarball of information about all available dependencies. I would avoid this, personally, as it's a ton of data to pull and parse when you only need information about a subset of packages.

Will users need the capability to specify a custom host/registry to look up? Can it be found within the package files, or within other files inside the repository, or would it require Renovate configuration?

No.

Do the package files contain any "constraints" on the parent language (e.g. supports only v3.x of Python) or platform (Linux, Windows, etc) that should be used in the lookup procedure?

There is such a feature in Haskell, but I think Renovate could ignore it for now.

Will users need the ability to configure language or other constraints using Renovate config?

Not that I can think of.


Artifacts

Are lock files or checksum files used? Are they mandatory?

Yes. Not mandatory, but highly recommended.

If so, what tool and exact commands should be used if updating one or more package versions in a dependency file?

stack build --dry-run will recreate stack.yaml.lock after a change to stack.yaml.

If applicable, describe how the tool maintains a cache and if it can be controlled via CLI or env? Do you recommend the cache be kept or disabled/ignored?

stack --work-dir x can control the latter, the former is fixed.

If applicable, what command should be used to generate a lock file from scratch if you already have a package file? This will be used for "lock file maintenance"

Same, stack build --dry-run

Other

Is there anything else to know about this package manager?

More details can be found in previous comments on this Issue.

domenkozar commented 1 year ago

Existing implementation: https://github.com/nomeata/haskell-bounds-bump-action

mihaimaruseac commented 7 months ago

Any update here? I was mistakenly thinking that Haskell support was already implemented :(

ysangkok commented 2 months ago

Since part of this discussion is about adding a Hackage data source, I thought I'd link #31434, which contains such a data source. Since the data source is necessary for both Cabal and Stack support, maybe we can consolidate our efforts.

The PR was closed because we need to discuss requirements first. @secustor requested that I discuss the manager first, so that is what I am focusing on in https://github.com/renovatebot/renovate/discussions/31493 .

@rarkins Wouldn't it be ok if an initial version of the Hackage data source didn't support digests? Hash pinning is optional in stack.yaml. I'd argue that most dependencies are declared with a numeric version bound, and not with hash pinning.

@pbrisbin Are you still interested in implementing this? What do you think about the data source I drafted, do you think it could help you?

viceice commented 2 months ago

You should start on versioning, then datasource, and finally manager.

ysangkok commented 2 months ago

@viceice Your order is inconsistent with what was suggested to me in #31434 (which is to discuss the manager first). Also, not everyone uses PVP. So I don't understand why versioning needs to be done first.

wolfgangwalther commented 2 months ago

Your order is inconsistent with what was suggested to me in #31434 (which is to discuss the manager first).

I read https://github.com/renovatebot/renovate/pull/31434#issuecomment-2362130840 differently. It says "discuss manager first, then implement version, datasource, manager in that order in separate PRs".

ysangkok commented 1 week ago

I think #32298 with PVP support might get merged soon-ish, and I have the datasource PR ready, which I will submit at that point. It will use Hackage endpoints like https://hackage.haskell.org/package/base.json, and it would only use the keys of the object. This endpoint isn't currently documented anywhere, but since I work on hackage-server, I could add that documentation. I'd probably add a manually written OpenAPI spec. The datasource PR would be fairly small; it's basically just a map over the JSON keys, with little branching.
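For readers following along, the "map over the JSON keys" idea might look roughly like the sketch below. It is not the code from the PR: the `HackageRelease` shape and the `releasesFromHackageJson` name are hypothetical, the response is assumed to be a JSON object keyed by version string, and the values are ignored, as described above. Fetching the endpoint is left out so the example stays self-contained.

```typescript
// Hypothetical release shape; a real datasource would likely carry more fields.
interface HackageRelease {
  version: string;
}

// Turn an already-parsed package JSON object into a list of releases,
// using only the object's keys as version strings.
function releasesFromHackageJson(body: Record<string, unknown>): HackageRelease[] {
  return Object.keys(body).map((version) => ({ version }));
}

// Illustrative input only: the keys stand in for version strings and the
// values are placeholders, since only the keys are read.
const exampleBody: Record<string, unknown> = {
  "4.19.0.0": "ignored",
  "4.20.0.0": "ignored",
};
const exampleReleases = releasesFromHackageJson(exampleBody);
```

Keeping the function pure, with the HTTP call outside it, is a design choice of this sketch that makes the mapping easy to test with fixture data.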