renovatebot / renovate

Home of the Renovate CLI: Cross-platform Dependency Automation by Mend.io
https://mend.io/renovate
GNU Affero General Public License v3.0

Centralize data to lib/data for easier contributing #29942

Open rarkins opened 2 months ago

rarkins commented 2 months ago

Describe the proposed change(s).

We have a lot of "crowdsourced" data in the repo, spread through different locations.

For example:

We should centralize this into lib/data so that it's easier for one-time or occasional contributors to find the right location to edit.

We should keep this folder as "raw" as possible, with any wrapper code living elsewhere. One topic per file. Files should be in .json format.

There should be a readme in the folder which clearly describes what each file/dataset is for.

RahulGautamSingh commented 2 months ago

The data should be in a form which is as easy for users to contribute as possible, instead of matching our internal formats. e.g. instead of forcing packageRules, we should keep them simpler if possible.

I do not understand this, can you share an example?

rarkins commented 2 months ago

I was thinking "we shouldn't make people write large packageRules if they don't need to" but now I see that most of our data is simple enough, so I will drop this comment.

rarkins commented 1 month ago

I added a requirement that the data be in .json format. @viceice is that still ok for our build process?

viceice commented 1 month ago

should work, maybe use jsonc to allow comments?

rarkins commented 1 month ago

I'm worried that jsonc isn't easily parseable by other ecosystems. For example, we may internalize these in Mend systems using a Java backend.
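To make the concern concrete, here is a minimal sketch of what a strict JSON parser does with a JSONC-style comment. The file contents and the `tryParseStrictJson` helper are hypothetical, made up for illustration; only `JSON.parse` is a real API.

```typescript
// Hypothetical data file content with a JSONC-style line comment.
const withComment = `{
  // why this entry exists
  "example-package": "https://example.com/changelog"
}`;

// The same hypothetical content as strict JSON.
const plain = `{
  "example-package": "https://example.com/changelog"
}`;

type ParseResult = { ok: true; value: unknown } | { ok: false };

// Wraps JSON.parse so that a syntax error becomes a result instead of a throw.
function tryParseStrictJson(text: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false };
  }
}

console.log(tryParseStrictJson(plain).ok);
console.log(tryParseStrictJson(withComment).ok);
```

The strict parser accepts the plain file and rejects the commented one, so any consumer of `.jsonc` files would need a comment-aware parser or a stripping step first.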

HonkingGoose commented 1 month ago

When you're moving and renaming files, remember to update the "edit button links/overrides".

The published docs have an edit button which takes you to the file on GitHub. For some files the edit button tool's default path assumption is wrong, so we have manual overrides. I remember we have overrides for things like the Renovate preset source files, the readme.md files for the Renovate managers, and so on.

RahulGautamSingh commented 1 month ago

I don't think the metadata-manual for the packages has been documented yet.

How should we go about documenting this? Should they be included in the Included Presets section like the rest of the presets?

RahulGautamSingh commented 1 month ago

@HonkingGoose can you review the first draft of the readme file? I am trying to gauge what information should be included for each file. Currently I have added the preset description, why the preset is needed, and how the preset is organized or how a new one is added. For example, monorepos are organized based on sourceUrls and packagePatterns.

Readme

```md
The `lib/data` folder houses a collection of crowdsourced data files (presets) that are useful for various automated actions. Such as, grouping related packages using monorepo presets, replacing renamed packages using the replacements presets or using the manual sourceUrl and changelogUrls to provide changelog urls for the packages which do not include them in their api response.

Below, you'll find detailed information on each file contained in this folder:

1. `monorepo.json`

The monorepo.json file houses all the monorepo presets. These presets are used to group related packages together. The reason why package might be related differs from user-to-user but generally it is done because the packages depend on each other or they are located in the same location (repo or org).

We currently support three methods for grouping packages:

`repoGroups`: Groups packages based on their source repository URLs.
`orgGroups`: Groups packages based on their organization URLs.
`patternGroups`: Groups packages based on their package names.
```

HonkingGoose commented 1 month ago

Hi @RahulGautamSingh

I improved and expanded your draft. The readme is easier to read and has more information now.

Can you please do these todos?

First draft from HonkingGoose

```markdown
# Introduction

The `lib/data` folder has all our crowdsourced data files.
This readme explains what each file is used for.

## Summary

| File                                    | What is the file about?                  |
| --------------------------------------- | ---------------------------------------- |
| `monorepo.json`                         | Group related packages into a single PR. |
| `filename-for-replacement-presets.json` | Rename old packages to new replacement.  |
| `filename-for-changelogs.json`          | Tell Renovate where to find changelogs.  |

## Group related packages (`monorepo.json`)

The `monorepo.json` file has all the monorepo presets.

Monorepo presets group related packages, so they are updated with a single Renovate PR.
We usually group packages that:

- depend on each other, or
- are in the same repository, or
- are in the same organization

### Ways to group packages

There are three ways to group packages:

| I want to group based on | Method          |
| ------------------------ | --------------- |
| Source repository URLs   | `repoGroups`    |
| Organization URLs        | `orgGroups`     |
| Package name(s)          | `patternGroups` |

## Rename old packages

The `filename-for-replacement-presets.json` file has all the replacement presets.

When a package gets renamed, you need to tell Renovate:

- the old package name
- the new package name
- add anything I'm forgetting to list here

## Tell Renovate where to find changelogs

The `filename-for-changelogs.json` has all the changelog information.

Renovate nearly always finds, and displays, the changelog for a package update automatically.
To find the changelog, Renovate needs the:

- URL to the changelog file
- URL to the source

Usually, the API for the package to be updated gives Renovate the correct info.
If this does not happen, for whatever reason, Renovate can not show the changelog.

You can use these config options to let Renovate find the correct changelog:

- [`sourceUrl`](https://docs.renovatebot.com/configuration-options/#sourceurl)
- [`changelogUrl`](https://docs.renovatebot.com/configuration-options/#changelogurl)

Read the [Renovate docs, key concepts page for changelogs](https://docs.renovatebot.com/key-concepts/changelogs/) to learn more about how Renovate fetches and displays changelogs.
```

HonkingGoose commented 1 month ago

How about:

RahulGautamSingh commented 1 month ago

Explain how to use repoGroups, orgGroups and patternGroups. I don't see them in the Renovate docs.

I think the Ways to group packages section you added and a quick glance at the monorepo.json file will be enough for users to figure it out.
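For anyone who wants more than a quick glance, the three grouping methods can be pictured with a rough sketch. It assumes that `repoGroups` entries are matched against the full source URL, `orgGroups` entries as URL prefixes, and `patternGroups` entries as regular expressions on the package name. The data shape, the `findGroups` helper, and every entry below are hypothetical illustrations, not Renovate's real code or data.

```typescript
// Assumed shape of the data file; the real monorepo.json may differ.
interface MonorepoData {
  repoGroups: Record<string, string | string[]>;
  orgGroups: Record<string, string | string[]>;
  patternGroups: Record<string, string | string[]>;
}

// Hypothetical entries, one per grouping method.
const data: MonorepoData = {
  repoGroups: { "example-repo": "https://github.com/example-org/example-repo" },
  orgGroups: { "example-org": "https://github.com/example-org/" },
  patternGroups: { "example-pattern": "^@example/" },
};

interface PackageInfo {
  name: string;
  sourceUrl?: string;
}

function toArray(value: string | string[]): string[] {
  return Array.isArray(value) ? value : [value];
}

// Returns the names of every group the package falls into.
function findGroups(pkg: PackageInfo, groups: MonorepoData): string[] {
  const url = pkg.sourceUrl ?? "";
  const matched: string[] = [];
  for (const [name, urls] of Object.entries(groups.repoGroups)) {
    // repoGroups: exact source repository URL
    if (url !== "" && toArray(urls).includes(url)) matched.push(name);
  }
  for (const [name, prefixes] of Object.entries(groups.orgGroups)) {
    // orgGroups: source URL starts with the organization URL
    if (url !== "" && toArray(prefixes).some((p) => url.startsWith(p))) matched.push(name);
  }
  for (const [name, patterns] of Object.entries(groups.patternGroups)) {
    // patternGroups: package name matches a pattern
    if (toArray(patterns).some((p) => new RegExp(p).test(pkg.name))) matched.push(name);
  }
  return matched;
}

console.log(
  findGroups(
    { name: "@example/core", sourceUrl: "https://github.com/example-org/example-repo" },
    data,
  ),
);
```

A package with no source URL and a non-matching name, such as `{ name: "left-pad" }`, falls into no group at all under these assumptions.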

Here's an updated version of the `readme`:

```md
# Introduction

The `lib/data` folder has all our crowdsourced data files.
This readme explains what each file is used for.

## Summary

| File                | What is the file about?                   |
| ------------------- | ----------------------------------------- |
| `monorepo.json`     | Group related packages into a single PR.  |
| `replacements.json` | Rename old packages to new replacement.   |
| `changelogs.json`   | Tell Renovate where to find changelogs.   |
| `source-urls.json`  | Tell Renovate the source URL of packages. |

## Group related packages (`monorepo.json`)

The `monorepo.json` file has all the monorepo presets.

Monorepo presets group related packages, so they are updated with a single Renovate PR.

### Ways to group packages

There are three ways to group packages:

| Grouping Criteria        | Method          |
| ------------------------ | --------------- |
| Source repository URLs   | `repoGroups`    |
| Organization URLs        | `orgGroups`     |
| Package name patterns(s) | `patternGroups` |

Each method allows you to group related packages based on different criteria:

`repoGroups`: Group packages from the same source repository.
`orgGroups`: Group packages from the same organization.
`patternGroups`: Group packages based on name patterns or prefixes.

## Rename old packages (`replacements.json`)

The `replacements.json` file has all the replacement presets.

When a package gets renamed, you need to tell Renovate:

- the datasource of the package
- the old package name
- the new package name
- the last version available for the old package name
- the first version available for the new package name

## Tell Renovate where to find changelogs (`changelog-urls.json`)

The `changelog-urls.json` has all the changelog information.

Renovate nearly always finds, and displays, the changelog for a package update automatically.
To find the changelog, Renovate needs the:

- Name of the package
- URL to the changelog file

Usually, the API for the package to be updated gives Renovate the correct info.
If this does not happen, for whatever reason, Renovate can not show the changelog.

You can use these config options to let Renovate find the correct changelog:

- [`changelogUrl`](https://docs.renovatebot.com/configuration-options/#changelogurl)

Read the [Renovate docs, key concepts page for changelogs](https://docs.renovatebot.com/key-concepts/changelogs/) to learn more about how Renovate fetches and displays changelogs.

## Tell Renovate where to find source urls (`source-urls.json`)

The `source-urls.json` has the infromation on source URL of multiple packages.

Renovate nearly always finds, and displays, the source for a package update automatically.
Usually, the API for the package to be updated gives Renovate the correct info.
If this does not happen, for whatever reason, Renovate can not link to the source of the package and might not be able to lookup changelogs.

To find the source URL, Renovate needs the:

- Name of the package
- URL to the source

To verify if Renovate can find source URLs for your package:

1. Identify the datasource your package uses.
2. Check the documentation page for that specific datasource.
3. Look for a table in the docs that indicates whether the datasource returns source URLs.

You can use these config options to let Renovate find the correct source URL:

- [`sourceUrl`](https://docs.renovatebot.com/configuration-options/#sourceurl)
```

I have divided the metadata-manual info into 2 json files: changelog-urls.json & source-urls.json for better readability & navigation.
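As a rough picture of how wrapper code outside `lib/data` might read the two split files, here is a minimal sketch. It assumes both files share a "datasource, then package name, then URL" layout; that layout, the `lookupUrl` helper, and all entries are assumptions for illustration and may not match the actual files.

```typescript
// Assumed layout for both files: datasource -> package name -> URL.
type UrlTable = Record<string, Record<string, string>>;

// Hypothetical stand-ins for the parsed contents of the two JSON files.
const sourceUrls: UrlTable = {
  npm: { "example-package": "https://github.com/example-org/example-package" },
};
const changelogUrls: UrlTable = {
  npm: { "example-package": "https://example.com/example-package/CHANGELOG.md" },
};

// Returns the manually maintained URL for a package, or undefined if none is listed.
function lookupUrl(
  table: UrlTable,
  datasource: string,
  packageName: string,
): string | undefined {
  return table[datasource]?.[packageName];
}

console.log(lookupUrl(sourceUrls, "npm", "example-package"));
console.log(lookupUrl(changelogUrls, "npm", "example-package"));
console.log(lookupUrl(sourceUrls, "pypi", "example-package"));
```

Because the two tables have the same shape in this sketch, one helper serves both files, and a contributor only ever edits a flat name-to-URL entry.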

Also, regarding the metadata-manual files. This info is not documented, yet. Should it be documented in the Included Presets section?

HonkingGoose commented 1 month ago

Answers to your questions

I have divided the metadata-manual info into 2 json files: changelog-urls.json & source-urls.json for better readability & navigation.

Good! Please make sure the filenames in the readme are correct!

Also, regarding the metadata-manual files. This info is not documented, yet. Should it be documented in the Included Presets section?

It should at least be documented somewhere. I don't know the best place, so for now put it in the Included Presets section. 😉

Todos

Can you please make these changes?

Put info in table, or make bulleted list

There are three ways to group packages:

| Grouping Criteria        | Method          |
| ------------------------ | --------------- |
| Source repository URLs   | `repoGroups`    |
| Organization URLs        | `orgGroups`     |
| Package name patterns(s) | `patternGroups` |

Each method allows you to group related packages based on different criteria:

repoGroups: Group packages from the same source repository. orgGroups: Group packages from the same organization. patternGroups: Group packages based on name patterns or prefixes.

Please move the explanation into the table. But if that would make the table too big: make a bulleted list for the items. 😉

Fix typo

Please fix the typo: change infromation to information.

Rewrite section

Change this:

To verify if Renovate can find source URLs for your package:

  1. Identify the datasource your package uses.
  2. Check the documentation page for that specific datasource.
  3. Look for a table in the docs that indicates whether the datasource returns source URLs.

You can use these config options to let Renovate find the correct source URL:

Into this:

To check if Renovate can find the source URLs for your package:

1. Find the datasource for your package.
1. Read the Renovate docs for the datasource.
1. Look for a table in the docs that shows if the datasource returns source URLs.

If Renovate does not find the right source URLs automatically: use the [`sourceUrl` config option](https://docs.renovatebot.com/configuration-options/#sourceurl).

RahulGautamSingh commented 1 month ago

I have created PRs for all files mentioned in the description. How should we proceed further?

viceice commented 1 month ago

  1. create a JSON Schema for better contributor UX
  2. validate data files against the schema when linting
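The second step could look roughly like the sketch below. It is a hand-rolled check standing in for a real JSON Schema validator, and it assumes a replacement entry with the five items listed in the readme draft. The field names, the `validateEntry` helper, and the sample entry are all hypothetical.

```typescript
// Hypothetical field names, based on the five items in the readme draft.
const requiredFields = [
  "datasource",
  "oldName",
  "newName",
  "lastOldVersion",
  "firstNewVersion",
] as const;

// Returns a list of problems; an empty list means the entry passed.
function validateEntry(entry: unknown): string[] {
  if (typeof entry !== "object" || entry === null || Array.isArray(entry)) {
    return ["entry must be an object"];
  }
  const record = entry as Record<string, unknown>;
  const errors: string[] = [];
  for (const field of requiredFields) {
    const value = record[field];
    if (typeof value !== "string" || value.length === 0) {
      errors.push(`"${field}" must be a non-empty string`);
    }
  }
  for (const key of Object.keys(record)) {
    // Reject unknown keys so typos in field names are caught at lint time.
    if (!(requiredFields as readonly string[]).includes(key)) {
      errors.push(`unknown field "${key}"`);
    }
  }
  return errors;
}

// Hypothetical entry, as it might come out of JSON.parse on a data file.
const goodEntry: unknown = {
  datasource: "npm",
  oldName: "old-example",
  newName: "@example/new",
  lastOldVersion: "1.9.0",
  firstNewVersion: "2.0.0",
};

console.log(validateEntry(goodEntry));
console.log(validateEntry({ datasource: "npm", oldNam: "typo" }));
```

A lint step would run a check like this over every entry and fail the build when any list of errors is non-empty. A published JSON Schema would additionally give contributors editor autocompletion, which is the UX benefit in step 1.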