Why do we maintain our own changelog generator?

bryceosterhaus commented 4 years ago

I saw we added a new package and added some new features for it that essentially mimics the same functionality as conventional commits. I am curious why we need our own tool and why we don't just leverage that instead of rolling our own?

I read the README on the package but it still wasn't totally clear to me why we would replicate the features from CC and not leverage it.

Anyway, I'm likely missing some background on this, so forgive my ignorance. I thought it might be helpful to have this explanation on the repo as well.

wincent commented 4 years ago

Why do we maintain our own changelog generator?

Because the one that we were previously using in liferay-js-themes-toolkit was horribly broken. It routinely failed to produce changelogs at all because it would hit GitHub API rate limits, and even when the rate limits didn't apply, it was confused by the multiple active branches (because we were cutting 8.x and 9.x releases at the same time).

I figured we could make something much simpler and more reliable by looking at the Git commit graph instead of connecting to the GitHub API. The idea is to just look at the merge commits, extract the descriptions from them (which correspond to the PR titles), and use those to make the changelog. The result was a single, short changelog.js file with zero dependencies, which got merged into liferay-js-themes-toolkit in https://github.com/liferay/liferay-js-themes-toolkit/pull/239 (March 2019).
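To make the idea concrete, here is a minimal sketch of that approach. It is not the actual changelog.js. It assumes GitHub-style merge commits (subject "Merge pull request #N from ...", with the PR title as the first non-empty line of the body) and input produced by something like `git log --merges --format=%s%n%b%x00`. The function name `parseMergeCommits` and the record layout are hypothetical, chosen only for illustration.

```javascript
// Hypothetical sketch, not the real changelog.js.
// Input: text as produced by (assumed invocation):
//   git log --merges --format=%s%n%b%x00
// i.e. for each merge commit: subject, newline, body, then a NUL byte.
const MERGE_SUBJECT = /^Merge pull request #(\d+) from (\S+)/;

function parseMergeCommits(logOutput) {
  return logOutput
    .split('\0')
    .map((record) => record.trim())
    .filter(Boolean)
    .map((record) => {
      const [subject, ...bodyLines] = record.split('\n');
      const match = MERGE_SUBJECT.exec(subject);
      if (!match) {
        // Not a PR merge (e.g. "Merge branch 'master' into topic").
        return null;
      }
      // Assumption: the first non-empty body line is the PR title.
      const title = bodyLines.map((line) => line.trim()).find(Boolean);
      if (!title) {
        return null;
      }
      return {number: Number(match[1]), title};
    })
    .filter(Boolean);
}
```

Because everything comes from the local commit graph, a sketch like this never touches the network, so API rate limits cannot affect it.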

We started informally using the changelog.js in Alloy Editor in https://github.com/liferay/alloy-editor/wiki/Contributing/8e480aba9b1e5dd4042615093ee60f3ed7ba3e74 (April 2019), and in eslint-config-liferay in https://github.com/liferay/eslint-config-liferay/commit/5a9063b455248a7eba4ccab02f91d4e6be3708fd (June 2019). It proved useful and reliable enough there, so I decided to extract it into an actual NPM package here as liferay-changelog-generator in https://github.com/liferay/liferay-npm-tools/pull/222 (September 2019).

It continues to be a single file with zero dependencies.

essentially mimics the same functionality as conventional commits. I am curious why we need our own tool and why we don't just leverage that instead of rolling our own?

Conventional Commits is a specification and not a tool. We ask that people follow the spec in our guidelines, and use the Semantic Pull Requests bot to provide enforcement.

We started following the specification in our projects around the same time as we made the changelog generator: around March 2019 in this repo and in Alloy Editor, around April 2019 in liferay-js-themes-toolkit and in Clay, and around June 2019 in eslint-config-liferay.

Soon after starting to use Conventional Commits, I created https://github.com/liferay/liferay-js-themes-toolkit/issues/258 (March 2019) to track extracting the Conventional Commit metadata from the PR titles to create richer changelogs. I always knew it would be a trivial change (and in the end, it was) because it just means grouping the changelog entries together based on their type ("feat", "fix" etc), which was easily done.
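As a rough illustration of why the grouping is a small change, here is a sketch that buckets PR titles by their Conventional Commits type. It is an assumption-laden example rather than the code in liferay-changelog-generator: the function name `groupByType`, the "other" bucket for titles without a type prefix, and the choice to drop the optional scope are all hypothetical.

```javascript
// Hypothetical sketch, not the real implementation.
// Matches "type: description", "type(scope): description",
// and the breaking-change form "type!: description".
const TYPE_PREFIX = /^(\w+)(?:\([^)]*\))?!?:\s*(.+)$/;

function groupByType(titles) {
  const groups = new Map();
  for (const title of titles) {
    const match = TYPE_PREFIX.exec(title);
    // Titles that do not follow the spec land in an assumed "other" bucket.
    const type = match ? match[1].toLowerCase() : 'other';
    const description = match ? match[2] : title;
    if (!groups.has(type)) {
      groups.set(type, []);
    }
    groups.get(type).push(description);
  }
  return groups;
}

const groups = groupByType([
  'feat: add monorepo support',
  'fix(cli): handle missing tags',
  'Update README',
]);
```

Here `groups` ends up with three keys ("feat", "fix" and "other"), each holding the matching descriptions in their original order, ready to be printed under per-type headings.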

When we converted changelog.js into a real NPM package, I ported the issue over here in https://github.com/liferay/liferay-npm-tools/issues/227 (September 2019). The other issue I created at the same time was https://github.com/liferay/liferay-npm-tools/issues/226, which was the one about making the changelog generator work well in monorepos that had independently versioned packages (ie. like this one, where we'd never published changelogs before), because the existing use cases all had changelogs only at the repo level.

All of which is to say, it doesn't "mimic" Conventional Commits but leverages the information from it.

If there is an existing tool that can generate high-quality changelogs for us and which doesn't have the problems that the old github_changelog_generator had, please let us know. In general, however, I think such a tool would have to meet a high standard of simplicity and reliability to make it better than liferay-changelog-generator... Precisely because we got so badly burned by github_changelog_generator, we made liferay-changelog-generator dramatically, radically simple: it doesn't try to be clever, it just operates on the principle that if you have good PR titles (and you should anyway!), and you divide your work into conceptually coherent pulls (and you should anyway!), then you'll get a good (or at least a reasonable) changelog out the other side, and it will all Just Work™. At its heart, it is a wrapper around an invocation of git-log.

The maintenance cost of liferay-changelog-generator seems pretty low to me: it is, as I said, a single file with zero dependencies. If you look at the history in this repo and in its former home, you'll see only a small number of relatively trivial changes in its lifetime, which has consisted of a year of active use across multiple projects:

Plus the two PRs that you just saw go in over the weekend (#398, #399), which were really just itches that I wanted to scratch (like many of the changes in this project, to be honest).

Relative to the costs that come with relying on external dependencies, that looks pretty cheap to me.

I thought it might be helpful to have this explanation on the repo as well.

Does the first paragraph in the README cut it?

A crude and unsophisticated script for generating or updating CHANGELOG.md based on the local Git history. It was born out of frustration with other tools that were limited by GitHub API throttling or unable to cope with repos with multiple active branches.

If you have any ideas for how this could be made more obvious (ie. adding a link to this issue, or to some other place where there is more context?) please share.

wincent commented 4 years ago

It seems like the most popular JS "ecosystem" of tooling that operates with the Conventional Commits spec is under the conventional-changelog org. Their main repo is a monorepo containing 21 packages. The package-lock.json is 12,000 lines long. Their standard-version package has 15 top-level dependencies. Installing standard-version in an empty project downloads approximately 283 packages (to be fair, some of which get deduped) and writes over 2,600 files onto the file system.

I don't know whether it is fair to describe it as "over-engineering", but maybe you could describe it as a 10,000 ton hammer to drive in a 2-inch nail. 🤷‍♂

bryceosterhaus commented 4 years ago

Makes sense. I actually think I made an assumption before asking the question that wasn't quite accurate. I saw the "monorepo" support and other recent issues that made it look very similar to lerna version --conventional-commits, which will auto-generate a changelog per package as well. But I realized that even though we have a monorepo here, we don't use lerna, which then makes my assumptions all fail 🤦‍♂️

Does the first paragraph in the README cut it?

My suggestion for the readme would be to clarify why we use liferay-changelog-generator in some repos, but why we would just use lerna or other tools in other repos. Unless your goal is to replace all other tools with this one.

Overall, I'm not against having our own, but I do have hesitancy about continuing to re-invent the wheel by creating our own tools just because we don't want more dependencies. I understand that we can definitely bloat dependencies very quickly, but on the flip side it just gives us more of a footprint to maintain.

wincent commented 4 years ago

Overall, I'm not against having our own, but I do have hesitancy about continuing to re-invent the wheel by creating our own tools just because we don't want more dependencies. I understand that we can definitely bloat dependencies very quickly, but on the flip side it just gives us more of a footprint to maintain.

Yeah, I get that. It's a reasonable opinion, and I've thought about it a lot over the years. However, the more time I spend in the JS ecosystem, the more I think we, as a community, tend to get it all wrong, over and over again. Things like the 2,600-file changelog generator, or Lerna (10,976 files added to an empty project as a result of yarn add lerna — and it is even worse if you look at it in a development context, where a yarn after cloning the Git repo produces a directory with 18,481 files), are utterly dystopian in my view.

Counter-examples: yarn add react react-dom adds 112 files — pretty good "bang for your buck". Or projects like Babel which are incredibly complicated but whose complexity arguably matches the complexity of the functionality that they have to provide.

Sadly, it is much easier to find negative examples than positive ones. 😢 Combine that with the cavalier way in which projects break compatibility all the time (as if bumping the semver major version somehow exculpates you of any responsibility for the pain that it causes your users), and living with these tools in the long term becomes a nightmare.

Unless your goal is to replace all other tools with this one.

Not an explicit goal, but in every project where I've had to create releases, I have started using this tool to produce changelogs. It is lightweight, it just works, and I have every reason to believe that if I start using it in a new project 5 or 10 years from now, it will probably work exactly the same as it does now.

My suggestion for the readme would be to clarify why we use liferay-changelog-generator in some repos, but why we would just use lerna or other tools in other repos.

Well, it's kind of organic. I think we added Lerna to a bunch of repos simply because we thought that:

Monorepos are "the way" to modularize software projects; and:
Lerna is "the way" to manage monorepos.

But since then, we started using Yarn (whose workspaces feature obviates a large part of the utility of Lerna), and we simplified a number of our projects (eg. by removing packages) to the point that Lerna wasn't even useful for release management, so we removed it in a number of places (eg. in this repo, in the liferay-js-themes-toolkit, in liferay-js-toolkit, not sure if there are any others that I am forgetting). AFAIK, the only actively-developed repo where we're still using Lerna is Clay, and if you still find it useful there, then that's obviously valid and your call.

wincent commented 4 years ago

I added #404 (if you can find it... 😂) which adds something to the README.

liferay / liferay-npm-tools

Why do we maintain our own changelog generator? #403