apache / incubator-annotator

Apache Annotator provides annotation enabling code for browsers, servers, and humans.
https://annotator.apache.org/
Apache License 2.0
218 stars 44 forks source link

Drop the entanglement of dependency fetching as part of the build #161

Closed mrcolbyrussell closed 8 months ago

mrcolbyrussell commented 9 months ago

Describe the refactoring action

Pull request #157 closed issue #154 to simplify the build process by using npm instead of yarn for fetching dependencies.

The current state of the project is described in the README "dependencies are automatically installed as part of the build[...] npm run build -- builds the project".

Not requiring a separate step to fetch dependencies isn't a bad affordance for people who want to be able to get annotator built with a single command after cloning, but it is inconvenient wrt other scenarios in the fact that getting all the code is itself not a single step—you fetch part of the code when you clone the repo and then the rest of it (i.e. the dependencies) right before build. This is not unusual for npm-based projects, but it is inconvenient all the same (and unnecessary).

Rather than requiring folks who are merely building annotator fetch dependencies on top, we can dictate that maintainers be responsible for fetching them (at the same time they're that they're affirming, after a package has been changed upstream, that its newest release releases does in fact comprise part of a platform on which annotator can be built).

Expected benefit

Being able to build offline—as long as you've already cloned the repo, then you have what you need to build the project (common expectation up until ~10 years ago, when people started making build scripts hit the network to dynamically fetch other pieces).

Reproducible builds—under the status quo, a developer may successfully build and deploy annotator and then at some later point another developer (or the same person, even) might check out the exact same tag/branch/commit and not be able to successfully complete the build because a dependency has changed upstream. (package-lock.json tries to solve this, but it doesn't, and it's just one big attempt to work around the fact that npm install and similar workflows are designed to keep people from being able to get dependencies at the same time that they're getting the application code.)

See also:

reckart commented 9 months ago

Just my 10ct on this...

I'd opt for sticking with installing dependencies when required and keeping them out of the repo - using the package.lock to avoid undesired changes of dependencies through people attacking upstream.

Artifact repositories like pypy, npmjs or maven central are quite reliable and under normal circumstances, once something has been published there, it does not go away. Use of smaller upstream repositories that may not be as reliable, or direct dependencies on even GitHub repositories as npm allows them should imho be avoided.

Unnecessary dependencies should also be avoided.

I believe most of annotator's dependencies are pursuant to its build tooling. IMHO it makes no sense to keep that in the repo. But for a fully autark build that would be necessary, and then also for different platforms, etc. It is a rabbit hole.

Also, having dependencies in the repo gives incentive to not upgrading them regularly, leading to dependency rot.

Better update dependencies with every release (as applicable and sensible) and release regularly.

If there are no releases, there is no maintenance. If there is no maintenance, a library should not be used downstream anymore. If there are no users, a reproducible build is moot.

mrcolbyrussell commented 9 months ago

dependencies when required and keeping them out of the repo - using the package.lock

What's package.lock

Artifact repositories like pypy, npmjs or maven central

This is not a Python or JVM project, nor is it a broad philosophical discussion. This is about apache/incubator-annotator.

having dependencies in the repo gives incentive to not upgrading them regularly, leading to dependency rot

...

Better update dependencies with every release

That doesn't describe what's happening now. Dependencies can change from build to build; today, Developer A can clone the repo at 10:01 AM and run the build while Developer B clones the repo at 10:13 AM and runs the build and they get different results because the build script is not a really a build script—it's entangled with dynamically fetching missing pieces of the source tree, which change upstream from time to time. (Again, this isn't unusual for NPM-based projects, but it being the norm is not a substitute for an argument on its own merits.)

mrcolbyrussell commented 9 months ago

update dependencies with every release

(This is, in fact, what I'm asking for—and which isn't currently happening.)

reckart commented 9 months ago

By package.lock I mean package-lock.json.

Correct me if I am wrong: If package-lock.json is present and the npm build is using npm ci instead of npm install to install the dependencies, the packages are installed exactly as prescribed by the package-lock.json every time.

mrcolbyrussell commented 9 months ago

I'm not going to say whether that's correct or incorrect because it's off-topic (a red herring).

PR #157 simplified the build process because Yarn was basically boondoggle. It was an unnecessary requirement.

Here's a simple exercise; think about the answers to the following questions: what problem does package-lock.json solve? Why does that problem exist? (As in, "Where does it come from?")

https://www.teamten.com/lawrence/programming/dont-invent-unnecessary-requirements.html

tilgovi commented 9 months ago

The last paragraph of your first comment started with a call for reproducible builds. Whether or not package-lock.json totally provides that is, as you say, maybe off topic, but it certainly aims to make the build more reproducible.

So which is it, a requirement you have or an unnecessary requirement?

Do you have a use case? Are you packaging annotator somewhere where you'd like to do an offline build?

Are you engaging in good faith? I'm not yet able to see why you've opened this issue, except perhaps to rail against package managers in an inappropriate venue.

tilgovi commented 9 months ago

Yarn was basically boondoggle

Ironically, yarn actually has support for checking in all the archives of dependencies and having no install step.

tilgovi commented 8 months ago

Closing because the current behavior is intentional. Please open a separate issue if you have a real packaging need we can assist you with and we'll try our best to help.