linkeddata / rdflib.js

Linked Data API for JavaScript
http://linkeddata.github.io/rdflib.js/doc/
Other
566 stars 146 forks source link

Move solid-auth-client to peerDependencies #361

Closed Vinnl closed 4 years ago

Vinnl commented 5 years ago

With solid-auth-client as a regular dependency, npm will give rdflib its own instance of it when its version no longer aligns with the one the app itself has installed. This can happen e.g. in the following situation:

npm install solid-auth-client;
npm install rdflib;
# <New version of solid-auth-client is released>
npm upgrade solid-auth-client;

Now node_modules/rdflib/node_modules/solid-auth-client contains a copy that is still pinned to the older version, whereas node_modules/solid-auth-client was upgraded.

I'm quite sure that that's what will happen, and that that's not intended to happen, but to be completely sure I'm also pinging @dmitrizagidulin, who I think knows most of solid-auth-client (?), and @RubenVerborgh, who added the warning about multiple instances of solid-auth-client being loaded.

Edit: For those coming across later and not wanting to read the entire back-and-forth between Ruben and me below, the summary is that Ruben is not against merging this, but thinks that it will not completely solve the problem, and thinks that peerDependencies are not meant for this use case.

RubenVerborgh commented 5 years ago

I don't think it fits the bill:

In some cases, you want to express the compatibility of your package with a host tool or library, while not necessarily doing a require of this host.

https://docs.npmjs.com/files/package.json#peerdependencies

But we do a require.

npm dedupe is probably the answer.

Vinnl commented 5 years ago

I don't think asking consumers to run npm dedupe every time they upgrade is a reasonable requirement.

Here's the situation as I see it:

Here's what I believe the situation will be if this PR is merged:

The latter situation seems clearly preferable to me. Is there anything in the behaviour of peerDependencies that would make the latter situation not be true, or is there anything else that would be true in the latter situation that would make it less desirable than the current one?

RubenVerborgh commented 5 years ago
  • @timbl himself often just throws away his entire node_modules and package-lock.json only to reinstall the latest version of everything! My hunch is that this is the reason.

Complex dependency trees with local links are the main reason there. And that problem is not solved with peer dependencies.

  • People using npm who don't have solid-auth-client as a dependency will get an error at bundle time that says solid-auth-client isn't installed.

They shouldn't; it's an optional dependency that falls back to regular fetch.

is there anything else that would be true in the latter situation that would make it less desirable

We're working around an npm bug by compensating for it with another bug in package.json.

Vinnl commented 5 years ago

We're working around an npm bug by compensating for it with another bug in package.json.

Two things:

  1. I'm not sure why this would be an npm bug? From npm's point of view, this is intended behaviour: it's conflict resolution mechanism is to give each package its own version in case of conflicts.
  2. When the choice is between passing a bug on to library consumers, or maintaining a "workaround" (though I'd argue that this is the intended use of peerDependencies) ourselves, it's both far more efficient and responsible maintainership to do the latter.
RubenVerborgh commented 5 years ago

in case of conflicts

So when there are none, and it still does this, it would seem like a bug.

When the choice is between passing a bug on to library consumers, or maintaining a "workaround"

If that were the choice, yes.

However, it still does not prevent us from ending up with multiple versions of the same package. Marking it as peerDependency in rdflib is arbitrary; we'd have to do it everywhere. npm simply has no solution for avoiding multiple versions of the same package.

Vinnl commented 5 years ago

So when there are none, and it still does this, it would seem like a bug.

Right, but it only does this when there are conflicts? i.e. rdflib is still pinned to an older version of solid-auth-client, whereas the dependency in the top-level project is upgraded.

However, it still does not prevent us from ending up with multiple versions of the same package.

From the rdflib point of view, it does, right?

Marking it as peerDependency in rdflib is arbitrary; we'd have to do it everywhere.

Yes, I think any project that plugs in to solid-auth-client's fetch should mark it as a peerDependency rather than a regular one, and I'm not arguing that other projects do not make this change. (Except for the actual top-level applications, of course.)

npm simply has no solution for avoiding multiple versions of the same package.

Well... There's peerDependencies ;)

RubenVerborgh commented 5 years ago

rdflib is still pinned to an older version of solid-auth-client

Then that would seem like the bug to me. If it is pinned, then there is no way to satisfy the dependency in the first place.

From the rdflib point of view, it does, right?

There was never a problem from the rdflib point of view; only when multiple packages are combined.

I think any project that plugs in to solid-auth-client's fetch should mark it as a peerDependency rather than a regular one

peerDependency does not mean "there should be only one version of this package": https://nodejs.org/en/blog/npm/peer-dependencies/. Again, I dislike fixing one bug by introducing another.

npm simply has no solution for avoiding multiple versions of the same package.

Well... There's peerDependencies ;)

It has not been created to prevent multiple versions of the same package. npm dedupe and yarn are.

Vinnl commented 5 years ago

If it is pinned, then there is no way to satisfy the dependency in the first place.

It was satisfied, and then that version was pinned in either package-lock.json or yarn.lock. If it's a bug, both Yarn and npm suffer from it.

There was never a problem from the rdflib point of view; only when multiple packages are combined.

Sure, but in this case rdflib is designed to be combined with the other package. A use case that, in the current setup, eventually always leads to duplicate versions of solid-auth-client and the app developer having to run npm dedupe (and to know they need to run it).

peerDependency does not mean "there should be only one version of this package": https://nodejs.org/en/blog/npm/peer-dependencies/.

No, it means: I want this library to be supplied by the app depending on me, and I want that to be in this version range.

From the post you linked:

Even for plugins that do have such direct dependencies, probably due to the host package supplying utility APIs, specifying the dependency in the plugin's package.json would result in a dependency tree with multiple copies of the host package—not what you want.

That seems to explicitly describe our problem.

Again, I dislike fixing one bug by introducing another.

Could you articulate what undesirable behaviour apps using rdflib would start exhibiting if these changes were published?

It has not been created to prevent multiple versions of the same package. npm dedupe and yarn are.

Yarn doesn't solve that; consumers that use Yarn need a third party tool to do this.

RubenVerborgh commented 5 years ago

It was satisfied, and then that version was pinned in either package-lock.json or yarn.lock.

OK; I thought pinned in package.json.

eventually always leads to duplicate versions of solid-auth-client

That's strong, I never had that (only when using npm ln, but that case is not solved by peer dependencies).

Even for plugins that do have such direct dependencies

That seems to explicitly describe our problem.

No; rdflib is not a plugin to solid-auth-client.

Yarn doesn't solve that; consumers that use Yarn need a third party tool to do this.

Mmm, there goes a good reason to use Yarn :-( I thought it provided such guarantees.

Could you articulate what undesirable behaviour apps using rdflib would start exhibiting if these changes were published?

That's a good point; the only problem seems to be "not having authenticated access". So not that bad. I do like that solid-auth-client is also introduced as a dev dependency with this PR.


Summarizing: I am mainly opposing because we are (ab)using a feature created for another purpose, which does not solve the problem, but at best makes it occurrence more rare (which one think would be good, but it makes things less predictable, and knowledge about npm dedupe remains necessary). In particular, the aforementioned problem, which is caused by npm link, is not solved, because solid-auth-client still remains as a development dependency.

I am not objecting merging of this PR, as long as we know it does not solve all problems, and npm dedupe is still needed as a solution in several cases. Hence, my actually preferred solution would be to just list npm dedupe, which we probably want to do anyway as that case can and will still occur.

timbl commented 5 years ago

I agree in general with @RubenVerborgh 's arguments. Peer dependencies do not do what we want. That is not what they are designed for. I have had lots of problems over the years with multiple copies of solid-auth-client or of solid-ui (which mush each be unique packages).

This is going to be a pain. It changes the rdflib API interface, as it changes the thing everyone who uses rdflib.js has to put in their package.

How about throwing a hard error if these packages are loaded twice? At the moment, solid-auth-client gives a console warning.

npm has many issues around this. What we want is that people should use the latest versions of rdflib and of solid-auth-client, so long as the versions of each are compatible with each other. We should make sure that do not make minor releases of solid-auth-client which will block users upgrading one dependency, and force npm to make two copies.

Another possibility is to allow access to the solid-auth-client API as a feature inside the RDF API as $rdf.authClient or something. That could be the simplest way for real users. Obviously you have to make sure you only have one rdflib too.

But I am against using a peer dependency here. If yarn does not have this problem, happy to try yarn.

npm has many problems. It doesn't seem to have any sense of the build being a repeatable function of the repos. It has files like package.json and package-lock.json which are both inout files and output files, and a node_modules tree state which affect each other in a way in way which does not guarantee anything consistent. Having a unique constraint on a module you would think would be a common need.

Vinnl commented 5 years ago

This is going to be a pain. It changes the rdflib API interface, as it changes the thing everyone who uses rdflib.js has to put in their package.

This is true, though as a library consumer, also having to install solid-auth-client (which I will probably do anyway) is far less of a pain for me than my application suddenly breaking for users after a version upgrade.

How about throwing a hard error if these packages are loaded twice? At the moment, solid-auth-client gives a console warning.

To me, it's vastly preferable to get an error at install-time rather than at runtime - especially since in the latter case, the error is usually propagated to the user, rather than the developer.

What we want is that people should use the latest versions of rdflib and of solid-auth-client, so long as the versions of each are compatible with each other.

The reason npm doesn't do this, is because as an application developer, that's usually not what you want. You want to be in control over what gets upgraded when, rather than blindly trusting the library author to correctly predict whether a change is going to be breaking for you - because that prediction will never be correct 100% of the time. So you mitigate this by upgrading the minimal set of dependencies at a time, to make it easier to pinpoint the cause of eventual breakage. Thus, if npm has given rdflib a version of solid-auth-client that satisfies its demands, and it works, then it won't change that version unless the developer explicitly asks it to do so - even if the application code itself wants a higher version of solid-auth-client.

We should make sure that do not make minor releases of solid-auth-client which will block users upgrading one dependency, and force npm to make two copies.

So any release of solid-auth-client will force npm to make two copies. Let's say I first have:

- solid-auth-client@1.0.0
- rdflib@1.0.0
    |
    - solid-auth-client@1.0.0

In this case, npm will see that the two versions are equal, and dedupe them to the top level. However, if then solid-auth-client v1.0.1 gets released, and I run npm upgrade solid-auth-client on the top level, we get this:

- solid-auth-client@1.0.1
- rdflib@1.0.0
    |
    - solid-auth-client@1.0.0

...because v1.0.0 still satisfies the version range for rdflib. Only if you were to wipe your lock file and node_modules, then reinstall, will npm install v1.0.1 only, because then it doesn't have information of which version worked for rdflib before.

Thus, there is nothing we can do w.r.t. solid-auth-client's release policy to prevent that from happening, other than never releasing a new version again.

Another possibility is to allow access to the solid-auth-client API as a feature inside the RDF API as $rdf.authClient or something.

That would work too, for this problem. The main disadvantage I see is that library consumers are then dependent on rdflib upgrading its dependencies if they want to use new features in solid-auth-client. In other words, every time solid-auth-client releases a new minor version, rdflib would need to do so as well, and if solid-auth-client releases a breaking change, rdflib will also need to release a new major release. In other words, solid-auth-client's API will become part of rdflib's API. This could be considered acceptable.

If yarn does not have this problem, happy to try yarn.

Yarn also "has this problem", in the sense that it would occur just the same if you were using Yarn, and everywhere I said "npm" above, you can read "npm and Yarn". But I'd argue that the problem is not the package manager, but our use of it. This behaviour by npm and Yarn is in response to real-world problems, and not some arbitrary decision. It does indeed make this use-case harder (though that's why peerDependencies were introduced...), but is safer for the common use case of modules not conflicting with themselves. The main problem is just that dependency management is difficult, with many trade-offs involved.

It doesn't seem to have any sense of the build being a repeatable function of the repos.

Interestingly, we're having this problem exactly because npm wants the build to be a repeatable function! But that fails when people throw away lockfiles.

It has files like package.json and package-lock.json which are both inout files and output files

package.json is the input (i.e. its contents are determined by the developer), package-lock.json is the output (i.e. its contents are determined by the dependency tree npm determined for the developer). Or am I misunderstanding what you mean by input and output?

and a node_modules tree state which affect each other in a way in way which does not guarantee anything consistent.

If you run npm install on the same lockfile using the same version of npm, you should always get the same node_modules. Likewise for Yarn. That's the intent, and if that's not what's happening, that should be considered a bug and reported upstream.

Having a unique constraint on a module you would think would be a common need.

The trade-off there is that if you enforce a unique constraint, you're making it impossible to upgrade dependencies independently. So for example, if both package A and B depend on package C, and C should be unique, then if a new version of A gets released that depends on a new version of C, you can only upgrade it once B releases a new version that is also expecting that version of C - and you can only upgrade them together, resulting in potential breakage being harder to spot again.

npm made the trade-off of duplicating modules in case of conflict, preventing that problem, and causing others - as it goes with trade-offs. Bower did the opposite, and that was a massive pain.

Sorry for the wall of text, but I really feel that a misunderstanding of how npm works or what it intends to do is causing a lot of problems for us in general, so I'd like us to be in agreement about how we use our tools.

RubenVerborgh commented 5 years ago

So any release of solid-auth-client will force npm to make two copies.

Sorry for jumping in again, but that's not correct. It all depends on how you upgrade.

It seems that you upgrade by npm upgrade solid-auth-client. That's one way to do it, but I seldom use it as it indeed leads to some of the problems mentioned above.

I upgrade with ncu, or sometimes by just rm -rf node_modules package-lock.json; npm install.

I really feel that a misunderstanding of how npm works

When working on client-side libraries especially, you want to always ensure you have a deduped tree, or you end up with libraries that are too big.

Vinnl commented 5 years ago

It all depends on how you upgrade.

Agreed, but I would argue that when we're distributing our code through npm, we should at least be optimising for the common practices and expectations users of npm have. And that is upgrading using the standard CLI, and sometimes using Yarn, which behaves in the same way - for the reasons I mentioned above, to avoid the problems I mentioned there.

When working on client-side libraries especially, you want to always ensure you have a deduped tree, or you end up with libraries that are too big.

Again, there's a trade-off - in this case, between risking the introduction of bugs and a temporary drop in performance. It's a perfectly valid and also common strategy to upgrade library A first, temporarily have duplicate versions of library C, and then later upgrade library B as well to again end up with a single version of library C. That's how most npm projects are run, so that's who we should optimise for, IMO.

RubenVerborgh commented 5 years ago

we should at least be optimising for the common practices and expectations users of npm have.

Definitely agree; it's just that we don't know what these are. I've pretty much never used npm upgrade in the past decade (only when npm audit suggested it, actually); other devs might do things differently.

jeff-zucker commented 5 years ago

Not sure how relevant this is to the npm issues, but something worth keeping in mind is that solid-auth-client is not the only or last auth module for Solid. Currently solid-auth-cli substitutes for it in a nodejs context. @Happybeing has created a SAFE version of auth for his projects. In the future there may be other auth packages that fit seamlessly where solid-auth-client now sits. I would like to see a situation where rdflib defaults to solid-auth-client in the browser and to solid-auth-cli in nodejs with possibility of the user bringing their own auth instead. This is how it currently works. The bring-your-own path is supported by being able to override rdflib.fetch(). So as we move toward a solution to the dupes and clumsy updating process. I hope we can make sure to keep these other possibilities available.

On Thu, Oct 10, 2019 at 2:17 AM Ruben Verborgh notifications@github.com wrote:

we should at least be optimising for the common practices and expectations users of npm have.

Definitely agree; it's just that we don't know what these are. I've pretty much never used npm upgrade in the past decade (only when npm audit suggested it, actually); other devs might do things differently.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/linkeddata/rdflib.js/pull/361?email_source=notifications&email_token=AKVJCJGFRCWHSUSDFYQWHNLQN3XMFA5CNFSM4I6NGQ5KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEA3Q7PA#issuecomment-540479420, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKVJCJGXJTI7PDCMIQPTJA3QN3XMFANCNFSM4I6NGQ5A .

Vinnl commented 5 years ago

Hi @jeff-zucker, that would certainly still be possible. Although now that you mention it, an explicit dependency injection pattern for solid-auth-client might be a cleaner pattern in general, so that instead of rdflib calling require('solid-auth-client'), it exposes a fetch option in the constructor for Fetcher and UpdateManager, that defaults to the global fetch, removing the hard dependency on solid-auth-client and the need to monkey-patch rdflib if you want to use something else. This should also make it easier to migrate in the future if we do ever decide to rewrite solid-auth-client. @RubenVerborgh what would you think of that?

(The disadvantage is that this would be an API change and hence would be a breaking version for everyone. But fans of modular code might feel this is cleaner :) )

RubenVerborgh commented 5 years ago

@RubenVerborgh what would you think of that?

Preaching to the choir here 🙂 I'm all for injection.

The disadvantage is that this would be an API change

You can still keep solid-auth-client as default in the 1.x range.

rescribet commented 5 years ago

Tagging on this PR which I noticed while having troubles with this in a test environment due to this line which currently creates the hard requirement;

import { fetch as solidAuthCli } from 'solid-auth-cli'
const fetch = typeof window === 'undefined' ? solidAuthCli : solidAuthClient

transpiles to

var _solidAuthCli = require("solid-auth-cli");
var fetch = typeof window === 'undefined' ? _solidAuthCli.fetch : _solidAuthClient.fetch;

So if those packages aren't available it will trip with a TypeError for .fetch on undefined

Patching the constructor to the following should bring it a step closer to DI;

import { fetch as solidAuthCli } from 'solid-auth-cli'
import { fetch as solidAuthClient } from 'solid-auth-client'

const fetch = typeof window === 'undefined' ? solidAuthCli : solidAuthClient

class Fetcher {
constructor() {
  this._fetch = options.fetch || fetch
}

to

import solidAuthCli from 'solid-auth-cli'
import solidAuthClient from 'solid-auth-client'

class Fetcher {
constructor() {
  this._fetch = options.fetch || this.defaultFetch()
}

defaultFetch() {
  const solidFetch = typeof window === 'undefined' ? solidAuthCli : solidAuthClient;
  if (typeof solidFetch !== 'undefined') {
    return solidFetch.fetch // Notice the `.fetch`, which will solve the error
  }

  // Default to the web standard if none other are available :) 
  return typeof window === 'undefined' ? window.fetch : undefined
}

so that instead of rdflib calling require('solid-auth-client'), it exposes a fetch option in the constructor for Fetcher

Probably misunderstanding something as this is already implemented(?)

megoth commented 5 years ago

Just to follow this one up - do I understand it correctly that we want to make the auth client into an injectable, with solid-auth-client injected by default if nothing else is injected?

If so, should we perhaps create a new issue to track this development, and close this PR?

Vinnl commented 5 years ago

@megoth With solid-auth-client injected by default if nothing else is injected and it's available. And if Thom's last line is correct (I haven't checked), then it basically comes down to this PR, I think. But I'll check definitively what's needed if Tim approves of the plan.

michielbdejong commented 4 years ago

I'll look into this.

michielbdejong commented 4 years ago

Trying this out, including @fletcher91's fix, in my update-deps branch.

return typeof window === 'undefined' ? window.fetch : undefined

Looks like a typo, I think that's a should read:

return typeof window === 'undefined' ? undefined : window.fetch
michielbdejong commented 4 years ago

Merged as part of #425.