Unclear need for the encapsulation

tc39 / proposal-class-fields

Orthogonally-informed combination of public and private fields proposals

https://arai-a.github.io/ecma262-compare/?pr=1668

1.72k stars 113 forks source link

Unclear need for the encapsulation #286

Closed kosich closed 4 years ago

kosich commented 4 years ago

"Why is encapsulation a goal of this proposal?" section of the FAQ contains three points describing why the language needs the private fields.

It is important to have the reasoning well considered and well documented. So in this issue I want to discuss the details and will try to argue with some of the points made in the source.

There's probably a lot more pros and cons discussed in the comments across the issues, though it's hard to find them. I'll take the abovementioned document as the official reasoning behind the feature.

Here are the points:

Library authors have found that their users will start to depend on any exposed part of their interface, even undocumented parts. They do not generally consider themselves free to break their user's pages and applications just because those users were depending upon some part of the library's interface which the library author did not intend them to depend upon. As a consequence, they would like to have hard private state to be able to hide implementation details in a more complete way.

Can we get any insight on:
Popularity of the libs
How they implement the "hiding" now
What is the percent of devs using lib internals

This is needed to understand and document the severity of the problem being solved:

was that really blocking authors from internal upgrades?
should the language adapt to those libraries needs?

Especially, considering solutions in the next two points:

While it's already possible to model true encapsulation using either per-instance closures or WeakMaps (see below), both approaches are unergonomic to use with classes and have memory usage semantics which may be surprising. Furthermore, per-instance closures don't allow instances of the same class to access each other's private fields (see above), and with WeakMaps there's a non-obvious risk of exposing private data. WeakMaps are also likely to be slower than private fields could be.

"WeakMaps are also likely to be slower than private fields could be" — are there updates on real performance?

Next to this point, there's an example of "non-obvious risk of exposing private data", that intentionally sabotages encapsulation:

const Person = (function(){
  const privates = new WeakMap;
  let ids = 0;
  return class Person {
    constructor(name, makeGreeting) {
      this.name = name;
      privates.set(this, {id: ids++, makeGreeting});
    }
    equals(otherPerson) {
      return privates.get(this).id === privates.get(otherPerson).id;
    }
    greet(otherPerson) {
      return privates.get(this).makeGreeting(otherPerson.name);
    }
  }
})();
let alice = new Person('Alice', name => `Hello, ${name}!`);
let bob = new Person('Bob', name => `Hi, ${name}.`);
alice.equals(bob); // false
alice.greet(bob); // === 'Hello, Bob!'

At first glance this appears fine, but:

let mallory = new Person('Mallory', function(name) {this.id = 0; return `o/ ${name}`;});
mallory.greet(bob); // === 'o/ Bob'
mallory.equals(alice); // true. Oops!

I think this example is not valid to prove the point, since:

Here id is intentionally exposed via constructor accepting makeGreeting function which is then needlessly added to the privates object.

We can fix that example by assigning the function to the instance, instead of the stored object. Or by binding the function to the instance, instead of the stored object. Or by using a standalone WeakMap for each property. Or by using existing library for encapsulation via WeakMaps by @zenparsing. Or if the library class already implements some kind of "IDisposable" — use it to free any storage with our private data.

OR by using Symbols, which leads us to:

"Hidden" but not encapsulated properties are likewise already possible through the use of Symbols as property names (see below).

Heres a one way to hide props via Symbols:

// A.js
const f = Symbol();

class C {
  [f] = 42;

  answer() {
    return this[f];
  }
}

export { C }

// index.js
const c = new C();
c.answer()

While the f symbol can be accessed via Object.getOwnPropertySymbols, is this really not enough to fix the # 1 problem? How many library users would go for getOwnPropertySymbols without knowing what they are risking?

With all that said, it feels like there's some need for "hard private" that is not transparent.

I hope that clarification of the above points and detailed reasoning (reflected in the documentation) will help the community (myself included) understand the necessity of the feature and easier accept the changes. Thanks

ljharb commented 4 years ago

The percent of devs using internals is > 0, which is more than sufficient for me.

For example, npm once broke node by removing an underscore-prefixed property. Encapsulation does not work by convention - someone not accessing it “yet” just means you haven’t realized the problem yet.

rdking commented 4 years ago

@ljharb I've asked this before but never received a clear answer:

How is it the fault of the language when a developer does something ill-advised?

NPM breaking Node.js by removing some internals that Node.js developers were ill-advisedly using is par for the course. Fostering an environment where devs can blame providers for breaking them by taking away something that's not intended to be part of the public API and denoted as such by convention is a situation that is almost guaranteed to hold back the entire language. While a heavy problem indeed, this should not be the only (or even primary) reason for wanting to provide for encapsulation.

@kosich Having said all of that to @ljharb, the need for hard-private does indeed go beyond the obvious. At the same time, it's all related. Enabling hard-private encapsulation has these benefits:

API clarification: If the author doesn't want you to touch it, you can't (unless you're willing to customize).
Guarantees: The author is allowed to rely on the guarantee that unless modified, certain elements of the code can only be manipulated in specific ways as dictated by that developer. This has the potential of leading to cleaner, easier to understand, faster running code due to the reduction of unnecessary cross-checks.
Library choice: Due to the encapsulation, devs needing the library's features, but finding an incompatibility, will be more likely to either:
1. Find a more compatible library
2. Fork the library and make it more compatible
3. Make suggestions/submit patches to the author to enable more features.

All of this is a good for the language.

loganfsmyth commented 4 years ago

How is it the fault of the language when a developer does something ill-advised?

I think framing it as a question of who is at fault ignores the part that is most important. It ends up coming down the the scope of the damage.

Fostering an environment where devs can blame providers for breaking them by taking away something that's not intended to be part of the public API and denoted as such by convention is a situation that is almost guaranteed to hold back the entire language.

If a broadly-used project happens to use API that is meant to be private, they are certainly at fault, but they aren't the ones directly affected if that API were removed, and the users of the broadly-used project are.

As the developer of the private API, your choices end up as:

Screw the users of this library by publishing the private API change anyway
Be forced to treat this private API as a public API until you release a new major and move or hide it somehow

As far as I'm concerned, the first option is essentially never an option, since I wouldn't want to be on the receiving side of that breakage and wouldn't want to inflict it on others. It doesn't matter that your API was private or undocumented, it matters whether you want to screw people over.

This is why I 100% support Jordan's "The percent of devs using internals is > 0, which is more than sufficient." point. Either something is entirely inaccessible, or else every developer of every piece of code published for broader use need to accept that there is some possibility that any change to any accessible value could break someone. The issue is that you have no way of knowing how many people that could be until it happens and every one of those people come to yell at you for screwing up their entire day. Removing this as a category of potential dangers sounds absolutely fantastic.

bigopon commented 4 years ago

As far as I'm concerned, the first option is essentially never an option...

Breakage like that is fixable/ can be workaround or simply avoided, but there's nothing you can do once something like this proposal is in. That argument sounds weird to me.

nicolo-ribaudo commented 4 years ago

but there's nothing you can do once something like this proposal is in.

This proposal aims to make your code encapsulated enough that it doesn't matter if there is nothing you can do to workaround the breakage, because there can't be any breaking change caused by internal APIs in the first place.

bigopon commented 4 years ago

The response above is mainly for this issue #106 , which reads like this: with this proposal, just for the sake of encapsulation, harmony is gone. Any single private field can break unlimited amount of proxies usage, this is what you can do nothing about

nicolo-ribaudo commented 4 years ago

Harmony was already gone years ago if you use proxies without properly unwrapping your obejcts:

class Person {
  constructor(parent) {
    this.children = [];
    if (parent) parent.children.push(this);
  }

  isChildOf(parent) {
    return parent.children.includes(this);
  }
}

let mum = new Person();
let me = new Person(mum);

assert(me.isChildOf(mum)); // ok

let proxiedMe = new Proxy(me, {});
assert(proxiedMe.isChildOf(mum)); // broken?

Anyway, please let's keep this different discussion in #106.

rdking commented 4 years ago

@loganfsmyth

I think framing it as a question of who is at fault ignores the part that is most important. It ends up coming down the the scope of the damage.

I framed it this way for a particular purpose. You're about to see why.

As the developer of the private API, your choices end up as:

Screw the users of this library by publishing the private API change anyway

Be forced to treat this private API as a public API until you release a new major and move or hide it somehow As far as I'm concerned, the first option is essentially never an option, since I wouldn't want to be on the receiving side of that breakage and wouldn't want to inflict it on others. It doesn't matter that your API was private or undocumented, it matters whether you want to screw people over.

As far as I'm concerned, option 2 is just as bad as it prevents you from making forward progress in your own code because you're essentially being broken by a third party. That's the mistake in the reasoning where I'm concerned. Other devs hacking into private logic for their "popular" project are basically breaking your ability to advance your code. Then when you push the damage back on them, you catch the fall-out.

In your analysis, you missed some choices. As soon as you realize someone did something foolish related to your code that you feel is too costly to damage, you can:

Work with them to correct their bad code before releasing your updates.
Use a SEMVER Major version update so that the developers writing irresponsible code will recognize that there may be changes that will break them.

No matter how bad the options look, not creating better code due to how your code is being used by 3rd parties is far worse... but maybe that's just me.

rdking commented 4 years ago

@nicolo-ribaudo

Harmony was already gone years ago if you use proxies without properly unwrapping your objects:

That harmony wasn't broken as there are non-breaking, non-unwrapping ways of doing the same thing.

class Person {
  constructor(parent) {
    this.id = Symbol();
    this.children = [];
    if (parent) parent.children.push(this.id);
  }

  isChildOf(parent) {
    return parent.children.includes(this.id);
  }
}

let mum = new Person();
let me = new Person(mum);

assert(me.isChildOf(mum)); // ok

let proxiedMe = new Proxy(me, {});
assert(proxiedMe.isChildOf(mum)); // ok!

Choosing to use a non-viable approach is not proof that all approaches are non-viable.

But I do agree that this argument needs to be kept in #106. I only placed this rebuttal because of the highly flawed argument presented.

ljharb commented 4 years ago

"Work with them" presumes that they're willing and able to report it, and also to do the work. Just because they aren't complaining - or aren't willing to do extra work - doesn't make it any less true that you broke them in a non-major version.

Obviously you can change it in a semver-major - that's the point, it's a breaking change because the thing you changed wasn't actually private. Encapsulation (closures, private fields, etc) allows these sorts of changes to be made without a semver-major change.

rdking commented 4 years ago

@ljharb

"Work with them" presumes that they're willing and able to report it, and also to do the work. Just because they aren't complaining - or aren't willing to do extra work - doesn't make it any less true that you broke them in a non-major version.

Incorrect. They broke themselves by writing ill-advised code. What you're not getting here is that there is an incorrect mindset you're fostering by the way you choose to describe the situation. Would Node.js have been broken by the NPM update if they had not made such a poor choice in how to construct their code? If your answer is "No" then congratulations, you've chosen to believe that the Node development team broke their own code. I'm just asking that we all view things in this light and act accordingly. To think otherwise, you're going to have to answer to why the NPM development team is responsible to Node's user base for the contents of the Node development software.

Obviously you can change it in a semver-major - that's the point, it's a breaking change because the thing you changed wasn't actually private. Encapsulation (closures, private fields, etc) allows these sorts of changes to be made without a semver-major change.

I'm aware of this, and that's why I'm all for a properly implemented encapsulation strategy (something I think you're well aware that I believe this proposal isn't, but that's a different argument). My point is simply that even though a fairly standardized approach was used to specify that the affected code was not meant to be accessed, Node developers chose to break the convention. They chose to risk having their code broken. They chose to risk having their user base experience hardship due to a possible alteration by NPM developers. That's not NPM's responsibility. Nor should it be thought of as such.

That proper encapsulation can solve this issue is great. However, that's simply not the primary purpose of encapsulation. If that's the only reason for adding encapsulation, then it serves little purpose in the language. Fortunately, as I stated before, there are other, much better reasons for adding some proper form of encapsulation.

kosich commented 4 years ago

Seems like the discussion was mainly held around point 1 of the original reasoning.

Library authors have found that their users will start to depend on any exposed part of their interface, even undocumented parts. ...

Today we have package-lock and npm cli to ensure that things don't break automatically while a version update, even minor. These are the tools to proof against unexpected failures.

So, I can't strongly agree that just >0 is enough.

Also, as mentioned in other threads, apart documentation, we have typing systems to provide additional user-guidance to help them avoid fields meant to be private.

With abovesaid, and my arguments against points 2 and 3 of the original reasons (that we have fine workarounds for the problem), I think:
the feature reasoning can and should be improved, IMO

the need for hard-private does indeed go beyond the obvious

@rdking , my whole point is to clarify and publish the reasoning behind this. I've already commented on API clarification and guarantees.

Regarding:

Library choice: Due to the encapsulation, devs needing the library's features, but finding an incompatibility, will be more likely to either: [ i, ii, iii ]

I'd hope that was always the case, but it seems like wishful thinking. I could also imagine that they will switch to, say, C# compiled to webassembly and screw JS all at once.

I know the authors must've done tremendous work on the feature.
I'm asking for stronger reasoning, that would be a better basis for community.

🎄

ljharb commented 4 years ago

@rdking It's not your place to tell them what code is "good" or "bad", well or ill advised, responsible or irresponsible. If it was possible to use your software in a certain way, then you permitted it, whether intentionally or not. That's what it means to publish software (or create code that others reuse) - everything they can do is something you let them do.

By publishing code that can be misused, you chose to risk breaking the ecosystem - it's not the fault of the consumers.

rdking commented 4 years ago

@ljharb

It's not your place to tell them what code is "good" or "bad", well or ill advised, responsible or irresponsible.

Actually, as a senior developer with multiple decades of experience, as a systems architect, as a mentor, as a project manager, as a lead developer (all roles I've held over the years), it's precisely my place. As a language developer (your role as a TC39 member), it's your place as well. It is exactly the job of every experienced developer to influence what is considered to be good or bad practice. This is how we pass our skills along.

That being said, that's not what I've done. I didn't tell Node developers that their actions are ill-advised. Instead, in here, as a matter of professional discourse, I've claimed that using portions of a system that are not documented for use, or is otherwise noted by convention as "internal", is an ill-advised action that can jeopardize the future functionality of a product. Unless the goal is to risk the future functionality of the product, my use of "ill-advised" is both appropriate and accurate. Put another way, I am simply stating my opinion on their action of disregarding the internal nature of the code they decided to rely on.

If it was possible to use your software in a certain way, then you permitted it, whether intentionally or not.

While I won't deny the truth of this statement, I will also advise you not to ignore the caveat that comes along with it, that being something like: "However, do not be surprised if your use of undocumented or internal code becomes invalid in a future release."

This is the same situation faced by application developers who use undocumented OS interfaces to take advantage of hidden capabilities in an Operating System. A new release usually breaks them, and the developers find themselves either needing to find a new way to accomplish the same thing or simply give up. It's not on the OS developer to provide support for such interfaces just because a particular developer decided to use them against the advise of the OS developer.

Either way. Believe what you will. After all, it's not my place to dictate what is good or bad practice, right? Good thing I don't need to as decades of development by a planet full of developers tends to settle those issues for me.

littledan commented 4 years ago

Many people in this thread have explained the importance of encapsulation, so I think this question has been answered thoroughly. PRs are welcome to explain further in supporting documents, but IMO the description is already clear.

kosich commented 4 years ago

@littledan , in my initial and latest comment I've tried to explain that:

a) we now have a rich toolset to protect projects from unexpected fails (typing systems, npm ci / package-lock, documenting comments)

b) js already has ways to implement "light" encapsulation (symbols, closure weakmaps)

And, clearly, people who don't have enough evidence for private fields, won't create PRs. People who root for it, don't seem to see a need for such PRs. So, it's a dead end.

I understand that authors mean better for the language and the community. I'm asking for better Public Relations here.

littledan commented 4 years ago

@kosich I definitely agree that "> 0" is a not sufficient reason to add anything to the language. We made this addition only after hearing that compatible, non-semver-major changes to JS ecosystem libraries and frameworks is a major problem in the ecosystem, despite the existence of those available mitigations you're talking about.

There's definitely more we could to to explain this feature better to the community. Many of us are active in explanations in the proposal README, GitHub threads, giving talks at conferences, blog posts, podcasts, etc., but of course there's always room for improvement. I'd welcome your help here.

littledan commented 4 years ago

Overall, I'm not really happy with the back-and-forth harsh value judgements in the above thread. I agree with @kosich that this isn't a great way of working with the community. I see ecosystem breakage, compatibility issues, and general difficulty evolving software as things that just happen, and that we could choose to constructively provide incremental improvements to mitigate.

Nobody's in a position to force these things on JavaScript programmers. We've instead made an effort to make a feature that will be attractive to JS programmers to have available when they decide they want it.

rdking commented 4 years ago

@littledan

Overall, I'm not really happy with the back-and-forth harsh value judgements in the above thread.

I understand your perspective, but as has been pointed out to me with regard to this proposal and the decisions behind it, sometimes value judgements are all that's left to base a decision on. In those cases, it is only prudent to be clear about what is and what is not valued. The simple fact that there are conflicting personal opinions within TC39:

The percent of devs using internals is > 0, which is more than sufficient for me.
I definitely agree that "> 0" is a not sufficient reason to add anything to the language.

means that when questions like the one raised by the OP appear, an unclear picture gets painted. The primary means of clarification? Value judgements.

We've instead made an effort to make a feature that will be attractive to JS programmers to have available when they decide they want it.

Thank you for the effort. However, issues like the one in this thread come about as a result of lacking documentation, especially with regard to the constraints that led to the decisions behind this proposal. You can't rely on the FAQ for this as it doesn't even come close to explaining why some of the more peculiar decisions were made.

littledan commented 4 years ago

I don't think TC39 made a decision based on this "> 0" being sufficient idea. It's a big group, and like the JS community as a whole, there's going to be diversity of opinion; that doesn't mean we can't come to shared conclusions. We have made the value judgement, as a committee, that this feature is worth it: I see Stage 2 as meaning this.

For documentation, we're continuing to make documentation available in more places (e.g., conference talks, blog posts and MDN) and improving this over time; the FAQ will contain frequently asked questions, and we've been answering further questions in issues.

rdking commented 4 years ago

@littledan

I don't think TC39 made a decision based on this "> 0" being sufficient idea.

I don't either. That's just 1 member's opinion on why it is necessary. Imagining the whole board (at least the part of it that cared to have a say) thinking that way is a bit much.

It's a big group...; that doesn't mean we can't come to shared conclusions.

I get that, but the way I see it, there are several subgroups in TC39 with regard to these "shared conclusions.

Those who didn't care.
Those who didn't like this proposal (can be subdivided). a. Those who were willing to give open dissent. b. Those who weren't willing to risk the political capital.
Those who were ok with this proposal (can be subdivided) a. Those who didn't care to fully understand the issues with the implementation. b. Those who thought the issues were ignorable because those issues didn't affect their desired use cases. c. Those who wanted the feature, but thought the issues to be worth it just to get the feature.

This is the norm for this kind of decision. I recognize that there are other subgroups, but they usually don't play as large a role in these situations. What I'm interested in seeing documented is the consensus the justification of group 3.c. I would like to understand why they believe no better solution can be devised. I do believe that giving this kind of information, especially over such a controversial topic will do much to calm the nerves of those who think this proposal will cause more harm than good (like myself) as well as provide additional documentation and justification to those who simply want more information, like the OP.