RFC-8996 Clarify the scope of public APIs

robbieaverill commented 5 years ago

Affected Version

All

Description

Lately there have been some discussions in pull requests around whether we prefer private or protected methods (when not public).

Key questions:

What constitutes our public API (and our commitment to semantic versioning)?
As above, are protected methods part of our API?
- Note that most of our module readme files already state that only public methods are part of our public API, while we aim to keep backward compatibility in protected methods as well
Do we prefer protected or private methods?

Acceptance criteria:

The regular product developers and the SilverStripe core committers have a common understanding of the above points
The decisions are documented in our contributing guide

cc @silverstripe/core-team @silverstripe/open-sourcerers @silverstripe/creative-commoners

Outcomes

Protected scope IS public API

Protected methods and properties MUST NOT receive incompatible changes in minor and patch releases.

Pros

this is the most natural option for developers
eliminates some anti-patterns, makes APIs more predictable, projects easier to upgrade
doesn't introduce special rules over what PHP OOP capabilities provide
doesn't increase framework learning curve
would make patch and minor releases more stable

Cons

This is different from the current undocumented convention
Can make it harder to "fix" issues in minor and patch releases
Some issues potentially may become unfixable in patch/minor releases

PRs

dnsl48 commented 5 years ago

most of our module readme files already state that only public methods are part of our public API

I wouldn't say about most of them. I only saw it once so far in asset-admin.

we aim to keep backward compatibility in protected methods as well

To me this implies they are public API too?

robbieaverill commented 5 years ago

I wouldn't say about most of them. I only saw it once so far in asset-admin.

At a quick glance I've found this segment in admin, asset-admin, campaign-admin, versioned, versioned-admin and userforms, but I haven't gone through the full CWP suite - I think it's enough to solidify the statement that this is the status quo at the moment though:

All methods, with public visibility, are part of the public API. All other methods are not part of the public API. Where possible, we'll try to keep protected methods backwards-compatible in minor/patch versions, but if you're overriding methods then please test your work before upgrading.

To me this implies they are public API too?

I don't think so, to me it implies that we make an effort to keep them compatible, but you customise them at your own risk.

dnsl48 commented 5 years ago

TL;DR: I believe protected methods are part of the public API, since that is natural PHP behaviour supported by the language. Amending the rules of PHP is counterintuitive for most developers and will lead to bad consequences.

What constitutes our public API (and our commitment to semantic versioning)?

I think if we declare protected methods NOT to be a part of our API, that will have very evil repercussions. This goes completely against the OOP principles which are natural for PHP devs. I believe that most framework users will not anticipate such a rule and will try to use protected things as usual. In that case upgradeability of projects will suffer and things will break even in patch releases (which can undermine our semantic versioning as well).

Do we prefer protected or private methods?

I believe this should be decided on a case by case basis. However, trying to prefer one over the other does not make much sense to me as long as we want to be able to practice OOP and have encapsulation and polymorphism in our tooling. If we always make things public (or protected), we'll be more often breaking O, L and I in SOLID. I'm not against duck typing, but PHP devs will not expect this to be the case.

dhensby commented 5 years ago

Related #3888

My preference is that we should only be using private for properties that have getter/setters and protected for methods unless there's a well thought-out reason for using private (which there definitely can be)

To answer the questions:

What constitutes our public API (and our commitment to semantic versioning)?

At the moment we have a rule of thumb that public methods form our SemVer supported API. This is really just so we can bridge the gap between our documentation and our commitment to SemVer (which is explicitly about "public APIs"). Really, it's a bit more nuanced than that and every risky PR will have a discussion about how appropriate it is with regards to the version it goes in. There's no silver bullet answer to this except for improving our documented API.

are protected methods part of our API?

Probably... As above

Do we prefer protected or private methods?

Private methods should only really be used for internals that we definitely won't want other developers to be overloading or modifying. I think this is quite rare but can be the case when there's a very specific internal use. Most likely it'll be when breaking a larger method (public api) into logically separated pieces of code.

I think a good example is a CSV parser: we provide a public API for parsing CSVs, and if we like we can change the backend parser lib we use and (as long as our interface doesn't change) this is not a breaking change. Our CSV parser may have some private methods that we use to interface specifically with our CSV library of choice, and we intentionally don't wish to expose those as an API surface because they're not consumable in a meaningful way outside of our parser lib.
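A minimal sketch of the CSV parser idea above. The names `CsvParser`, `parse()` and `splitLine()` are hypothetical and not taken from any SilverStripe module; the only real API used is PHP's built-in `str_getcsv()`. Splitting on line breaks is a simplification that ignores quoted multi-line fields.

```php
<?php

// Hypothetical class for illustration; not a real SilverStripe API.
class CsvParser
{
    /**
     * Public API: parse CSV text into a list of rows.
     */
    public function parse(string $csv): array
    {
        $rows = [];
        foreach (preg_split('/\R/', trim($csv)) as $line) {
            $rows[] = $this->splitLine($line);
        }
        return $rows;
    }

    /**
     * Implementation detail: the only place that talks to the CSV backend.
     * Because it is private, swapping the backend cannot break subclasses.
     */
    private function splitLine(string $line): array
    {
        return str_getcsv($line, ',', '"', '\\');
    }
}

$rows = (new CsvParser())->parse("a,b\n1,2");
```

If the backend were swapped, only `splitLine()` would change, and callers of `parse()` would not notice.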

maxime-rainville commented 5 years ago

I'm with @dnsl48.

If the point of protected methods is to make it easy to extend core classes by overriding/calling those methods, then it doesn't make sense to not consider those methods part of our API ... we'll cause problems for module/project maintainers who make use of this flexibility we gave them.

A recurring problem I see in our code, is we'll have big juicy public methods that are 50-60 lines long. My instinct when I see those is to split them up into smaller private methods. The point of this is to make the code more readable, not to favour code reuse.

Occasionally, I get into arguments with folks who would want me to make those methods protected. From my perspective, all of this is internal logic to the class. You can't make sense of an individual method without understanding how it relates to the others. If you're going to override the original public method then you really ought to override the entire thing.

If you're overriding a protected method and you need to refer back to the original class' code to figure out what you're doing, then that method should probably be private.

micmania1 commented 5 years ago

This has been talked about before in some regard: https://github.com/silverstripe/silverstripe-framework/pull/3888

I'd agree with @maxime-rainville. protected is definitely a public facing API. If it's not, then put final on it.
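A rough sketch of what "put final on it" could look like, using hypothetical class names. Note that `final` only prevents overriding; subclasses can still call the method, so its signature is still something they may depend on.

```php
<?php

// Hypothetical classes for illustration only.
class ReportBuilder
{
    public function build(): string
    {
        return $this->header() . "\n" . $this->body();
    }

    // Callable from subclasses, but `final` prevents overriding.
    final protected function header(): string
    {
        return 'Report';
    }

    // Intended extension point: subclasses may override this.
    protected function body(): string
    {
        return 'default body';
    }
}

class SalesReportBuilder extends ReportBuilder
{
    protected function body(): string
    {
        return 'sales for ' . $this->header();
    }
}

$output = (new SalesReportBuilder())->build();
```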

As an example, Laravel explicitly uses protected methods as part of its public API: https://laravel.com/docs/5.8/authentication

Symfony 2 API changes (stolen from linked PR): http://fabien.potencier.org/pragmatism-over-theory-protected-vs-private.html

I don't think it's sensible to make a blanket decision. It should be decided on a case-by-case basis (ideally with some consistency). If the author introduces a protected method with the intention of it being used by projects/modules and can demonstrate a reasonable use case, then that for me is a good reason for it to remain protected. Same goes for public. Otherwise, private is safer. It's much safer to change an API from private to protected (exposing functionality) than the other way around (removing functionality).

sminnee commented 5 years ago

The question here is "is it better to allow extensibility that might break in a minor release, or to not allow that extensibility at all?"

If you create a protected method then you're introducing an API where:

that protected method can be overridden in a subclass to change behaviour
that protected method can be called in a subclass to provide some intermediary behaviour
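To make those two uses concrete, here is a small sketch with hypothetical classes (`Notifier` and `FormalNotifier` are invented for this example). Both the override and the call depend on the parent's protected signatures staying the same.

```php
<?php

// Hypothetical classes for illustration only.
class Notifier
{
    public function notify(string $name): string
    {
        return $this->greeting() . ', ' . $this->formatName($name);
    }

    protected function greeting(): string
    {
        return 'Hello';
    }

    protected function formatName(string $name): string
    {
        return ucfirst(trim($name));
    }
}

class FormalNotifier extends Notifier
{
    // Use 1: override a protected method to change behaviour.
    protected function greeting(): string
    {
        return 'Dear';
    }

    // Use 2: call a protected method to build new behaviour.
    public function signOff(string $name): string
    {
        return 'Regards, ' . $this->formatName($name);
    }
}

$notifier = new FormalNotifier();
$message = $notifier->notify(' alice ');
$signOff = $notifier->signOff('bob');
```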

So if we're serious about this, then these cases would be tested, documented, and committed to being preserved in minor releases. That's an investment, and it's not necessarily going to be the best place to invest our finite time.

If we're honest about the status quo, we just say "protected", but we treat it as "private" and accept that if people do weird things monkey-patching via this, that maybe their minor release upgrades won't be so smooth. We acknowledge this to some extent in our comments about "the public API are the public methods" for semver purposes.

It's messy, but it's who we are. Being more honest about that would be to make a protected method but mark it as @internal. Although, in a new project I would probably recommend using private and thinking more carefully about your API design, the status quo of SilverStripe is a long way away from that, and I'm not convinced that it's the most important thing about our architecture to fix.

In summary I would recommend:

True protected API methods should be documented, tested, and we commit to not breaking them, which we only do if the API is valuable enough to justify that effort.
For other methods I could live with either of the following:
- private methods
- protected methods marked @internal. The latter would get my vote for methods that can be logically monkey-patched, as it's more in keeping with the project's status quo, and changing them would be invasive and not hugely valuable
Existing protected methods that aren't covered by tests should be bulk-marked as @internal as anything else is a lie. :P
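As an illustration of the recommendation above, a class could carry one supported protected extension point and one protected method tagged `@internal`. This is a sketch with a hypothetical `FeedImporter` class. PHP itself does not enforce `@internal`; it is a docblock convention that tooling may or may not surface.

```php
<?php

// Hypothetical class for illustration only.
class FeedImporter
{
    public function import(array $rows): array
    {
        $mapped = array_map(function (array $row) {
            return $this->mapRow($row);
        }, $rows);
        return $this->dedupe($mapped);
    }

    /**
     * Supported extension point: documented, tested, kept stable.
     */
    protected function mapRow(array $row): array
    {
        return array_change_key_case($row, CASE_LOWER);
    }

    /**
     * @internal Overridable, but may change in a minor release.
     */
    protected function dedupe(array $rows): array
    {
        return array_values(array_unique($rows, SORT_REGULAR));
    }
}

$imported = (new FeedImporter())->import([['ID' => 1], ['id' => 1], ['ID' => 2]]);
```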

sminnee commented 5 years ago

My preference is that we should [use] protected for methods unless there's a well thought-out reason for using private

I think the logic is reversed on this: there needs to be more thought when deciding to use protected, as you're exposing it to extensibility.

dnsl48 commented 5 years ago

methods marked @internal

I would suggest using these with caution. I'd say this is an anti-pattern, as it may create "an alternative API" for core developers, so they won't care as much about the flexibility of the public API if their needs are covered by @internal. Ideally, API should be exactly same for core devs, module devs and users. In that case it'll be much easier to maintain and improve stuff without breaking other kids' toys.

Existing protected methods that aren't covered by tests should be bulk-marked as @internal as anything else is a lie. :P

This feels like it may introduce breaking changes to users. I reckon we should hold off with this until the next major release.

sminnee commented 5 years ago

Ideally, API should be exactly same for core devs, module devs and users. In that case it'll be much easier to maintain and improve stuff without breaking other kids' toys.

If we were starting a new project, I agree. But we've got 12 years of history built on the assumption that protected methods are providing some monkeypatching capability but aren't really a Public API in a semver sense.

This feels like it may introduce breaking changes to users. I reckon we should hold off with this until the next major release.

My view would be that anything we plan to privatise in SS5 (which would be tidier) should be marked as internal in SS4, which wouldn't break anything but would clarify our intent.

Frankly, any API that isn't covered by tests should be marked as @internal because it's going to be risky to rely on it not breaking between minor releases.

sminnee commented 5 years ago

The core point I'm trying to make is that marking things as @internal is an exercise in increased honesty, rather than a change to the API. And I am deeply opposed to the idea that we should get things into an ideal state before we add clarity.

sminnee commented 5 years ago

Revised views would be:

Mark everything not covered by tests and some docs (protected, but also public) as @internal.
Allow for the incremental adoption of PRs that add tests/docs and delete the applicable @internal tags. Whenever someone hates that step 1 marked their favourite API as internal, they can submit such a PR.

Still not decided on whether protected-internal or private is the better default for new APIs.

dnsl48 commented 5 years ago

Here's my understanding:

public = public / stable API
protected = public / stable API
@internal public = private / unstable API
@internal protected = private / unstable API
private = not API, but implementation details

public stable API - is subject to semantic versioning. Cannot be changed or broken until a new major release.

private unstable API - could be changed in minor versions without warning. Should only be used by core developers. These are either hacks or emerging APIs which may become public in the future.

private - internal parts of components / classes. Encapsulated data or implementation bits that don't make sense without the context of the owner, or that are precisely crafted for very narrow use cases and used by other public or protected methods (e.g. for DRY or separation of concerns, code readability).
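The mapping above could be expressed as a small classifier over PHP's reflection API. This is only a sketch of one interpretation, not an agreed policy: the `Sample` class and `classify()` function are hypothetical, and the `@internal` check is a plain substring match on the docblock.

```php
<?php

// Hypothetical sample class for illustration only.
class Sample
{
    public function stable(): void {}

    protected function alsoStable(): void {}

    /** @internal */
    protected function unstable(): void {}

    private function detail(): void {}
}

/**
 * Classify a method according to the visibility mapping described above.
 */
function classify(ReflectionMethod $method): string
{
    if ($method->isPrivate()) {
        return 'implementation detail';
    }
    $doc = $method->getDocComment();
    if ($doc !== false && strpos($doc, '@internal') !== false) {
        return 'private / unstable API';
    }
    return 'public / stable API';
}

$report = [];
foreach ((new ReflectionClass(Sample::class))->getMethods() as $method) {
    $report[$method->getName()] = classify($method);
}
```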

Mark everything not covered by tests and some docs (protected, but also public) as @internal

Still feels like a breaking change in the public API.

E.g. there's protected function getInfo() : array. People extend the class and override the method in their projects.

We mark it as @internal in 4.5 and as such we remove it from public API.
After that we won't be able to easily track whether it's been introduced as @internal since the very beginning of its existence, or we marked it as @internal in 4.5
We change the signature of that method in 4.6 as it's @internal and we can do that
Users upgrade from 4.4 to 4.6 and their project is broken.
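A sketch of the project-side code in that scenario. `InfoPanel` is a hypothetical stand-in for a framework class, and the changed signature in the comment is invented for illustration. The code below is the working "4.4" state; the comment describes what a later signature change would do to it.

```php
<?php

// Hypothetical stand-in for a framework class as shipped in "4.4".
class InfoPanel
{
    public function render(): string
    {
        return implode(', ', $this->getInfo());
    }

    protected function getInfo(): array
    {
        return ['core'];
    }
}

// Project code written against 4.4: overrides the protected method.
class ProjectInfoPanel extends InfoPanel
{
    protected function getInfo(): array
    {
        return array_merge(parent::getInfo(), ['project']);
    }
}

// If a later minor release changed the parent to, for example,
//   protected function getInfo(string $context): array
// the override above would no longer be compatible, and on PHP 8 the
// subclass declaration would fail with a fatal error when loaded.
$rendered = (new ProjectInfoPanel())->render();
```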

Still not decided on whether protected-internal or private is the better default for new APIs

Why do we want a default though? Shouldn't we decide on a case-by-case basis what's API and what's not? Private access is just a tool for implementing encapsulation. When everything is open and accessible without need, that breaks encapsulation and makes the code base harder to refactor / change, because developers don't know if something else gets broken when it gets amended.

sminnee commented 5 years ago

You're assuming that if we don't mark it as internal we're going to be successful in keeping it functioning in the same way in new minor releases in spite of the lack of tests and docs, or even awareness that people are monkey-patching with it.

Is this a reasonable assumption?

Why do we want a default though?

After this sentence you've basically said "the default should be private". If we don't have a default, the choice will be "what was the bias of the developer who wrote this feature" which isn't a great way to decide. You just end up with an inconsistent codebase.

IMO we should choose one of the following two strategies, each of which allows for some case-by-case decision making.

STANDARD A (Codifying @dhensby's preference somewhat)

If something definitely needs to be extensible, make it protected and cover it with tests & docs
Otherwise, make it protected and mark as @internal just in case someone wants to patch it

STANDARD B (Codifying @dnsl48 and @maxime-rainville and @micmania1 somewhat)

If something definitely needs to be extensible, make it protected and cover it with tests & docs
Otherwise, if you need to access it in subclasses within your module, but users outside the module shouldn't rely on it, make it protected and mark as @internal
Otherwise, make it private

The standards as described are more rules of thumb than a formal process, but A and B have an essential difference and as a project I think we should choose one.

dnsl48 commented 5 years ago

You're assuming that if we don't mark it as internal we're going to be successful in keeping it functioning in the same way in new minor releases in spite of the lack of tests and docs, or even awareness that people are monkey-patching with it.

Is this a reasonable assumption?

Might be too idealistic, but I think marking things as @internal should be the last resort; ideally we should avoid it and try to add tests for those, rather than bulk-marking things as internal. Especially since protected things can only be covered through public APIs, so lack of coverage for them means lack of coverage for public methods as well.

I'm not trying to challenge the benefits of @internal, but rather suggesting we should try to keep things as stable as possible, as this has a direct impact on people upgrading their projects.

On the other hand, to be pragmatic, we might agree, for example, that changes to protected methods in minor/patch releases are acceptable, but only as a last resort; they must be listed as API changes in the changelog, and the methods should also be marked as @internal when that is done.

dnsl48 commented 5 years ago

Here's the link to the anti-pattern for a reference: https://en.wikipedia.org/wiki/Object_orgy

robbieaverill commented 5 years ago

As @dnsl48 pointed out in #9020, we don't have a documented policy around whether we use protected versus private.

I claimed on the PR that in lieu of having one we should continue to follow the status quo until one is decided on (in this RFC).

I've done a quick scan over a CWP 2.3 project (in the SilverStripe vendor folder), which yields the following:

1722 protected properties
219 private properties
1895 protected functions
166 private functions

I hope this is enough to substantiate my claim that using protected over private is the current status quo.

On to this RFC, I think having private properties is fine so long as the public accessors exist for them and don't have side effects in them.

I'm not sold on the idea of having complex logic in private methods, since it would require people to duplicate them in custom code - increasing userland maintenance responsibility as well as reducing the likelihood of automatically inheriting bug and security fixes. I also acknowledge that it may result in a higher likelihood of unintended behaviour changes when underlying classes change; I can see arguments for when this would happen in both the private and protected cases.

Regarding the use of @internal, I think it should be used sparingly. From memory the recent times where we've used it are when we've had to introduce public API (note: public methods and classes) that we intend to remove again in subsequent releases, so we marked them as internal in order to prevent people from relying on them. I don't agree that we should use it to circumvent the use of private properties. Equally I don't think that explicitly tagging the public or protected methods or properties that we want to include in our semver commitments with @api (or the opposite, with @internal) is a good idea either - we're just creating tag soup which is going to be annoying to maintain in future.

Lastly, I think making decisions around what people may or may not want to extend is dangerous for us to do as maintainers. I think this is probably why we've historically used protected over private in the first place. The idea of "composition over inheritance" is slowly becoming more standard in SilverStripe now, so the need for this is possibly decreasing, but while we have Injector we'll continue to need to provide extensible functionality for people to use in subclasses.

robbieaverill commented 5 years ago

Talking in person with @dnsl48 right now.

We discussed that when creating new APIs, we may prefer private over protected, as long as our classes are simple and single-responsibility orientated. In that case it'd be acceptable to replace FooService with MyFooService and know that it's your responsibility to define your logic for it, even if you physically extend FooService to re-use some of the API it exposes.
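A sketch of that arrangement, using the `FooService` and `MyFooService` names from the comment above; the method names are hypothetical. How the replacement class gets wired in (for example through Injector) is left out here.

```php
<?php

// Method names below are hypothetical; only the class names come from the thread.
class FooService
{
    public function process(string $input): string
    {
        return strtoupper($this->clean($input));
    }

    public function isValid(string $input): bool
    {
        return trim($input) !== '';
    }

    // Private: an internal step, not an extension point.
    private function clean(string $input): string
    {
        return trim($input);
    }
}

class MyFooService extends FooService
{
    // Replaces the whole public method and re-uses only the public API.
    public function process(string $input): string
    {
        return $this->isValid($input) ? strrev(trim($input)) : '';
    }
}

$result = (new MyFooService())->process(' abc ');
```

The subclass owns all of its `process()` logic, so changes to the parent's private `clean()` cannot break it.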

It's probably not as feasible to start introducing private over protected when working on existing parts of the API though.

chillu commented 5 years ago

From semver.org:

For this system to work, you first need to declare a public API. This may consist of documentation or be enforced by the code itself. Regardless, it is important that this API be clear and precise

There's a few ways that "public API" can be interpreted in our context, listed here from strongest to weakest:

Code marked public
Code marked public or protected
Code based on class interfaces (public or protected)
Code not marked @internal (public or protected)
Code that's covered by tests
Code that's mentioned in developer docs

I'm hoping that for new APIs, the majority of API surface you might want to extend is covered by class interfaces. In my opinion, that's a much better way to have "clear and precise" APIs, compared to protected vs. private.
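For illustration, an interface-based extension point might look like the sketch below. All names (`PriceFormatter`, `Checkout`, and so on) are hypothetical. The contract is the interface, and project code replaces behaviour by implementing it rather than by overriding protected methods.

```php
<?php

// Hypothetical interface and classes for illustration only.
interface PriceFormatter
{
    public function format(int $cents): string;
}

final class DefaultPriceFormatter implements PriceFormatter
{
    public function format(int $cents): string
    {
        return '$' . number_format($cents / 100, 2);
    }
}

final class Checkout
{
    private $formatter;

    public function __construct(PriceFormatter $formatter)
    {
        $this->formatter = $formatter;
    }

    public function total(array $cents): string
    {
        return $this->formatter->format(array_sum($cents));
    }
}

// Project code: a different implementation of the same contract.
final class EuroPriceFormatter implements PriceFormatter
{
    public function format(int $cents): string
    {
        return number_format($cents / 100, 2, ',', '.') . ' EUR';
    }
}

$default = (new Checkout(new DefaultPriceFormatter()))->total([1050, 250]);
$euro = (new Checkout(new EuroPriceFormatter()))->total([1050, 250]);
```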

Lastly, I think making decisions around what people may or may not want to extend is dangerous for us to do as maintainers. I think this is probably why we've historically used protected over private in the first place.

Yeah that's an interesting one: We don't have the amount of resources that e.g. the Symfony Project has to define our APIs. We historically favoured extensibility, without assuming we've planned for every use case in the original API. That was part of the attraction of SilverStripe for devs: it's an extremely customisable CMS. But we're paying for that in the long run through an increasing amount of complexity and backwards compatibility efforts.

Marking existing APIs as @internal assumes that devs will have a way to discover this annotation. If you're using PHPStorm, that's somewhat built-in - but it still relies on type annotations for injected services. There's static analysers like PHAN which support @internal if you annotate your own code usage properly. I suspect that a decent portion of SilverStripe devs wouldn't know that they're using an internal method, and have no easy or reliable way to identify that for existing code.

I think we should agree on a common standard for new APIs (with some room for case-by-case discussions) - so STANDARD B. And be pragmatic and honest about existing APIs - so STANDARD A - the status quo. Retroactively marking existing APIs as @internal is tricky. I don't think "existing test coverage" makes an API public. It's a case-by-case decision, across 1000+ methods. If we decide to do that, I think we need to do this in one big push, alongside describing to devs on how they can identify if they're using previously accessible APIs which are now marked @internal.

I think Symfony's BC page lays this out really well. They're also using @internal by the way, although it sounds more of a matter of last resort, after applying good SOLID principles.

From Fabien's blog post:

Closing the API allows design flaws to be found more easily and gives you the opportunity to evolve your code by creating well defined extension points.

This implies that we collectively have the resources to evaluate when a community developer flags that a method shouldn't be private, and we can change it to protected (or add it to a class interface), and produce a new release in time for that code to still be valuable to said developer. That's a lot of extra tickets, at a time where we're already drowning in them, and can't keep up with merging pull requests. I think on balance, it's better than creating "just in case APIs" which then haunt us for a decade. But it's worth pointing out that a "private by default" stance comes with this level of commitment.

dnsl48 commented 5 years ago

we need to decide whether protected scope is our public API or not
if not, we have to explicitly convey that information to users, stating it's not subject to semver
depending on the decision, pick a default approach for new API (A or B)
depending on what standard we choose, we might need to decide what to do with the existing APIs
- do we want a bulk approach or maybe we're fine dealing with it on case by case basis whenever we need to change it
- if it's bulk, what exactly we want to do with it

Since the decisions in the topic

affect not only maintainers, but contributors and users
potentially may change the framework development practices
potentially contain global decisions around framework development, maintenance and project upgradeability

I'm changing it to impact/high if nobody minds.

I'm hoping that for new APIs, the majority of API surface you might want to extend is covered by class interfaces

Indeed, interfaces are an important and flexible tool for declaring extension points and APIs. However, they don't cover protected code, and we need to clearly define whether protected properties and methods are a part of our public API. Semantic versioning goes beyond particular programming paradigms and is applicable to functional programming too, as long as there's a clear definition of public APIs. As such, I reckon we'll have to define what our public API is first. If we want to be able to change protected scope between minor releases, I believe it's very important we convey this information to the users, stating that they can't rely on those not to change between releases.

Lastly, I think making decisions around what people may or may not want to extend is dangerous for us to do as maintainers.

Agree with @chillu on that one: we're paying for that in the long run through an increasing amount of complexity and backwards compatibility efforts. I also want to reference @micmania1's post, which I think summarizes it quite well by referencing this: http://fabien.potencier.org/pragmatism-over-theory-protected-vs-private.html

I think we should agree on a common standard for new APIs (with some room for case-by-case discussions) - so STANDARD B. And be pragmatic and honest about existing APIs - so STANDARD A - the status quo.

Sorry, I'm not sure how these two can coexist easily unless we clearly state definitions of new API and existing API. If we're saying that adding new methods to preexisting components is new API, then it might work. If we're saying that adding new methods to preexisting components should follow STANDARD A, but new components should follow STANDARD B, then it's not gonna work in my opinion. Never mind framework users, even maintainers will start mixing up which components are new and which are old after a while.

This implies that we collectively have the resources to evaluate when a community developer flags that a method shouldn't be private, and we can change it to protected (or add it to a class interface), and produce a new release in time for that code to still be valuable to said developer. That's a lot of extra tickets, at a time where we're already drowning in them, and can't keep up with merging pull requests. I think on balance, it's better than creating "just in case APIs" which then haunt us for a decade. But it's worth pointing out that a "private by default" stance comes with this level of commitment.

Yes, there could be tickets for loosening up extension points. However, resolving those should be much less effort than the alternatives, which are

either not being able to change things until the next major release so we don't break APIs
breaking things hoping nobody uses them, which would break stuff for people upgrading between minor releases
introducing a new version of a method, slightly different from preexisting ones, which would lead to bloated APIs and break DRY principles in our own code (even though we aim at solving that same DRY in others' projects by opening everything up)

robbieaverill commented 5 years ago

I think between all these comments we probably have a consensus. The tasks for this will be to document the expectations in our developer docs.

chillu commented 5 years ago

@dnsl48 or @robbieaverill: Do you want to either infer votes from existing comments, or ask for an explicit vote on one or more options with a summary of their tradeoffs? I think Serge captured the decision paths pretty well in his last comment. It's hard to say if we have consensus since there's so many different angles to this discussion.

dnsl48 commented 5 years ago

Gentlemen, I've put together the items for voting and some summary derived from the conversation above. Sorry, I'm biased, but please let me know if you want me to add something to the items.

There are 2 different topics to be considered:

is protected scope our public API
what's our default approach for adding new API

I think Vote 2 should be done separately and after the first one is finished, as its side-effects are dependent on the results of Vote 1.

Each option has its benefits and disadvantages. When it's decided what path we'd like to follow, we could separately discuss what can be done with the downsides of the approach taken.

Vote 1: Whether PROTECTED is a part of the public API

Option A: protected scope IS public API

protected methods and properties MUST NOT receive incompatible changes in minor and patch releases.

Pros
- this is the most natural option for developers
- eliminates some anti-patterns, makes APIs more predictable, projects easier to upgrade
- doesn't introduce special rules over what PHP OOP capabilities provide
- doesn't increase framework learning curve
- would make patch and minor releases more stable
Cons
- This is different from the current undocumented convention
- Can make it harder to "fix" issues in minor and patch releases
- Some issues potentially may become unfixable in patch/minor releases

React with 🚀

Option B: protected scope IS NOT public API

protected methods and properties may be changed in any way in minor and patch releases. This implies that only public methods and properties may be relied on by framework users.

Pros
- makes it easier for core developers to introduce changes within minor/patch versions
- makes it easier to reuse framework internals without introducing new public APIs, but violating encapsulation
Cons
- introduces special rules for APIs that are unnatural for PHP OOP capabilities, increases framework learning curve
- introduces some anti-patterns, which in the long term make code much harder to reason about, refactor and design by contract
- discourages developers from establishing well designed APIs, since object internals are always available for use

React with ❤️

Option C: protected scope is unstable public API

protected methods and properties are a part of public API, but it may be changed in minor (and potentially patch) releases without warning whenever a core developer decides it is justified. This is our current undocumented convention.

Pros
- same as Option B
- doesn't introduce special rules over what PHP OOP capabilities provide
Cons
- same as Option B
- Makes our public APIs more convoluted and volatile. Makes it harder for framework users to reason about APIs
- Undermines our semantic versioning, makes it harder to upgrade to newer framework versions as some unexpected ad-hoc breaking changes might have been introduced to public APIs

React with 🎉

Vote 2 (after Vote 1 is finished): Default scope for new APIs

Protected is the default, private is forbidden.
Private is the default, protected only when core devs agree to make it public API.
No defaults necessary. Extension points to be chosen by contributing devs.

robbieaverill commented 5 years ago

Agreed that vote 2 should be taken after vote 1, since the outcome would influence my vote

ScopeyNZ commented 5 years ago

I'll put some "reaction options" into your message

robbieaverill commented 5 years ago

My only concern with option A is that we're basically adopting a large new API surface into our semver commitments, which were previously not part of it. It also reduces the likelihood that we'll use protected methods in future, likely preferring private instead. It still seems like the best way forward though.

Edit: my vote is for option C. While I agree with option A, I think we could possibly use a middle ground where we don't introduce many new protected APIs but continue to treat legacy protected APIs as internal/unstable, which is the way they have been treated before. I agree that we should stop increasing the protected API surface and start using injectable interfaces with private methods instead.

sminnee commented 5 years ago

I voted against the clear majority but I can live with that. It wouldn't surprise me if ss5 ends up making a lot of protected methods private. It'll be cleaner but less extensible.

sminnee commented 5 years ago

If you're making a protected method, and you're not covering it in tests and docs testing its extensibility, you're reducing the stability of the project.

Given our preference for Option A, private should be the default. Protected requires much more thought, care, and work, which may not be your task at hand. And you can always upgrade it to a protected method later.

unclecheese commented 5 years ago

Agree. What I'm wondering is, what are the conditions that would make us choose protected over private? It seems to be an awkward middle ground with no clear use case.

sminnee commented 5 years ago

It comes down to "do you have the interest/time/energy to build a robust extensibility API for this?" It's the same decision as adding any other feature. The lie that we rid ourselves of is that we can get extensibility "for free" by making things protected over private.

unclecheese commented 5 years ago

It seems like the decision to make a method public is done out of necessity and design, but protected is done out of altruism. "I don't need to do this, and it makes things harder, but it might help somebody."

My guess is that protected methods become a major edge case. But we'll see.

sminnee commented 5 years ago

Yeah, it's a design decision – what extensibility should this new thing have? We used to avoid the need for design decisions by leaving all the doors open, and then wondering why neighbourhood cats pissed all over our carpet during upgrades. :-P

ScopeyNZ commented 5 years ago

Here's how it should work (imo):

class MiscChecker
{
    public function checkThing(Thing $thing): bool
    {
        $thing = $this->prepThing($thing);

        return $this->checkA($thing) && $this->checkB($thing);
    }

    protected function checkA(Thing $thing): bool
    {
        return $this->extractDetail($thing) === 'yes';
    }

    protected function checkB(Thing $thing): bool
    {
        return true;
    }

    /**
     * I can remove this in the future - or change it to produce a new thing without caring
     */
    private function prepThing(Thing $thing): Thing
    {
        return clone $thing;
    }

    /**
     * Maybe later I decide that I don't need this service extracted to get this, I can just do everything in checkA
     */
    private function extractDetail(Thing $thing): string
    {
        return Service::create()->getStatement($thing);
    }
}

I clearly intend developers to be able to override this to introduce their own definitions for checkA and checkB. The stuff I don't want to be public API I don't give them access to.

sminnee commented 5 years ago

Yeah, but note that your tests would need to include a subclass of MiscChecker that overrides checkA and checkB, and confirms that the behaviour can be swapped out.
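
As a rough sketch of the kind of test described here, assuming stand-in definitions (the `Thing` class, its `$label` property, the `AlwaysPassAChecker` subclass and the simplified `MiscChecker` body are hypothetical, not framework code), a subclass can override `checkA` and the test can confirm that the public method picks up the override:

```php
<?php

// Hypothetical stand-in for the Thing type used in the example above.
class Thing
{
    /** @var string */
    public $label;

    public function __construct(string $label)
    {
        $this->label = $label;
    }
}

// Simplified MiscChecker: the Service call is replaced by a property read.
class MiscChecker
{
    public function checkThing(Thing $thing): bool
    {
        return $this->checkA($thing) && $this->checkB($thing);
    }

    protected function checkA(Thing $thing): bool
    {
        return $thing->label === 'yes';
    }

    protected function checkB(Thing $thing): bool
    {
        return true;
    }
}

// Test double: overrides one extension point only.
class AlwaysPassAChecker extends MiscChecker
{
    protected function checkA(Thing $thing): bool
    {
        return true;
    }
}

$no = new Thing('no');

// The base rule rejects a Thing labelled 'no'.
$baseResult = (new MiscChecker())->checkThing($no);

// The public method honours the overridden protected method.
$customResult = (new AlwaysPassAChecker())->checkThing($no);
```

If `checkThing` were later refactored so that it stopped calling `checkA`, the second result would change, which is the regression such a test is meant to catch.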

ScopeyNZ commented 5 years ago

Given the typing is that really necessary? I don't feel like there's any value in testing a subclass can extend.

In the context of testing I also want to mention that private methods shouldn't necessarily be tested directly and should instead have coverage from testing their consumers.

dnsl48 commented 5 years ago

My only concern with option A is that we're basically adopting a large new API surface into our semver commitments, which were previously not part of it.

There are two sides of it. One side is - we were not committing to it, the other side is - users aren't really aware of that. That makes it an awkward situation where users rely on things we may be breaking without much concern. That can make project upgrades really painful for them.

It wouldn't surprise me if ss5 ends up making a lot of protected methods private. It'll be cleaner but less extensible.

I think the result could be that we end up with more stable APIs and more explicit commitments towards a somewhat reduced, but better designed surface. On the other hand, those who want to live on the edge and are fine with using internals can always do it through reflection. It really depends on how we define "extensible".
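
For illustration only, a minimal sketch of reaching a private method through PHP's reflection API. The `Greeter` class and its `buildGreeting` method are hypothetical; nothing here is framework code, and anything accessed this way carries no compatibility guarantee:

```php
<?php

// Hypothetical class with a private helper that is not part of any public API.
class Greeter
{
    private function buildGreeting(string $name): string
    {
        return 'Hello, ' . $name;
    }
}

$method = new ReflectionMethod(Greeter::class, 'buildGreeting');

// Before PHP 8.1 private members must be made accessible explicitly;
// from 8.1 onwards reflection can invoke them without this call.
if (PHP_VERSION_ID < 80100) {
    $method->setAccessible(true);
}

$greeting = $method->invoke(new Greeter(), 'world');
```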

What I'm wondering is, what are the conditions that would make us choose protected over private?

I think in general that should be derived from the rules we apply making things public. E.g.

when the method implements a particular subset of the class responsibilities (e.g. it's not a utility method)
when the method doesn't make sense outside the class context and is useless without the other class methods
when its override doesn't break SOLID (open/closed principle, single responsibility etc.)
when the author is certain the signature is stable enough and is not going to change in patch/minor releases

Can come up with more ideas, but I reckon it heavily depends on and may vary in different cases.

It seems like the decision to make a method public is done out of necessity and design, but protected is done out of altruism. "I don't need to do this, and it makes things harder, but it might help somebody."

When we write "I don't need it, but it might help somebody somewhere" without really thinking about the use cases we're trying to cover by making a method protected, we usually won't write tests for those use cases. That produces methods we have to maintain that have neither good test coverage nor a considered design; they exist "just in case". Those are poorly designed APIs without test coverage.

robbieaverill commented 5 years ago

Ping @silverstripe/core-team in case anyone has missed this

sminnee commented 5 years ago

Given the typing is that really necessary? I don't feel like there's any value in testing a subclass can extend.

Then you've got an API that's not being tested, which seems dodgy to me. If we're hoping for the best regarding our extensibility APIs then I'd say you're really voting for Option C but just don't want to admit it :-P

robbieaverill commented 5 years ago

I think usually the expectation is that protected methods are tested via public methods that use them

ScopeyNZ commented 5 years ago

Yep. The "risk" you have without testing that a subclass works is that the protected method you've overloaded might no longer be called when using the public API. I think this is fine though.

sminnee commented 5 years ago

Yeah you guys are voting for Option C.

ScopeyNZ commented 5 years ago

Haha but I voted for option A 😉 . I don't think we should be allowed to change our protected API surface. I don't think anything I said indicates otherwise.

micmania1 commented 5 years ago

I'm just going to reassert that I don't think a blanket decision is sensible.

Bespoke projects: If you're developing a bespoke project, it makes sense to have "private as default" since you can easily make it protected if the need arises. Only your project depends on it, so you're not going to get angry developers asking "why is this not protected?!?!?1". It makes it much easier to refactor/remove private methods since you know where they're being used and therefore that they have nothing relying on them.

Framework development: Developing a framework cannot be done with the same mindset. Private by default doesn't make sense because you're designing APIs to be consumed, and private isn't consumable. However, making protected the default doesn't make sense either since you may not want something to be consumable (i.e. framework internals). Protected also makes things harder to refactor, which is why decisions must be on a case by case basis.

Trust: At some point you need to trust the people contributing. If they've made something protected/private, find out what they were thinking. It's fine to disagree with them, and I think it's important to set the expectation that core contributors have the final say on these things (they have to maintain it after all).

Testing: I'm not sure I get the point about testing protected/private methods directly. You should be able to test your protected/private methods through public interfaces and prove that they're tested with code coverage. If you need to test something that's hidden in a private/protected method, then it's probably a good sign that you need to refactor/abstract things out into another class.
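
A small sketch of that last point, using hypothetical classes (`SlugFormatter` and `PageUrlBuilder` are made up for illustration): logic that would otherwise sit in a private method is moved into its own class, where it can be tested directly, while the consumer keeps only a private reference to it.

```php
<?php

// Hypothetical collaborator: logic that used to live in a private method
// now has its own public, directly testable API.
class SlugFormatter
{
    public function format(string $title): string
    {
        $slug = strtolower(trim($title));

        // Collapse every run of non-alphanumeric characters into one dash.
        return preg_replace('/[^a-z0-9]+/', '-', $slug);
    }
}

// Hypothetical consumer: it delegates instead of hiding the logic privately.
class PageUrlBuilder
{
    /** @var SlugFormatter */
    private $formatter;

    public function __construct(SlugFormatter $formatter)
    {
        $this->formatter = $formatter;
    }

    public function build(string $title): string
    {
        return '/pages/' . $this->formatter->format($title);
    }
}

$formatter = new SlugFormatter();

// The extracted logic is tested on its own...
$slug = $formatter->format('  Hello World ');

// ...and still gets coverage through the consumer's public method.
$url = (new PageUrlBuilder($formatter))->build('Hello World');
```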

ScopeyNZ commented 5 years ago

Protected also makes things harder to refactor which is why decisions must be on a case by case basis.

I agree with this statement. I also think that we probably err too much on the side of "someone might want to extend this" without thinking about the maintenance strain this adds, because we've always been flakey on the semver obligation on protected APIs. This is why I'm against option C. To this effect I'm not sure if we need a "vote 2" for the "default" visibility.

I think the best code to consume (as a framework consumer / bespoke dev) is code that is explicit and well segmented, where I can feel safe implementing interfaces and overloading classes using standard PHP techniques to produce some code that does what I want it to do, with minimal bespoke effort. I completely understand that this is a codebase that doesn't necessarily hit that mark with some of its older code. If there was a way to say "option C but all APIs from now are option A" then I'd be all for that. As it stands I feel like it's better to bite the bullet, say "option A" and then either mark a lot of "risky" API in SS5 as @internal, switch its visibility to private, or just flat out refactor/remove it.

dnsl48 commented 5 years ago

It looks like Option A is the winner, with just a single-vote advantage.

Option A: protected scope IS public API.

Protected methods and properties MUST NOT receive incompatible changes
in minor and patch releases.

Core Committer's votes:

Option A (6 votes): @ScopeyNZ @stevie-mayhew @maxime-rainville @tractorcow @wilr @dhensby

Option C (5 votes): @sminnee @unclecheese @kinglozzer @robbieaverill @flamerohr

The audience sympathy goes to Option A with 3 votes. No community members voted for Option C (only core committers).

Option B is clearly a no go, which means protected is definitely a part of our public APIs.

The main difference between A and C is that with C core committers are allowed to ignore semver for protected scope when they decide it's justified, although they willingly try to avoid that.

Please, don't hesitate to let me know if I should add/amend some pros-cons for any of the items below.

Vote 2: Approach for adding new public APIs

`Option A`: Case by case (:rocket:)

No defaults. Contributor decides what's public/protected/private when writing the code and then peer reviewer confirms.

+:
- Allows actual feature authors to decide what should be API and what should not
- API will cover real life use cases which challenged feature authors to write the code
-:
- Less control over what makes it to the public API
- 3rd party contributors may ignore use cases they don't care about. As such some APIs may become too specific rather than generic

`Option B`: Public over Protected over Private (:heart:)

Everything should be extensible by default, unless there are some specific requirements not to do so. All private attributes should have protected or public accessors. Private methods are prohibited, or the documentation should clearly explain why they couldn't be public or protected.

+:
- Everything is extensible
- Easier monkey-patching for core components
-:
- Introduces some anti-patterns
- Not SOLID. Breaks open-closed principle
- APIs become harder to comprehend, implementation details are a part of public API
- Everything is Public API, harder to maintain, almost impossible to follow semver

`Option C`: Private over Protected over Public (:tada:)

Everything should be private by default unless the core team explicitly decides to make it public API. Nothing becomes public or protected unless there is a test written that covers a use case which requires that functionality.

+
- Reduces scope for maintenance
- Only APIs that are fully tested and proven by real use cases make it to the public scope
-
- The most rigid option. Makes the framework less extensible and core components harder to monkey-patch
- Requires more decision-making of the core team. Every new public thing should be explicitly approved.

ScopeyNZ commented 5 years ago

I really don't see how anything other than A makes sense here... Just slapping private on everything without thinking is pretty unrealistic.

sminnee commented 5 years ago

You can always turn something private into something protected/public later. My general view is "private unless you're happy to add tests/docs for the API you've created". Which means, "if in doubt, start private"

"Slapping" is the correct verb for protected/public.

dnsl48 commented 5 years ago

If you didn't have a chance to vote yet, please give it a go. The second vote is still open. We need at least 2 more votes to finish this RFC.

unclecheese commented 5 years ago

What brought this to RFC in the first place was a comment on a PR saying a method should be private, which led to a disagreement, which led to us saying we should standardise this so we don’t waste time bickering on case-by-case assessments.

So if we end up with Option A, have we made any substantive progress?

dnsl48 commented 5 years ago

if we end up with Option A, have we made any substantive progress?

I reckon it's a "yes", because one of the main questions was "how do we treat protected scope? Is it our public API at all?". That has been discussed above and clarified by Vote 1.

The second vote is rather to clarify whether we want a "default" for controversial methods or not: the ones which are added to public API with the reasoning "I don't need it, but just in case, maybe someone will use it for something".

Offtopic:

What brought this to RFC in the first place was a comment on a PR saying a method should be private,

IIRC that was vice versa: a private method should have become protected
