jspecify / jspecify

An artifact of fully-specified annotations to power static-analysis checks, beginning with nullness analysis.
http://jspecify.org
Apache License 2.0

TYPE_USE annotations in TYPE_PARAMETER locations [working decision: nullness-inapplicable] #60

Closed kevin1e100 closed 2 years ago

kevin1e100 commented 4 years ago

We have so far stayed away from allowing annotations in TYPE_PARAMETER positions at all. In part that's because it's not strictly necessary when only tracking upper bounds, and for @Nullable, it seems a lot clearer what <T extends @Nullable Object> means than <@Nullable T>.

There's also the issue of what happens when both type parameter and bound are annotated. We don't need to answer that question when disallowing annotations directly on type parameters. For consistency, and to again sidestep this problem, we've also opted to rule out annotations on a wildcard (?) itself (even though they're considered TYPE_USEs).

By contrast, checkerframework assigns a particular "lower bound" semantics to annotating type parameters, which allows defining type parameters that cannot be instantiated with @NotNull simply by writing <@Nullable T>. I'm not sure how <@NotNull T> is interpreted by checkerframework. Either way, except with ? super wildcards, lower bounds don't need to be explicitly tracked when type parameters can't be annotated.

On a more technical note, javac makes it a lot harder to keep track of TYPE_PARAMETER annotations than annotations on type parameter bounds (because the latter are (sometimes) propagated into javac's Type objects), so excluding TYPE_PARAMETER technically simplifies things a bit.

Note that support for TYPE_PARAMETER can seemingly be added later. For now disallowing them avoids inconsistencies with how checkerframework (and possibly other tools) would interpret annotations on type parameters.

stephan-herrmann commented 4 years ago

Protocol from 2019-08-01 mentions (when talking about annotating upper and lower bounds at once):

claim that "I cannot accept a collection into which I must be able to put null"

  • Isn't that foo(Collection<? super @Nullable Foo>) ?
  • (yes, that should work)

(Isn't there one negation too many in the claim?)

Anyway, by this construct you'd lose the ability to ensure that elements in that collection are indeed Foo. So how to express

Similar situation involving a type parameter:

Type parameters cannot define a lower bound, so the above cannot be expressed, unless we allow <@Nullable T>.

cpovirk commented 4 years ago

Protocol from 2019-08-01 mentions (when talking about annotating upper and lower bounds at once):

claim that "I cannot accept a collection into which I must be able to put null"

  • Isn't that foo(Collection<? super @Nullable Foo>) ?
  • (yes, that should work)

(Isn't there one negation too many in the claim?)

I was not at the meeting, and I'm looking at it the same way. I think the idea may have been that someone said "It's not possible to express x," and then someone said "You can express x by doing the following." So I think the claim is:

"My method is going to put Foo instances and possibly null into a Collection. What should its parameter type be?"

To which the proposed answer is:

foo(Collection<? super @Nullable Foo>)

Anyway, by this construct you'd lose the ability to ensure that elements in that collection are indeed Foo.

Agreed. I think the idea is that that doesn't matter, though, since our method is only writing to the collection. It's similar to the line of thinking behind PECS in today's pre-nullability world.
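To make the producer/consumer split concrete, here is a runnable sketch. Foo and the @Nullable annotation are stand-ins for illustration (not the real JSpecify artifact); plain javac ignores the annotations, so the nullness constraints described here would be enforced by a checker, not the compiler.

```java
import java.lang.annotation.*;
import java.util.*;

// Stand-in for a JSpecify-style type-use @Nullable (hypothetical).
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.CLASS)
@interface Nullable {}

class Foo {}

class Pecs {
    // Producer side ("extends"): we only read; elements may be null.
    static int countNonNull(Collection<? extends @Nullable Foo> source) {
        int n = 0;
        for (Foo f : source) {
            if (f != null) n++;
        }
        return n;
    }

    // Consumer side ("super"): we only write; annotating the bound with
    // @Nullable is what would let a checker accept sink.add(null).
    static void addFooAndNull(Collection<? super @Nullable Foo> sink) {
        sink.add(new Foo());
        sink.add(null);
    }
}
```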

So how to express

  • "I'll accept Collection<@Nullable Foo> as well as Collection<@Nullable SubclassOfFoo>"?

I think it depends on what the method is trying to do. I'd again split it along producer/consumer lines:

Is that good enough, or are there other cases that are missing?

Similar situation involving a type parameter:

  • "I need to be able to return null from any of my methods declaring T as their return"

Type parameters cannot define a lower bound, so the above cannot be expressed, unless we allow <@Nullable T>.

I think the idea here is that all the methods are declared with @Nullable on them:

interface Cache<V extends @Nullable Object> {
  @Nullable V put(Key k, V value);
  @Nullable V get(Key k);
}
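A runnable sketch of that idea, assuming a placeholder Key type and a stand-in @Nullable annotation: every method whose return type is @Nullable V may return null on a miss, even when V is instantiated with a non-null type argument.

```java
import java.lang.annotation.*;
import java.util.*;

// Stand-in for a type-use @Nullable (hypothetical).
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.CLASS)
@interface Nullable {}

// Placeholder key type from the interface above.
class Key {}

interface Cache<V extends @Nullable Object> {
    @Nullable V put(Key k, V value);
    @Nullable V get(Key k);
}

// Minimal map-backed implementation: get/put return null on a miss
// because the *return types* are annotated, not the type parameter.
class MapCache<V extends @Nullable Object> implements Cache<V> {
    private final Map<Key, V> map = new HashMap<>();
    public @Nullable V put(Key k, V value) { return map.put(k, value); }
    public @Nullable V get(Key k) { return map.get(k); }
}
```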
stephan-herrmann commented 4 years ago

Is that good enough, or are there other cases that are missing?

It doesn't allow me to specify what I had in mind. Put differently: consider in old code we had:

foo(List<? extends Foo> c)

and now with nullness in mind, I want to be explicit: "I can only accept such lists into which I can add null" (just imagine that foo needs to null-out elements in the list without disturbing existing positions). This may look like a weird method but it is legal Java. With the proposed scheme of annotations it will be impossible to annotate & check. Citing that best practice strictly distinguishes between consumers and suppliers doesn't help to deal with such existing code. If we design annotations to be useful only when code already adheres to certain best practices, that would be a severe restriction.

OTOH, the method can be perfectly checked if annotated as

foo(List<@Nullable ? extends Foo> c)
stephan-herrmann commented 4 years ago

I think the idea here is that all the methods are declared with @Nullable on them:

This is a technical solution but fails to express my intention. I hold that saying "this generic class cannot be instantiated with a nonnull type argument" is a fair statement, and would prefer that we allow expressing this using annotations.

cpovirk commented 4 years ago
foo(List<? extends Foo> c)

and now with nullness in mind, I want to be explicit: "I can only accept such lists into which I can add null" (just imagine that foo needs to null-out elements in the list without disturbing existing positions).

I see. The closest signature we can actually express is probably...

foo(List<@Nullable Foo> c)

...but that places a restriction that the method doesn't really need and that might interfere with users' ability to call the method.

I personally don't remember examples of a method like this offhand, so it would be helpful if anyone knows of additional cases in which it's come up. We've already been fairly aggressive about leaving out features that we know come up only moderately often (like possibly leaving out @PolyNull, at least for now), so more demand would help motivate the extra feature.

I think the idea here is that all the methods are declared with @Nullable on them:

This is a technical solution but fails to express my intention. I hold that saying "this generic class cannot be instantiated with a nonnull type argument" is a fair statement, and would prefer that we allow expressing this using annotations.

Interesting. I don't have a mental model for this one, either.

On the one hand, it could be more convenient to write Foo<Bar> (and then have @Nullable tacked on by all the method signatures) than have to write Foo<@Nullable Bar>.

On the other hand, if I already have a @Nullable T type parameter in scope, then I'd rather write Foo<T> than have to write Foo<@NotNull T> (#4 again :)).

But those seem like pretty minor concerns in comparison to matching the user's intention or mental model. Does anyone have examples of types like this handy?

[very very late edit: Maybe the method can switch to using a type parameter instead of a wildcard? <T extends Foo> void foo(List<@Nullable T> c)?]

[additional very very late edit: Let's call this method replaceEmptiesWithNulls, with signature <T extends CharSequence> void replaceEmptiesWithNulls(List<@Nullable T> list).]

[additional very very late edit: Given my previous edits, it sounds like we might be able to fix nullness while varying the base part of the type. Or at least we can do it in the case of extends, but perhaps we can't with super? Maybe we can come up with an example that uses super? There's also another contract that we might want: fixing the base part of the type while varying the nullness. I don't think that's possible under any proposed model -- at least until we have @PolyNull or contracts.]
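A runnable sketch of the hypothetical replaceEmptiesWithNulls from the edit above (the @Nullable annotation is a stand-in; plain javac accepts the null writes regardless, so it is a checker that would use the annotation to distinguish this from List<T>):

```java
import java.lang.annotation.*;
import java.util.*;

// Stand-in for a type-use @Nullable (hypothetical).
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.CLASS)
@interface Empties {}
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.CLASS)
@interface Nullable {}

class EmptyReplacer {
    // The list's *element* type is @Nullable T, so writing null into an
    // existing position is fine, while T's own bound stays non-null.
    static <T extends CharSequence> void replaceEmptiesWithNulls(List<@Nullable T> list) {
        for (int i = 0; i < list.size(); i++) {
            T element = list.get(i);
            if (element != null && element.length() == 0) {
                list.set(i, null); // positions preserved, empties become null
            }
        }
    }
}
```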

stephan-herrmann commented 4 years ago

We've already been fairly aggressive about leaving out features that we know come up only moderately often (like possibly leaving out @PolyNull, at least for now), so more demand would help motivate the extra feature.

Not including an additional annotation, the relevance of which we are not yet convinced of, is one thing -- it can still be added at a later point. I'm fully with you.

Defining the semantics of core annotations in a way that makes certain things impossible is a different animal. Here, arguing that users probably don't code like this will quite certainly get you into trouble. The inventiveness of users in combining language features is unbounded.

Now, what is a "new feature" and what is part of the "core semantics"? I think, as soon as we start allowing nullness constraints on how type parameters can be instantiated (which we do by allowing (explicit/implicit) annotations on type bounds), we need to deliver a complete solution for this task; otherwise we'd deliver a half-feature, which is tremendously difficult to fix later without breaking things.

cpovirk commented 4 years ago

Your distinction between "new feature" and "core semantics" SGTM. I'm personally comfortable delivering a not-100% solution even for core semantics, even with the understanding that we can never fix it. Or at least I'm comfortable with that in the abstract :) If there are enough use cases for TYPE_PARAMETER annotations -- or for other kinds of fundamental expressiveness -- then that could change.

I agree that some users will encounter these use cases eventually, and my hope is that they can find workarounds and suppress as needed:

List<? extends @Nullable Foo> foos = ...;
@SuppressWarnings({"unchecked", "nullness"})
List<@Nullable Foo> castFoos = (List<@Nullable Foo>) foos;
foo(castFoos);

That's ugly, but we expect users to have to suppress warnings sometimes anyway, such as for legacy code. That puts them at risk of NullPointerException, but we already expect them to be at risk sometimes -- for example, because we define Map.get as returning @UnknownNullness for convenience.

Our requirements doc (which talks about "soundiness") mentions the motivations of legacy code and convenience. But the main thing for me here is the goal of simplicity. Comparing what we're introducing for nullness to what Java 1.5 introduced for "regular" generics, I expect that we'll be putting a lot on users' plates. (We're also putting a lot on our own plates as spec-writers and implementers -- and on the plates of other language and tool owners, too.) It's tough to measure whether it's more important to allow <@Nullable T> (for those users who need it) or to forbid it (for those users who will write it by mistake). We both have our sense of which is more important. Maybe, as we look for more examples (including as we annotate real-world libraries), we'll converge. Or the opposite :)

kevin1e100 commented 4 years ago

@wmdietlGC may be able to provide some perspective here, since checkerframework supports must-be-nullable type parameters as mentioned. I think I agree that it would be great to have data to inform this point--which prototyping should provide--but if people have concrete use cases they can point to already I'd love to see them.

FWIW, this particular feature, must-be-nullable type parameters, I believe could "safely" be added later, if we were to adopt checkerframework's notation for that. That's because, IIUC, checkerframework accomplishes this using annotations on type parameters and wildcards (as opposed to type parameter and wildcard bounds), both of which we have "forbidden" [aka, "reserved"] for this very reason. Werner, please keep me honest here if I'm missing something. Of course, there may be other, better ways of expressing this that would be harder to add later.

stephan-herrmann commented 4 years ago

This issue arose out of the discussion in #12, and is also connected to #18, #31, etc. I'm having difficulty imagining how we can reach agreement in one issue while postponing the others; they're too closely connected.

stephan-herrmann commented 4 years ago

CF and Eclipse have about 6 years of experience shipping a solution that includes the option discussed here. Before we discuss this further, can anyone explain in plain words how this approach could possibly be doing harm?

kevinb9n commented 4 years ago

It will take us some time to finish assembling a complete picture of all the harms and benefits of a design decision like this, including everything you learned from those 6 years of experience. However, to build a real consensus for the ultimate design we do have to do that work of understanding all the trade-offs, and I don't think giving past solutions the presumption of acceptance will help us get there. If past decisions were the right ones, we'll arrive at them again!

If anything I just said is concerning, I would like to discuss it sooner rather than later and understand the concerns.

But just to give a more casual but immediate response to your question of "can anyone explain what would be the harm", I had two reactions:

  1. Java never allows lower bounds on type parameters, does it?

  2. The example of a method that wants to read from a collection and must write nulls (and only nulls!) to it seems quite contrived. (I mean this in a friendly way :-)) I've never seen a method like this in my life, and so it seems like just the kind of thing @SuppressWarnings was made for.

kevinb9n commented 4 years ago

I'd also like to address the broader issue of how we value use cases that don't follow "best practices".

My current thoughts are:

  1. It would be wrong to devalue those use cases simply because they don't follow best practices.

  2. On the other hand, sometimes practices become "best" for good reasons. In particular, sometimes violating the best practice is already well known to produce generally bad outcomes. How much weight we should place on that use case compared to others is, I think, a fair philosophical question for us.

I personally happen to have been convinced over the years that the "producer-->extends, consumer-->super, both-->neither" rule is one of these; if you don't follow it, you're going to have a bad time; I've seen it too many times to count. So, to me, if making sure we don't give you an even worse time is free, then fine, let's be nice. In this case I think it is complex and confusing and violates Java precedent (no lower bounds on type parameters).

cpovirk commented 4 years ago

I think the idea here is that all the methods are declared with @Nullable on them:

This is a technical solution but fails to express my intention. I hold that saying "this generic class cannot be instantiated with a nonnull type argument" is a fair statement, and would prefer that we allow expressing this using annotations.

Does anyone have examples of types like this handy?

Hey, I have one! Guava's ArrayTable is a Table that always has the ability to contain null values. That's expressible as:

class ArrayTable<R, C, V> implements Table<R, C, @Nullable V>

But it's weird for an ArrayTable<R, C, V> to not be a Table<R, C, V>! I could see wanting to write it like this:

class ArrayTable<R, C, @Nullable V> implements Table<R, C, V>

Now, ArrayTable is already weird, and users already get confused by it, so arguably it's "not best practices," and I wouldn't want to bend over backward for it. But I'm happy to have an example of this that resonates with me.

cpovirk commented 4 years ago

Another similar example:

class Equivalence<T> implements BiPredicate<@Nullable T, @Nullable T> {

It's weird that an Equivalence<Foo> isn't a BiPredicate<Foo>. I don't know that it's a "problem" -- it makes sense -- but it's definitely going to surprise someone.

I'm also seeing another issue, which might be more of a "real type-system problem" but which comes up only in private API:

class Wrapper<T extends @Nullable Object> implements Serializable {
  private final Equivalence<? super T> equivalence;
  ...
}

This is Equivalence.Wrapper. Users should be able to create a Wrapper<@Nullable String> backed by an Equivalence<String>. But I'm not sure how easy that will be to express without suppressions. Possible options:

(As you might guess, I've started tinkering with annotating Guava. I'll need a long-term plan before I do anything big, but even tinkering has turned up a few interesting examples.)

[edit: and another example: SingletonNullIterator in guava#7181]

cpovirk commented 4 years ago

Some more things that many of the rest of you already knew:

(1) Even if our annotations don't have @Target(TYPE_PARAMETER), they can still be applied in the form class Foo<@Nullable T>. That's because that position is covered by TYPE_USE. This was not clear to me from 9.6.4.1 (cited by ElementType) because the appropriate section is actually 9.7.4 (via). This location feels similar to class @Nullable Foo, and, unsurprisingly, our proposal currently explicitly disallows both.

(It looks like adding TYPE_PARAMETER might not even change the bytecode generated by class Foo<@Nullable T>. If so, then the purpose of TYPE_PARAMETER seems to be cases in which you want only type parameters to be annotated, not type uses in general. Declaring @Target(TYPE_USE, TYPE_PARAMETER) seems to be redundant.)
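A minimal sketch of point (1), using a hypothetical stand-in annotation: even with @Target(TYPE_USE) only (no TYPE_PARAMETER), javac accepts the class Box<@Nullable T> position, so any ban there has to be a manual check in the tools rather than a consequence of @Target.

```java
import java.lang.annotation.*;

// Hypothetical @Nullable declared with TYPE_USE only -- no TYPE_PARAMETER.
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.RUNTIME)
@interface Nullable {}

// This compiles anyway: the type-parameter declaration position is
// covered by TYPE_USE (JLS 9.7.4), independent of TYPE_PARAMETER.
class Box<@Nullable T> {
    T value;
}
```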

We can of course still ban this usage in our checkers (and ignore it in separately compiled code); it will just have to be a manual check, not something that's implied by our @Target.

(2) TYPE_USE is likewise sufficient for <@Nullable ?>. (In fact, it's necessary, since there's no ElementType.WILDCARD corresponding to ElementType.TYPE_PARAMETER.) I am getting the impression, though, that Foo<@Nullable ?> is not as thoroughly battle-tested as the rest of the type-annotations system. See JDK-8233945 and https://github.com/typetools/checker-framework/issues/2880 (which speculates that maybe javac should issue a warning but doesn't).

[much later edit as a reminder to myself if we come back to this: Anecdotally, we've seen some user confusion between Foo<@Nullable T> foo; and class Foo<@Nullable T> -- and, for that matter, class Foo<T extends @Nullable Object> and class Foo<T>, too. Especially given existing tools' different treatment of some of these constructs, we may consider viewing this confusion as a reason to avoid assigning semantics to class Foo<@Nullable T>, at least initially.]

cpovirk commented 4 years ago

(I note that Eclipse has no problem with JDK-8233945. For https://github.com/typetools/checker-framework/issues/2880, it behaves like javac, for better or for worse.)

stephan-herrmann commented 4 years ago

From @kevinb9n 's list of Assumptions:

  1. A type parameter does not embody orthogonal type and nullness information, but benefits from the nullness knowledge embedded in the type hierarchy. Bounds are annotated, not type parameters themselves. This is because [why?] [#60]

While generally TYPE_USE annotations contribute, of course, to the type, and should be propagated as part of the type in any analysis, I don't agree that this must be the only way of specifying type parameters. Instead I hold that annotating type bounds is not a sufficient means for expressing constraints on type parameters.

Just stating this here, so we don't take a consensus in this issue as given.

kevinb9n commented 4 years ago

Note: the major flaw I understand with the model I've been subscribed to (NN<U<N, #33 and type-params-have-only-a-type-not-extra-orthogonal-info, #60) is the weirdness of <T extends @Unspec>, which I discuss a possible way to handle in the Nov 27 comment on Issue #76. I think that approach is still in keeping with these models in spirit.

kevinb9n commented 2 years ago

If we prohibit our annotations before a type parameter or wildcard, then are later convinced it's worth supporting, I think our adding it wouldn't directly break any existing code.

I think that's the plan right now. I'm going to close this for now -- and I think if someone wants to reopen it, it might actually be #85 they want to reopen instead?

mernst commented 2 years ago

Prohibiting annotations before type parameters or wildcards will be very burdensome to users, who will have to write different @Nullable or @NonNull annotations at different locations in their program -- some of which will have to be fully qualified, since two annotations of the same name cannot both be imported.

I don't see the benefit of prohibiting annotations on these locations, especially given the user burden.

kevinb9n commented 2 years ago

Thanks for reopening. Let's separate two questions:

  1. Should this project define semantics for @Nullable used in these positions? This is probably captured as #85, more or less.
  2. If we don't, what happens when users put them there anyway?

This issue seems like a fine place to discuss question 2.

My last comment was confused. I'm sorry about that. Some things are coming back to me more slowly than others.

Suppose we simply state some phrasing of, "This annotation has no defined meaning if it is placed before a type parameter or wildcard". Then I think tools that want to ascribe such meaning at least have a couple options open (possibly not great ones):

  1. They could provide their own @Nullable, marked with @Implies(org.jspecify.nullness.Nullable), and specifying additional semantics. They are only strengthening our contract, not contradicting it (as they might be if we were "prohibiting", again, sorry about that).
  2. They could provide a class/package/module annotation saying some form of, "within the annotated scope, types annotated with org.jspecify.nullness.Nullable have the following additional properties unless marked otherwise..." I think this is legitimate as well.
mernst commented 2 years ago

For the first option, I think you mean that users would use a different annotation throughout their source code, and never use org.jspecify annotations.

Then, neither option forces users to have two different @Nullable annotations in their source code. That is, both options address my concern.

For the first option, does that require JSpecify to define an @Implies annotation?

For the second option, I don't think an extra annotation is even necessary. A tool could choose to behave that way always. But, a tool could define such an annotation if it saw value in doing so.

kevinb9n commented 2 years ago

Oh my, I think maybe we understand each other perfectly here! What is this feeling? :-)

Option 1: yes, I became personally convinced that we are not viable without it. I'll check back on that issue to see if there's been movement.

I did leave out your Option 3, only because I would expect us to discourage it to some degree (to what degree being a more fine-grained debate than we'd benefit from having today). But there may be a smooth continuum between Options 2 and 3 anyway: suppose a tool has a config option to treat code as if it has this annotation on every class. And maybe that option becomes the default. Etc.

kevinb9n commented 2 years ago

I think the status remains as:

netdpb commented 1 month ago

Thinking about the original request for a way to specify a collection of a nullable subtype of Foo. Would a generic method like this work, without needing to apply type-use annotations to the type parameter itself?

<T extends Foo> void accept(Collection<@Nullable T> collection);
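To make that shape concrete, here is a hedged, runnable sketch (Foo, SubFoo, and the @Nullable annotation are stand-ins): the type parameter itself stays unannotated, only its use inside Collection<...> is @Nullable, and the method accepts collections instantiated with Foo or any subtype.

```java
import java.lang.annotation.*;
import java.util.*;

// Stand-in for a type-use @Nullable (hypothetical).
@Target(ElementType.TYPE_USE)
@Retention(RetentionPolicy.CLASS)
@interface Nullable {}

class Foo {}
class SubFoo extends Foo {}

class Acceptor {
    // The type parameter declaration <T extends Foo> carries no annotation;
    // only the use of T inside the collection type is @Nullable, which is
    // what would let a checker permit adding null here.
    static <T extends Foo> void accept(Collection<@Nullable T> collection) {
        collection.add(null);
    }
}
```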
cpovirk commented 1 month ago

Probably. I think we'd want to tie this back to a specific use case to see if that satisfies its needs. As far as I know, the use cases that we've seen have been satisfied by various formulations, whether <T extends @Nullable Foo> or Collection<? super @Nullable T> or something else. If we do have other cases, we should probably discuss them in issues like #85 (referenced by kevinb9n above) or #145.

netdpb commented 1 month ago

I see that in https://github.com/jspecify/jspecify/issues/85#issuecomment-562255814 you already proposed this idea. Thanks.

netdpb commented 1 month ago

Current decision: Java generally doesn't support lower bounds on type parameters or specifying both lower and upper bounds on wildcards, and there isn't a strong enough case to support them. There are other ways to support the use cases for which this is suggested.

Argument for changing: We need a way for methods to accept parameters like collections of Foo into which they can add null. Allowing Collection<@Nullable ? extends Foo> would support that use case, which is otherwise unsupported.

Timing: This must be decided before version 1.0 of the jar.

Proposal for 1.0: Finalize the current decision. If you agree, please add a thumbs-up emoji (šŸ‘) to this comment. If you disagree, please add a thumbs-down emoji (šŸ‘Ž) to this comment and briefly explain your disagreement. Please only add a thumbs-down if you feel you can make a strong case why this decision will be materially worse for users or tool providers than an alternative.

wmdietl commented 3 weeks ago

Does "unrecognized" here mean the same as "meaningless" there? Possibly allowing the usage on TYPE_PARAMETERs in the future would be preferable to me.

kevinb9n commented 3 weeks ago

As I just proposed here, I believe we should clarify that only type usages can be nullness-applicable, and a type parameter declaration is not a type usage. And I believe this should be set in stone now. Whatever use case we might have in the future would be much more clearly served by a new, better-named annotation.

Terminology: I think (?) we settled on "unrecognized" instead of "meaningless". The idea is that a nullness annotation anywhere that isn't "nullness-applicable" is unrecognized (there are other cases of being unrecognized, too).

netdpb commented 1 week ago

@kevinb9n Are you still šŸ‘€ on this issue? The working decision is that they are inapplicable/unrecognized, whatever the wording.