types2: Generics allow creating values of unexported or internal types

Merovius commented 1 year ago

I'm not sure if this is a bug or should be a proposal, or if it is just a curiosity. But I noticed a thing and I think it's weird.

Say package a wants to make sure that of some type, only well-controlled values are ever used (in my use case, its an interface which I want to share between packages but be able to change without breaking compatibility). It uses an internal package to do so:

-- a/a.go --
package a

import "a/internal"

func MakeX() internal.X {
    return internal.Make(42)
}

func F(x internal.X) {
    if x.V() != 42 {
        panic(fmt.Errorf("invalid %#v", x))
    }
}
-- a/internal/internal.go --
package internal

type X struct { v int }

func Make(v int) X { return X{v} }

func (x X) V() int { return x.v }

-- b/b.go --
package main

import (
    "a"
    // illegal: use of internal package a/internal not allowed
    // "a/internal"
)

func main() {
    // doesn't work
    // a.F(internal.X{})
    a.F(a.MakeX())
}

As a/internal cannot be imported, there is no way to get an internal.X that isn't sanctioned by a. reflect can be used to construct an any with dynamic type internal.X, but that can't be type-asserted, so it can't be passed to a.F.

However, using generics, we can do this trick:

package main

import "x/a"

func main() {
    a.F(trick(a.MakeX))
}

func trick[T any](f func() T) T {
    return *new(T)
}

The same also applies if the type is not from an internal package, but an unexported type. I suspect a similar issue would apply to #21498 (there is currently no way for a different package to write a func(unexported), but #21498 would allow it by spelling it (x) => { … }).

I'm not sure how important it is to prevent this. But I don't think this is how it should work. I think if an inferred type argument is from an internal package or is a defined type with unexported name from a different package, it should fail. But doing that would technically be a breaking change.

Just thought I'd bring this up, at least.

mknyszek commented 1 year ago

CC @golang/compiler

randall77 commented 1 year ago

You can also do this workaround using reflect:

    f := reflect.ValueOf(a.F)
    argType := f.Type().In(0)
    arg := reflect.New(argType).Elem()
    f.Call([]reflect.Value{arg})

Merovius commented 1 year ago

True, the call still works using reflect. Hm.

griesemer commented 1 year ago

The handling of internal packages is "external" to the language, it's really the build system that prevents invalid internal imports. So I'd leave that out from the discussion.

But as you say, it doesn't require an internal package. Here's a playground link for the "trick" and reflection-based scenario simply using an unexported type.

It's always been possible to create invalid values of unexported types, without using generics, and without reflection, at least for basic types. Here's an (admittedly pathological) example which creates a zero value for an unexported numeric type because operations such as - are still available.

The upcoming clear built-in (#56351) will zero the elements of a slice, even if the slice elements are of unexported type and there's no other mechanism to create that zero value.

I wonder if the zero value for a type is the only value that can be created this way (ignoring numeric types, where one probably can create a 1, and then all the values, using suitable operations).

~And I wonder whether clear is making the hole bigger.~

~(I think we could arguably fix the type inference rules by treating this as a bug: one should not be able to infer an unimportable type. I don't think there's a substantial amount of code dependent on this.)~

I think we should probably fix the type inference issue: it should not be possible to infer a type that is not otherwise accessible in a package. We have carved out a provision in our backward-compatibility guarantees for errors in the generics implementation.

zephyrtronium commented 1 year ago

And I wonder whether clear is making the hole bigger.

clear only applies when there is a function that returns specifically a map, slice, or array of some unexported element type, otherwise you can't mention b.unexported to construct []b.unexported or map[K]b.unexported. Obtaining the zero value from a map is trivial: delete a key and then access that key (which works even if the key type is also unexported by using a range loop). Non-empty slices and arrays can be used to obtain the zero value by repeatedly appending the first element to the slice (starting with s := array[:] for the latter) until cap > len. The last case to consider is a map with an unexported key type, but there is no (non-reflective, non-generic) way to get the zero value from that.

If clear could zero any arbitrary pointer type, as was at one point proposed, then it would introduce a new way to obtain zero values of unexported types without reflection or generics. Since that capability has been retracted, I don't see any way that clear makes it possible to access zero values that could not be accessed before.

griesemer commented 1 year ago

Good point about maps: one simply needs to access an element with a key that doesn't exist, using comma-ok form, and one gets a zero value. ~Same for type assertions, etc.~ Assuming that the map element type is unexported.

I don't follow your slices example: if the slice is not empty, all we can append are the element (or other) values that already exist. What am I missing?

There's also the case of a channel with elements of unexported type. Once the channel is closed, one receives zero values.

But in all cases a package defining the unexported type can chose to not export a map, channel, slice, or array with that unexported type as element type.

zephyrtronium commented 1 year ago

I don't follow your slices example: if the slice is not empty, all we can append are the element (or other) values that already exist. What am I missing?

We can append the values that already exist until append grows the slice, at which point we have zero values:

for len(s) == cap(s) {
    s = append(s[:cap(s)], s[0])
}
fmt.Printf("%#v\n", s[:cap(s)][cap(s)-1])

https://go.dev/play/p/QrXkeXqvMfn

But in all cases a package defining the unexported type can chose to not export a map, channel, slice, or array with that unexported type as element type.

I agree. But then clear doesn't widen the hole. That's what I meant to show – perhaps more as a defense of #56351 than anything relating to this issue – prior to your edit. 🙂

griesemer commented 1 year ago

Tricky! It seems that the spec doesn't even say that the values past the length of a slice are zero values in case of growth. Hm. (Filed #56684.)

But yes - it looks like clear is in the clear with respect to adding unexpected capabilities. It does not.

neild commented 1 year ago

As a concrete impact, this does make it quite a bit easier to evade the constant string checks in github.com/google/safehtml.

type stringConstant string

// The parameter to ScriptFromConstant must be an untyped string constant.
func ScriptFromConstant(script stringConstant) Script

func callWithInconstantString[S ~string, R any](f func(S) R, s string) R {
  return f(S(s))
}

s := "this is not a constant string"
script := callWithInconstantString(ScriptFromConstant, s)

There are other, less convenient, ways to elude these checks, so this isn't a tremendous problem; the goal of the package is to avoid accidental error, not defend against malice on the part of the programmer. Still, it might support a change to avoid inferring types that the instantiating package can't name.

DmitriyMV commented 1 year ago

Will this code https://go.dev/play/p/ASrjalg4xdr continue to work? I mean such usage of type aliases was not exactly correct, but it worked even before generics.

Merovius commented 1 year ago

@DmitriyMV I think under a realistic interpretation of what we are talking about, no, that would be forbidden. What do you mean by "it worked even before generics"? The part that is in question (calling Count) couldn't have been expressed without generics.

Merovius commented 1 year ago

I guess you mean this? That's a good point and it may speak against this change.

DmitriyMV commented 1 year ago

Sorry, I wasn't clear enough - what I meant is that you could have private types as part of the public one and currently you can work with them, provided that you can find a way to create their instances. You link is actually what I wanted to express, with a bit of modification.

DmitriyMV commented 1 year ago

There is also a question how with this proposal generic code will handle this where you use exported alias for the private types.

Merovius commented 1 year ago

I think ideally, the same restriction would have applied since Go 1 to any inference mechanism. It didn't and it seems too late to change that. So, I guess either we 1. stay consistent and allow to infer with every inference mechanism, 2. stay consistent which might break more than we are willing to, or 3. are not consistent and allow inference with some mechanisms and forbid it with others (where "inference" means "of unexported/internal types").

FWIW for my specific use case I've now added another layer of abstraction:

-- a/internal/internal.go --
func Unwrap[W ~struct{ e E }, E any](w W) E {
    return struct{ e E }(w).e
}

type Storage struct {
    e StorageInterface
}

func WrapStorage(e StorageInterface) Storage {
    return Storage{e}
}

-- a/storage/sqlite.go --
func New(/*…*/) internal.Storage {
    return internal.WrapStorage(&storage{/*…*/})
}

-- a/a.go --
type Server struct {
    // Storage backend to use. Defaults to something kinda sensible.
    Storage internal.Storage
}

func (s *Server) DoThing() {
    storage := internal.Unwrap(s.Storage)
    if storage == nil {
        storage = defaultStorage()
    }
}

I think this should guarantee that only approved implementations can exist, foreign packages can't use any methods on it and that even if someone is trying to circumvent the mechanism, they can at best get a zero value, which means to use default. So, with this I feel comfortable using an interface that can be changed without breaking compat.

Merovius commented 1 year ago

There is also a question how with this proposal generic code will handle this where you use exported alias for the private types.

Hm, I was almost expecting something like this to come up. TBQH, I'm feeling more and more that this is just a lost cause. I wasn't totally sure it's a problem we should solve from the beginning.

gopherbot commented 1 year ago

Change https://go.dev/cl/451220 mentions this issue: go/types, types2: do not infer external unexported types

griesemer commented 1 year ago

@DmitriyMV At the moment, your example would infer the unexported type myInt and fail. But you would be able to provide the type arguments explicitly and make this work. As an aside, it's a (different!) bug that currently the type checker loses information about alias types. The exported type for SetOfInts should really preserve the exported key type and then this would work. This is an alias implementation problem and we shouldn't confuse it with this specific issue.

griesemer commented 1 year ago

I believe this a bug - it was never the intent that type inference would provide this new capability to the language. From the original Type Parameters Proposal, which outlines the generic design:

Note: type inference is a convenience feature. Although we think it is an important feature, it does not add any functionality to the design, only convenience in using it.

In other words, any code that we can write using type inference, we should be able to write without, by explicitly providing all type arguments.

mdempsky commented 1 year ago

In general, exportedness of an identifier is only relevant to type-checking to restrict what identifiers a user can directly type in Go source.

For example, today we allow:

Inferring a nested composite literal type as an unexported type: https://go.dev/play/p/62lo8K1qc7D
Promoting unexported methods across package boundaries: https://go.dev/play/p/S7zKhXLSQZ9
Promoting methods through embedded, unexported fields: https://go.dev/play/p/4F_0z52opcT
Implicit conversion of untyped constants to unexported types: https://go.dev/play/p/LFaD-__dVNY
Implicit conversions of anonymously typed values to unexported types: https://go.dev/play/p/UayBdjH8GeW
Converting T to U even when T/U are struct types with unexported fields: https://go.dev/play/p/0QCHmHPrwW5

In contrast, the only cases where we care about whether an identifier is exported today is in selector expressions (including qualified identifiers) and struct literals (where initializing an unexported field is disallowed, even in key-less literal notation).

Also, as @randall77 pointed out earlier, the main motivating example for this (google/safehtml) can be trivially circumvented with reflection too.

So I'm at least not convinced that restricting inferred type arguments is necessary. I think it's more consistent with the rest of the language that we continue to allow inferring unexported types, even when users can't spell them directly as type arguments.

Finally, CL 451220 doesn't address the original issue report here: that an exported type from an internal package can be inferred. So there are still type arguments that can be inferred, but cannot be directly written in source.

griesemer commented 1 year ago

exportedness of an identifier is only relevant to type-checking to restrict what identifiers a user can directly type in Go source

Agreed. Combine this with the explicitly stated intent that type inference is a convenience feature, and that means that we should not rely on type inference to find such unexported types. If we can't write the code w/o type inference, we shouldn't be able to call type inference to the rescue.

Yes, reflection may be used in some cases, but I don't think that's a sufficiently strong argument. Reflection can be used to circumvent various language protections (and so is the package unsafe).

Internal packages are a language-external concept. There's simply not much we can do about them here. The language doesn't talk about the build system.

DmitriyMV commented 1 year ago

FWIW the more I play with CL, the more I don't think this change is worth it. We already have numerous ways to circumvent visibility rules, starting from var b = funcThatReturnsPrivateThings() or range construct. @mdempsky noted many others, some of them I had no idea that they existed! It looks strange to me, that we try to forbid "private type inference" specifically in generics, but not in another constructs.

Also, as I noted in CL, this change will break legitimate code with aliases. We export types using aliases in tests, and that would mean that we would have to rewrite them.

findleyr commented 1 year ago

Sorry, have been discussing this on the CL while somehow missing the discussion here.

This is an alias implementation problem and we shouldn't confuse it with this specific issue.

I actually think that this is a significant problem that should at least block this change until the compiler has better support for aliases. With this change, the refactoring type X = internalX is all of a sudden a breaking change. All type aliases are potentially breaking, even in libraries that don't themselves use generics. If I am a user calling libraryX with a type from libraryY, my program may break as a result of this change, and I have no way to fix the breakage.

griesemer commented 1 year ago

Fair enough. I agree that the alias issue is a problem - even though unrelated - but it compounds this issue.

Still, if we had raised an error starting with Go 1.18 when unexported types were inferred, such code would not have been written. And we can renege on the backward compatibility promise for design errors with generics.

In contrast to the other features @mdempsky is reporting, type inference explicitly was considered a convenience feature.

mdempsky commented 1 year ago

If we can't write the code w/o type inference, we shouldn't be able to call type inference to the rescue.

But if we allow type inference to continue inferring types from internal packages, then this issue remains present. As you mention, internal packages are technically a build system detail, not a language-level detail; but the majority of Go programmers only know one build system.

As I see it, there are 3 options here:

Do nothing. This is a quirky language feature, but IMO on par with other exportedness quirks like I previously mentioned, and no more powerful than reflection as @randall77 mentioned.
Strictly apply the "type inference is only a convenience" principle by also extending go/types to allow build systems to supply package visibility metadata so that it can reject inferences of both p.internalX and p/internal.X. (E.g., a AllowedToImport func(*Package) bool option on go/types.Config.)
Split the difference: reject p.internalX but allow p/internal.X, and explain to users why these work differently. Also explain why aliases may cause problems.

Personally, I think (1) is the most pragmatic decision. It's the least effort to implement, the easiest to explain, and it doesn't risk breaking any working code.

griesemer commented 7 months ago

We should make a final decision here - most likely that we can't change things. But not urgent for 1.22. Moving along.

golang / go

types2: Generics allow creating values of unexported or internal types #56669