carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
http://docs.carbon-lang.dev/
Other
32.35k stars 1.48k forks source link

Reconsider `import library ...` #1457

Open nigeltao opened 2 years ago

nigeltao commented 2 years ago

https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/README.md#imports says "The import library ... syntax adds all the public top-level names within the given library to the top-level scope of the current file as private names". IIUC this is sort of like Java's import foo.bar.*.

OTOH, https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/name_lookup.md#unqualified-name-lookup says "Unqualified name lookup in Carbon will always find a file-local result".

IIUC, the two statements are contradictory.

Personally, from my Go experience, I prefer the "Unqualified name lookup" principle: users of type Stamp from package time (package in the Go sense) always refer to it by a qualified name: time.Stamp. Similarly, func Decode from package jpeg is referred to as jpeg.Decode, which is clearly distinct from png.Decode even though in the source code they're both defined as func Decode.

Requiring qualified names means that adding names to a library will not cause any downstream importers (that happen to also use that same name, privately) to stop compiling due to a newly introduced ambiguity.

It also means that IWYU (Include What You Use) tools can work purely on the AST without having to follow imports.

(I'll caveat all of the above that I may be misunderstanding Carbon's package/library distinction and the details of its Name Lookup model).

beauxq commented 2 years ago

I like Python's import syntax.

from jpg import Decode as jpgDecode
from png import Decode as pngDecode

Being able to rename things as they're imported is important.

import long_package_name_that_i_dont_want_to_clutter_my_code_with as x;

x.foo();
nigeltao commented 2 years ago

import long_package_name_that_i_dont_want_to_clutter_my_code_with as x;

Go lets you do that to (with a different, Go-like syntax). Handy when you want to import two packages that are both called util.

chandlerc commented 2 years ago

https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/README.md#imports says "The import library ... syntax adds all the public top-level names within the given library to the top-level scope of the current file as private names". IIUC this is sort of like Java's import foo.bar.*.

OTOH, https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/name_lookup.md#unqualified-name-lookup says "Unqualified name lookup in Carbon will always find a file-local result".

IIUC, the two statements are contradictory.

I think the second simply needs to be updated to reflect a change in our plans here.

Specifically, this was discussed pretty heavily and a decision that changed the behavior was made in #1136 -- see the next to last comment for I think a good summary. I think the design overview reflects this outcome, but the unqualified name lookup hasn't been updated.

Personally, from my Go experience, I prefer the "Unqualified name lookup" principle: users of type Stamp from package time (package in the Go sense) always refer to it by a qualified name: time.Stamp. Similarly, func Decode from package jpeg is referred to as jpeg.Decode, which is clearly distinct from png.Decode even though in the source code they're both defined as func Decode.

Requiring qualified names means that adding names to a library will not cause any downstream importers (that happen to also use that same name, privately) to stop compiling due to a newly introduced ambiguity.

It also means that IWYU (Include What You Use) tools can work purely on the AST without having to follow imports.

(I'll caveat all of the above that I may be misunderstanding Carbon's package/library distinction and the details of its Name Lookup model).

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

beauxq commented 2 years ago

Something that I really dislike about C/C++, and what I like that is different in other languages:

#include "a.h"
#include "b.h"

Foo f;  // Where did this `Foo` name come from? I can't tell. This is a problem.

In Python:

import a
import b

f = a.Foo()  # I can see where the name `Foo` comes from. Good.

or

from a import Foo
from b import Bar

f = Foo()  # I can see where the name `Foo` comes from. Good.

For every name I bring into scope with imports, I want to be able to see where that name comes from.

This is recognized as bad practice by the Python community:

from a import *
from b import *

f = Foo()  # Where did this `Foo` name come from? I can't tell. This is a problem.

For every name I bring into scope with imports, I want to be able to see where that name comes from.

beauxq commented 2 years ago

I'm not sure whether I have the Carbon syntax right, but...

import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

Can we do it like this?

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

I want to be able to tell where things came from.

If the former remains in Carbon, I would hope that it immediately gets recognized as bad practice, in the same way that from a import * is recognized as bad practice by the Python community.

chandlerc commented 2 years ago
import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

Can we do it like this?

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

I want to be able to tell where things came from.

I definitely understand the desire to have this behavior, but I think it would be quite challenging to get here.

I think the first challenge here is that things behave differently (perhaps very differently) depending on where you write the code based on how names are imported. The desire for code to retain the same meaning in different contexts is in tension with having very localized names. I think both of these are valuable, but I think we'll have to choose.

The second problem is in the face of templates. While we are supporting templates, we'll want cases where code from b.h needs to find names provided by a.h, and can't rely on the local renaming. For example in C++, this comes up with argument-dependent-lookup. We could hide this in Carbon, causing C++ to use a consistent flat namespace but Carbon not, but I think that would both impact Carbon templates and create some amount of confusion between templates in Carbon and C++.

I'll also point out that this is actually not the topic of this issue (I think), as the point you're making here comes up even without the same-package import library operation. You've pointed it out with two header files in the C++ package, and we can see it with two imports of separate libraries from another package. We could force packages to only have a single imported library to remove this, but this would also force either I think a painful expansion in how many package names folks use, or to significantly more coarse grained dependencies that would hurt large-scale build performance.

nigeltao commented 2 years ago

Foo f; // Where did this Foo name come from? I can't tell. This is a problem.

Yeah, that's exactly the problem I'd like to avoid.

Tangentially (possibly), I'd also like to avoid the "Abseil can't name something ERROR (which would otherwise be the perfectly obvious name) because WinGDI.h says #define ERROR 0" problem.

Nonetheless, in this Carbon snippet:

import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

I'm willing to let import Cpp (as opposed to import Bar generally) be a special exemption, if it'd otherwise make it too hard to migrate or wrap existing C++ code (and its names).

nigeltao commented 2 years ago

We could force packages to only have a single imported library to remove this, but this would also force either I think a painful expansion in how many package names folks use, or to significantly more coarse grained dependencies that would hurt large-scale build performance.

I'm still not sure if I understand the Carbon package vs library distinction correctly, but my rough model (based on Go experience) is that Carbon package ~ Go module (the coarser unit of software distribution and versioning) and Carbon library ~ Go package (the finer unit of import), with the extra note that Carbon's Cpp package is the special-cased bridge to the enormous soup of existing C/C++ code (and C code doesn't even have namespace).

"ImageMagick / LibMagick" could be the coarse thing and "JPEG codec" or "PNG codec" could be the fine thing.

"The fine thing" is still a coarser concept than C/C++ (where the source code file is the unit of import) or Java (where the class is the unit of import, plus * for globbing).

In Go, modules are a build tool or package management concept but entirely invisible to the compiler and language. Go source code always imports individual Go packages and never mentions Go modules.

If Kubernetes (or one aspect, k8s.io/client-go) is the Go module, here's an example of importing multiple Go packages from within that Go module, and this (multiple imports) hasn't been painful in practice.

There's also a demonstration of locally renaming some Go package names in the link, if you're curious, although if starting the syntax from scratch, I'd put the local name on the right instead.

chandlerc commented 2 years ago

I'm still not sure if I understand the Carbon package vs library distinction correctly, but my rough model (based on Go experience) is that Carbon package ~ Go module (the coarser unit of software distribution and versioning) and Carbon library ~ Go package (the finer unit of import), with the extra note that Carbon's Cpp package is the special-cased bridge to the enormous soup of existing C/C++ code (and C code doesn't even have namespace).

I think you roughly have the right model.

"ImageMagick / LibMagick" could be the coarse thing and "JPEG codec" or "PNG codec" could be the fine thing.

The example I've been using is Abseil might be a package in Carbon.

"The fine thing" is still a coarser concept than C/C++ (where the source code file is the unit of import) or Java (where the class is the unit of import, plus * for globbing).

I think we should be relatively close to where C/C++ is here, where it is the "external header file" that is the import unit. This often corresponds to a single file, but also often corresponds to several files, from an implementation file to internal helpers.

In Go, modules are a build tool or package management concept but entirely invisible to the compiler and language. Go source code always imports individual Go packages and never mentions Go modules.

If Kubernetes (or one aspect, k8s.io/client-go) is the Go module, here's an example of importing multiple Go packages from within that Go module, and this (multiple imports) hasn't been painful in practice.

It's likely this could also work, but I think the shared top-level namespace that libraries like Abseil use in C++ also works quite well.

I pointed out above some of the downsides of needing local renaming, and I think not having the coarse level thing be what carries the unique name would put a lot of pressure on longer names or lots of local renaming. It is a tradeoff, but one that I think we've already seen work reasonably well in C++ (where followed closely), and so it seems reasonable to match that pattern in Carbon.

nigeltao commented 2 years ago

I think we should be relatively close to where C/C++ is here, where it is the "external header file" that is the import unit. This often corresponds to a single file, but also often corresponds to several files, from an implementation file to internal helpers.

By "external header file", do you mean:

nigeltao commented 2 years ago

I pointed out above some of the downsides of needing local renaming

There might be some confusion about whether you were replying about renaming, the as a in

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

or replying about dot-dot-dot imports, where, after I import Abseil, I can just refer to a naked Mutex instead of having to qualify it as Absl.Mutex.

Or replying about both. There's also the axis on whether you're replying about all packages or just about the special Cpp package. (Is the Cpp package special or is it valid Carbon to use any Foo in import Foo library "foo.h" when foo.h is obviously C++, not Carbon??)

In case it got lost in the subsequent discussion, my original request is that Carbon programmers must always qualify (it's always Absl.Mutex) unless there's an explicit opt-in (alias Mutex = Absl.Mutex). Renaming packages/libraries (locally, at time-of-import) is related, but separate.

Specifically, templates and ADL (Argument Dependent Lookup) should still work with qualified names (e.g. I can say std::vector<absl::Mutex>), so it wouldn't break them to require qualified names, right?

nigeltao commented 2 years ago

Ah, I though I understood ADL but re-reading up on it...

using std::swap;
swap(obj1, obj2);

is not the same as

std::swap(obj1, obj2)

That's C++. In the Carbon world, if I import Cpp library "algorithm" then I presumably need to get swap into the top-level namespace for the same reason.

But if algorithm was re-written to be pure Carbon (and Carbon generics are not just templates), would import Carborithm also need to put things into the top-level namespace? Again, could we treat import Cpp etc as a special case?

nigeltao commented 2 years ago

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

Ah, I missed that, in all the excitement.

That goes a long way to alleviate my concerns. It's closer to the Go model: "local" names are "local to the Go package", not "local to the file". Note though that there are still three levels of granularity:

and "local" in Go means "the bottom 2 out of 3" but IIUC in Carbon means "all 3".

chandlerc commented 2 years ago

By "external header file", do you mean:

  • a file designed for external consumption: this is the .h file that users (as opposed to the people developing that library) should #include? This could be an 'umbrella' file that #includes other 'internal' files, e.g. Harfbuzz's hb.h.

This.

We even expect to have analogs to an "umbrella" library that re-exports things if people desire it, but I think we'd like the default/typical case to be pretty fine grained for better build scaling.

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

Ah, I missed that, in all the excitement.

Ah, yes. It's a super important point. =D Without this, much chaos.

For any name not local to your package, you're going to have to use its package name as a qualifier.

That goes a long way to alleviate my concerns. It's closer to the Go model: "local" names are "local to the Go package", not "local to the file". Note though that there are still three levels of granularity:

  • Carbon package / Go module
  • Carbon library / Go package
  • Source code file

and "local" in Go means "the bottom 2 out of 3" but IIUC in Carbon means "all 3".

For names that can be used without a qualifier, yes.

But we do plan to have things that are private to a library, and private to a source file for implementation files.

Now that this is refined/clarified a bit, is there still a question around reconsidering for the leads here?

nigeltao commented 2 years ago

I am still a little confused. I'm not confident that I understand the ADL swap example, but I might just need to stare at it for longer.

Specifically, I'm not sure if it matters that, in Carbon, IIUC now, you can't import library "algorithm" to put things into your file's top-level scope unless your file is also in package std.

Possibly easier said than done, but if swap isn't a good Carbon example of where import library ... matters, do you have a better example (ADL or otherwise)?

nigeltao commented 2 years ago

I'll step back from focusing on ADL and 'what breaks if we don't have import library ...'.


I'll start again. Let me know if I've got this wrong. In Carbon, I can say:

import PkgName library "LibName"

A detail I missed earlier is that the names within become qualified by the package name: it's PkgName.T not LibName.T. This is different from Go, where the qualified name (in practice) is prefixed by the equivalent of the Carbon library name, not the Carbon package name. As mentioned earlier, Go modules (roughly equivalent to Carbon packages) are invisible at the Go compiler / language level.

Whether or not C++ is a special case in Carbon, one reason Carbon does this is that C++ libraries' names weren't chosen with qualification in mind, yet in Carbon we'd like to be import multiple C++ libraries that might refer to each others' names (so we can't just automatically prefix these names with the library name):

import Cpp library "freetype2/ft2build.h"
// HB code refers to "FT_Face", not "freetype2.FT_Face".
import Cpp library "harfbuzz/hb.h"

Another point is that, for same-Carbon-package imports, it's not that the package name can be omitted, it's that it must be omitted.

I had originally thought that the import library ... syntax was for when my program uses an Absl.Mutex but I want to be lazy and just type Mutex. Maybe I want to use 30 different types and functions from the library but I don't want to write out 30 alias lines.

Instead, that syntax is for when another Abseil library wants to import Abseil's synchronization library, but there is no thispackage prefix that is for 'the current package' what this is for 'the current object'.


I still feel that the Carbon equivalent of:

#include "a.h"
#include "b.h"

Foo f;  // Where did this Foo name come from? I can't tell. This is a problem.

is a problem. In abstract terms, I'd still prefer that "when Abseil's time library imports Abseil's synchronization library, there's still some sort of prefix such that the time source code refers to prefix.Mutex". But I don't have a concrete proposal. I can see now that it's not that simple.

nikola43 commented 2 years ago

image I cant build image Anyone knows what is the problem?

chandlerc commented 2 years ago

Sorry for delay in explicitly following up here...

Let me know if I've got this wrong.

I don't think so, your summary sounds right to me.

In abstract terms, I'd still prefer that "when Abseil's time library imports Abseil's synchronization library, there's still some sort of prefix such that the time source code refers to prefix.Mutex". But I don't have a concrete proposal. I can see now that it's not that simple.

Yeah, I can see the desire for this. But I think there are real tradeoffs here as well, and the current structure ended up feeling like a good balance of the different factors.

github-actions[bot] commented 2 years ago

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.