make `usingnamespace` not put identifiers in scope

andrewrk commented 3 years ago

From status quo the proposed change is very small:

Keyword is renamed from usingnamespace to includenamespace.
It only mixes in to the set of declarations for a given namespace; it does not affect identifiers.

Example:

pub const Foo = struct {
    includenamespace Bar;

    test {
        _ = hello; // error: use of undeclared identifier
        _ = Foo.hello; // OK!
    }
};

pub const Bar = struct {
    pub const hello = "hello";
}

Why? What material benefit does this have? Turns out, it's actually a big deal for the design of an incremental compiler:

AstGen is the place where we report shadowing of identifiers and unused identifiers. It would make sense to also report error for use of undeclared identifier. However the existence of usingnamespace makes that not an option. Having this error in AstGen means that the compile error is available non-lazily. That is, you will see errors of this form even for code that does not get semantically analyzed. For example, on Windows you would still see "use of undeclared identifier" errors when building for macOS.
With status quo semantics, an identifier in scope of more than one usingnamespace forces all of them to be resolved, in order to make sure there are no name conflicts. However, using a.b syntax means only the includenamespace declarations that apply to a must be resolved. With incremental compilation, having an identifier force an unrelated usingnamespace to be resolved creates a dependency between these two things. This dependency costs perf, memory, and decreases the amount of encapsulation of changes - that is, it makes an incremental update less incremental. The includenamespace semantics reduce the amount of such dependencies.

ifreund commented 3 years ago

This proposed change also has the nice effect of removing an antipattern from the language:

// The idiomatic way to import the standard library
const std = @import("std");
std.foo();

// A tempting pattern to avoid typing "std" everywhere
// This would no longer be possible with this proposal!
usingnamespace @import("std");
foo();

michal-z commented 3 years ago

Currently I'm using this code:

usingnamespace @import("dxgi.zig");
usingnamespace @import("dxgi1_2.zig");
usingnamespace @import("dxgi1_3.zig");

To bring some constants into the scope (all constants are prefixed with DXGI_, for example DXGI_USAGE_SHARED).

After this change I will have to do something like this:

const dxgi = @import("dxgi.zig");
const dxgi1_2 = @import("dxgi1_2.zig");
const dxgi1_3 = @import("dxgi1_3.zig");

_ = dxgi.DXGI_CONSTANT_1;
_ = dxgi1_2.DXGI_CONSTANT_2012;
_ = dxgi1_3.DXGI_CONSTANT_323232;

Am I correct?

andrewrk commented 3 years ago

Yes you are correct. However if you control dxgi.zig and the other files then you could improve it by using real namespaces like this:

const dxgi = @import("dxgi.zig");
const dxgi1_2 = @import("dxgi1_2.zig");
const dxgi1_3 = @import("dxgi1_3.zig");

_ = dxgi.CONSTANT_1;
_ = dxgi1_2.CONSTANT_2012;
_ = dxgi1_3.CONSTANT_323232;

ifreund commented 3 years ago

Note also that when interfacing with C code (which usually uses name prefixes as C lacks language-level namespacing), it is common to have a c.zig file per project containing a @cImport() and related bindings. This proposed change would not negatively affect this pattern:

// in c.zig

pub includenamespace @cImport({
    @cDefine("_POSIX_C_SOURCE", "200809L");

    @cInclude("stdlib.h");
    @cInclude("unistd.h");
});

pub const DXGI_CONSTANT_1 = 42;
pub const DXGI_CONSTANT_2 = 31415;

// in main.zig

const c = @import("c.zig");

c.libfoo_do_bar(c.DXGI_CONSTANT_1);

Using the single character prefix "c" keeps it clear where these declarations come from without being overly verbose.

michal-z commented 3 years ago

I will try to merge dxgi.zig, dxgi1_2.zig and dxgi1_3.zig into one dxgi.zig file. Then I will have:

const dxgi = @import("dxgi.zig");

_ = dxgi.CONSTANT_1;
_ = dxgi.CONSTANT_2012;
_ = dxgi.CONSTANT_323232;

Will see how it works.

michal-z commented 3 years ago

One issue I see with this is:

In C, I can have one 'namespace' spread across several files. For example, DXGI_ 'namespace' is spread across files: dxgi.h dxgi1_2.h dxgi1_3.h dxgi1_3.h dxgi14.h dxgicommon.h dxgitypes.h All identifiers are prefixed with DXGI in above files.

This proposal forces Zig to have one namespace per file. I won't be able to have single huge namespace spread across several files.

ifreund commented 3 years ago

This proposal forces Zig to have one namespace per file. I won't be able to have single huge namespace spread across several files.

I would argue that if you want to break your namespace up into smaller logical chunks and split those between files, that logical division of the namespace should be expressed through sub-namespaces.

However, what you want to do is still very much possible with the proposed changes:

// in dxgi.zig

pub includenamespace @import("dxgi1_2.zig");
pub includenamespace @import("dxgi1_3.zig");
pub includenamespace @import("dxgi1_4.zig");
pub includenamespace @import("common.zig");

I would still recommend avoiding a single massive namespace though. Instead consider providing a bit more structure to your api through nested namespaces, similar to how the standard library has std.fs, std.os, std.net, etc instead of sticking everything directly in the std namespace.

michal-z commented 3 years ago

// A tempting pattern to avoid typing "std" everywhere
// This would no longer be possible with this proposal!
usingnamespace @import("std");

@ifreund Not sure I understand this. This won't work?

includenamespace @import("std");

But below will?

// in dxgi.zig

pub includenamespace @import("dxgi1_2.zig");
pub includenamespace @import("dxgi1_3.zig");
pub includenamespace @import("dxgi1_4.zig");
pub includenamespace @import("common.zig");

ifreund commented 3 years ago

@michal-z includenamespace @import("std"); will "work" in that it will expose all public declarations of the std from the type doing the includenamespace. However it will not make those declarations available in the current scope as the old usingnamespace does.

// old
const Foo = struct {
    usingnamespace @import("std");
    pub fn foo() void {
        function_in_std();
    }
};

// new
const Foo = struct {
    includenamespace @import("std");
    pub fn foo() void {
        // Would be a compile error without the Foo.
        Foo.function_in_std();
    }
};

michal-z commented 3 years ago

@ifreund Thanks. And how is function_in_std() visible outside of the Foo?

As: _ = Foo.function_in_std(); or as: _ = Foo.Foo.function_in_std();

I must say that this is a bit confusing for me.

andrewrk commented 3 years ago

Keep in mind this proposal effectively only changes one thing: not putting identifiers in scope. Everything else works the same.

I'm not sure how this question would come up, unless you also have the same question about status quo usingnamespace.

ifreund commented 3 years ago

@ifreund Thanks. And how is function_in_std() visible outside of the Foo?

It would be visible as Foo.function_in_std() within the file and not visible outside the file.

If pub includenamespace was used instead, it would be visible as Foo.function_in_std() from other files as well.

As andrew points out, these semantics are exactly the same as those of the current using namespace.

michal-z commented 3 years ago

I see, all is clear now, thanks for explanation!

michal-z commented 3 years ago

@ifreund @andrewrk I know where my confusion came from, in status quo:

const Foo = struct {
    usingnamespace @import("std");
    pub fn foo() void {
        debug.print("aaa", .{}); // works
        Foo.debug.print("aaa", .{}); // also works
    }
};

With this proposal:

const Foo = struct {
    includenamespace @import("std");
    pub fn foo() void {
        debug.print("aaa", .{}); // does not work
        Foo.debug.print("aaa", .{}); // works
    }
};

The fact that both forms work in status quo was a bit confusing for me.

ifreund commented 3 years ago

While I think the proposed semantics are solid, I think the new keyword name could use some more consideration.

The currently proposed includenamespace seems a little ugly to me as not everything in the target namespace is included, only the public declarations. Therefore I propose using includedecls or includedeclarations instead. There is precedent for abbreviating "declaration" as "decl" in the @hasDecl() builtin. We could also shorten this to just include, but I like the increased clarity of includedecls.

I think using mixin as the new keyword is also worth serious consideration. Mixins are already a well known concept in programming (wikipedia) and I believe that the proposed includenamespace semantics map quite well to the generally understood concept of a mixin.

michal-z commented 3 years ago

@ifreund I really like includedecls. mixin sounds strange to me - I think it is not widely used in programming and will be confusing for many people.

marler8997 commented 3 years ago

Would this still work with methods?

const Foo = struct {
    pub includenamespace struct {
        pub fn bar(self: *Foo) { ... }
    };
};

var foo = Foo { };
foo.bar();

Also would self.bar() work inside the definition of Foo? I could use a bit more clarity on the distinction that's being made between "namespace" and "scope/identifiers".

marler8997 commented 3 years ago

AstGen is the place where we report shadowing of identifiers and unused identifiers. It would make sense to also report error for use of undeclared identifier. However the existence of usingnamespace makes that not an option. Having this error in AstGen means that the compile error is available non-lazily. That is, you will see errors of this form even for code that does not get semantically analyzed. For example, on Windows you would still see "use of undeclared identifier" errors when building for macOS.

Yeah this is actually a pretty big deal. I recall working out proposals to enhance the D language in a way that would allow projects to use more explicit or namespaced imports that would allow those projects to get the same benefit. This enables the compiler to rule out having to analyze other modules, which was killing compilation performance in D.

Here's that D proposal where I walk through what the problem is for those interested: https://github.com/marler8997/dlangfeatures#lazy-imports

Here's a preview:

In order to delay loading a module, you need to be able to determine which symbols belong to it without loading it. Unfortunately, this doesn't work with normal imports because they "mix" their symbols into the current scope

ifreund commented 3 years ago

Would this still work with methods?

@marler8997 yes, both of those examples in your snippet would continue to work, just as they currently do with usingnamespace. The only thing that would no longer work would be using bar() directly:

const Foo = struct {
    pub includenamespace struct {
        pub fn bar(self: *Foo) void { ... }
    };

    fn init(self: *Foo) void {
        self.bar(); // still works
        bar(self); // would no longer work
    }
};

var foo = Foo { };
foo.bar(); // still works

andrewrk commented 3 years ago

@ifreund I'm happy with the name mixin - but this comment got me thinking:

The currently proposed includenamespace seems a little ugly to me as not everything in the target namespace is included, only the public declarations.

Over in self-hosted I started implementing this feature to see what it would look like and here is what I ended up doing on the first pass (slightly different semantics):

when looking up a decl by name (a.b), all usingnamespace decls are chased, recursively. Non-pub usingnamespace decls are skipped if the a.b syntax is in a different file than the usingnamespace decl. Same when searching for matching decl names - non-pub decls are skipped if the a.b syntax is in a different file than the candidate decl.
emit a compile error if b was found more than one time (ambiguous reference)

This is different than how stage1 does it, where it actually copies the names of decls from one table into another table, skipping the non-public declarations. With these different semantics in stage2, the compiler would only report an error if a name conflict was actually triggered via a.b and b was ambiguous, similar to the semantics of #678. Non-public decls would not count towards an ambiguous reference.

I do think how stage1 does it makes pub inconsistent, because sometimes it means "allowed to be visible outside the file" and other times it means "gets republished by usingnamespace". With this way of doing things, pub has a very clear definition: visible outside the current file.

I think I'm leaning towards doing it the way I outlined here, the way the code kind of came out naturally in stage2. It requires storing fewer things in memory, and feels simpler in terms of dealing with incremental updates. However I'm wide open to feedback on this one.

ifreund commented 3 years ago

I do think how stage1 does it makes pub inconsistent, because sometimes it means "allowed to be visible outside the file" and other times it means "gets republished by usingnamespace".

I had the exact same thought about that inconsistency while writing the many example snippets above. I think your proposed stage2 implementation/semantics would make the language more consistent, so +1 from me!

I'm happy with the name mixin

Cool, It's been growing on me over includedecls as well.

andrewrk commented 3 years ago

I suggest let's nail down the semantics, and make the keyword rename a separate proposal after that's done

andrewrk commented 3 years ago

I've been working on an implementation and the new AstGen error finds a lot of broken code in std:

/home/andy/dev/zig/lib/std/zig.zig:160:101: error: use of undeclared identifier 'suffix'
                return std.fmt.allocPrint(allocator, "{s}{s}{s}", .{ target.libPrefix(), root_name, suffix });
                                                                                                    ^
/home/andy/dev/zig/lib/std/hash_map.zig:612:29: error: use of undeclared identifier 'self'
            var other = try self.unmanaged.cloneContext(new_allocator, new_ctx);
                            ^
/home/andy/dev/zig/lib/std/json.zig:1409:17: error: use of undeclared identifier 'info'
            if (info.tag_type) |UnionTag| {
                ^
/home/andy/dev/zig/lib/std/fmt.zig:905:99: error: use of undeclared identifier 'value'
        @compileError("Unsupported format string '" ++ fmt ++ "' for type '" ++ @typeName(@TypeOf(value)) ++ "'");
                                                                                                  ^
/home/andy/dev/zig/lib/std/math/big/int.zig:678:37: error: use of undeclared identifier 'allocator'
        return gcdLehmer(rma, x, y, allocator);
                                    ^
/home/andy/dev/zig/lib/std/event/rwlock.zig:228:25: error: use of undeclared identifier 'Allocator'
fn testLock(allocator: *Allocator, lock: *RwLock) callconv(.Async) void {
                        ^
/home/andy/dev/zig/lib/std/crypto/25519/curve25519.zig:43:29: error: use of undeclared identifier 'Edwards25519'
    pub fn clearCofactor(p: Edwards25519) Edwards25519 {
                            ^

Seems pretty useful

michal-z commented 3 years ago

Just for information. I have restructured my code. I'm now using usingnamespace keyword only for 'mixing-in' methods from parent COM interfaces. I use proper namespaces instead of prefixes (dxgi.ISwapChain, d3d12.IDevice9, d3d11.IResource, d2d1.COLOR_F, etc.). Everything seems nice and clean and my code is prepared for this language change.

// win32.zig
pub const base = @import("windows.zig");
pub const dwrite = @import("dwrite.zig");
pub const dxgi = @import("dxgi.zig");
pub const d3d11 = @import("d3d11.zig");
pub const d3d12 = @import("d3d12.zig");
pub const d3d12d = @import("d3d12sdklayers.zig");
pub const d3d = @import("d3dcommon.zig");
pub const d2d1 = @import("d2d1.zig");
pub const d3d11on12 = @import("d3d11on12.zig");
pub const wic = @import("wincodec.zig");
pub const wasapi = @import("wasapi.zig");

One more place where I use usingnamespace is to add some basic windows stuff that Zig does not provide (like IUnknown COM interface):

// windows.zig
pub usingnamespace @import("std").os.windows;
pub usingnamespace @import("misc.zig");

Sample application uses it like this:

const win32 = @import("win32");
const w = win32.base;
const d2d1 = win32.d2d1;
const d3d12 = win32.d3d12;
const dwrite = win32.dwrite;

//...
demo.brush.SetColor(&d2d1.COLOR_F{ .r = 0.0, .g = 0.0, .b = 0.0, .a = 1.0 });
//...

//...
const cmdqueue = blk: {
    var cmdqueue: *d3d12.ICommandQueue = undefined;
    hrPanicOnFail(device.CreateCommandQueue(&.{
        .Type = .DIRECT,
        .Priority = @enumToInt(d3d12.COMMAND_QUEUE_PRIORITY.NORMAL),
        .Flags = d3d12.COMMAND_QUEUE_FLAG_NONE,
        .NodeMask = 0,
    }, &d3d12.IID_ICommandQueue, @ptrCast(*?*c_void, &cmdqueue)));
    break :blk cmdqueue;
};
//...

andrewrk commented 3 years ago

After exploring what it would look like to implement usingnamespace with these semantics in self-hosted, as well as exploring what std lib changes would need to be made to adjust to these semantics, I'm confidently marking this as accepted.

@SpexGuy pointed out to me that this proposal also solves a design flaw that I ran into in self-hosted, because it allows AstGen to make note of all the ZIR instructions that are referenced inside any given Decl. This is a solution to a fundamental problem, unblocking progress on the self-hosted compiler.

andrewrk commented 3 years ago

See #9618 for an ongoing effort to implement this.

Mouvedia commented 3 years ago

I'm happy with the name mixin

When you are talking about mixins, it's almost always accompanied with a custom merging strategy—and a default one. I don't know if that would be useful there.

andrewrk commented 3 years ago

Implemented in 594271f8dba0143280990ac2e01dd68a791c05b0. The keyword rename can be a separate proposal.

windows-h8r commented 2 years ago

Are all includenamespace declarations pub? I mean, what would be the point of a non-pub includenamespace?

InKryption commented 2 years ago

@windows-h8r

Are all includenamespace declarations pub? I mean, what would be the point of a non-pub includenamespace?

pub const sub_namespace = struct {
    includenamespace @import("file1.zig");
    pub includenamespace @import("file2.zig");
};

comptime {
    _ = sub_namespace.private_decl;
}

ziglang / zig

make `usingnamespace` not put identifiers in scope #9629