ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

language proposal: rework the implicit file-scope struct declaration #19448

Open expikr opened 7 months ago

expikr commented 7 months ago

Motivation

Files are implicitly structs, which lets you do things like:

//! Allocator.zig
ptr: *anyopaque,
vtable: *const VTable,
pub const VTable = ...

//! main.zig
const Allocator = @import("Allocator.zig");

const allo = Allocator {
    .ptr = &myAllocFn,
    .vtable = ...
};
...

which implicitly expands into

//! main.zig
const Allocator = struct {
    ptr: *anyopaque,
    vtable: *const VTable,
    pub const VTable = ...
};

const allo = Allocator {
    .ptr = &myAllocFn,
    .vtable = ...
};
...

This convenience technique is widely used in the stdlib for concrete types.

However, because @import essentially wraps an implicit struct{ ... } around the file content, it is not possible to declare packed structs or enums this way.

For example, imagine a cumbersome file that declares a u8 enum of keycodes. At the moment it has to be written like this:

//! keycodes.zig
pub const Keycode = enum(u8) {
    ...
    Key8 = '8',
    Key9 = '9',
    A = 'A',
    B = 'B',
    ...
    Num1 = 0x61,
    Num2 = 0x62,
    ...
    unmapped = 0xFF,
};
// nothing else in file-scope

Using it then requires const Keycode = @import("keycodes.zig").Keycode;
rather than just const Keycode = @import("keycodes.zig");
even though the file defines nothing else.
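
As a minimal sketch of that status-quo access pattern, the snippet below uses an inline struct named keycodes as a stand-in for @import("keycodes.zig"), so that it fits in one file. The stand-in name and the shortened enum are assumptions made for illustration only; the point is that the enum always sits one level below whatever the import returns.

```zig
const std = @import("std");

// Hypothetical stand-in for `@import("keycodes.zig")`: under status quo a
// file is a struct, so the enum can only be a declaration inside it.
const keycodes = struct {
    pub const Keycode = enum(u8) {
        Key8 = '8',
        A = 'A',
        unmapped = 0xFF,
    };
};

// The extra `.Keycode` hop is what the proposal wants to make unnecessary.
const Keycode = keycodes.Keycode;

pub fn main() void {
    const key: Keycode = .A;
    std.debug.print("{d}\n", .{@intFromEnum(key)});
}
```

With a real file the only difference is that keycodes would come from @import instead of an inline struct.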

Proposal

What I would like to propose is to
(1) make @import scopes implicitly an expression rather than being the body of an implied struct{ ... }, and
(2) make status-quo bare declaration files implicitly wrapped inside a non-instantiable namespace construct.

To illustrate, status quo namespace files will still look exactly the same:

//! std.zig
pub const ArrayHashMap = array_hash_map.ArrayHashMap;
pub const ArrayHashMapUnmanaged = array_hash_map.ArrayHashMapUnmanaged;
...

//! main.zig
const std = @import("std");

pub fn main() void {
    std.debug.print("hello world", .{});
}

which implicitly expands into

//! main.zig
const std = opaque {
    pub const ArrayHashMap = array_hash_map.ArrayHashMap;
    pub const ArrayHashMapUnmanaged = array_hash_map.ArrayHashMapUnmanaged;
    ...
};

pub fn main() void {
    std.debug.print("hello world", .{});
}

But file-scope types would now have to be prefixed with a type definition keyword, i.e.

//! Allocator.zig
struct {
    // helper imports and decls
    ...

    // The type erased pointer to the allocator implementation
    ptr: *anyopaque,
    vtable: *const VTable,
    pub const VTable = ...
}
// no semicolon, since this is an expression
// nothing else can be defined in file-scope

While this costs the code writer one extra level of indentation, in return the code reader knows from the first line to expect field definitions somewhere in the middle of the file, instead of having to rely on filename capitalization, which may or may not be accurate.

It also lets us more cleanly quarantine unwieldy extern struct definitions into individual files:

//! NOTIFYICONDATAA.zig
extern struct {
    // helper imports and decls
    const std = @import("std");
    const win = std.os.windows;

    // the actual struct definition
    cbSize: win.DWORD,
    hWnd: win.HWND,
    uID: win.UINT,
    uFlags: win.UINT,
    uCallbackMessage: win.UINT,
    hIcon: win.HICON,
    szTip: if(isLaterThanWin2k()) [127:0]win.CHAR else [63:0]win.CHAR,
    dwState: win.DWORD,
    dwStateMask: win.DWORD,
    szInfo: [255:0]win.CHAR,
    DUMMYUNIONNAME: extern union {
        uTimeout: win.UINT,
        uVersion: win.UINT,
    },
    szInfoTitle: [63:0]win.CHAR,
    dwInfoFlags: win.DWORD,
    guidItem: win.GUID,
    hBalloonIcon: win.HICON,

    // a bunch of ad-hoc helper functions for parsing/constructing this godforsaken abomination
    ...
}
// no semicolon, since this is an expression
// nothing else can be defined in file-scope

Additionally, since the file is an expression in general, one can achieve comptime conditional declarations by making the file scope a comptime block, which supersedes certain use cases of usingnamespace:

//! MyThing.zig
comptime do: {
    const MyThing_variant_A = struct {
        data1: u32, data2: u32,
        const operate = @import("variant_A").operate;
    };
    const MyThing_variant_B = struct {
        data1: u32, data2: u32, data3: u32,
        const operate = @import("variant_B").operate;
    };
    break :do if(compile_variant_A()) MyThing_variant_A else MyThing_variant_B;
}

Or directly return generic type functions:

//! MyGeneric.zig
(opaque {
    fn MyGeneric(comptime T: type) type {
        return struct {
            data1: T, data2: T, data3: T,
            pub fn ...
        };
    }
}).MyGeneric

//! main.zig
const Generic_f32 = @import("MyGeneric.zig")(f32);
const Generic_f64 = @import("MyGeneric.zig")(f64);
pub fn main() void {
    ...
}
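
For comparison, here is a rough sketch of how the two cases above have to be spelled under status quo, where the conditional type and the generic function must each be a named declaration inside the file's implicit struct. The inline structs my_thing and my_generic stand in for the imported files, and use_variant_a is a hypothetical compile-time switch; none of these names come from the proposal itself.

```zig
const std = @import("std");

// Hypothetical stand-in for `@import("MyThing.zig")` under status quo.
const my_thing = struct {
    const use_variant_a = true; // assumed compile-time switch

    const VariantA = struct { data1: u32, data2: u32 };
    const VariantB = struct { data1: u32, data2: u32, data3: u32 };

    // The selected type needs a name so the importer can reach it.
    pub const MyThing = if (use_variant_a) VariantA else VariantB;
};

// Hypothetical stand-in for `@import("MyGeneric.zig")` under status quo.
const my_generic = struct {
    pub fn MyGeneric(comptime T: type) type {
        return struct { data1: T, data2: T, data3: T };
    }
};

const MyThing = my_thing.MyThing;
const Generic_f32 = my_generic.MyGeneric(f32);
const Generic_f64 = my_generic.MyGeneric(f64);

pub fn main() void {
    std.debug.print("{d} {d} {d}\n", .{
        @sizeOf(MyThing), @sizeOf(Generic_f32), @sizeOf(Generic_f64),
    });
}
```

Under the proposal, the `.MyThing` and `.MyGeneric` hops would disappear because the import itself would evaluate to the type or the function.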

Furthermore, it also allows us to directly import ZON files as valid zig code, since ZON itself is an expression as well.

This is somewhat comparable to Lua's require, where a module file is implicitly a function body and its return value is what require yields, which lets you procedurally construct what you want to return:

-- Thing.lua
local ret = {}
local function doMyThing_general(arg)
    ...
end
local function doMyThing_win32(arg) 
    ...
end
local function doMyThing_linux(arg)
    ...
end
do
    ret.generalThing = doMyThing_general
    if isWin32() then
        ret.specificThing = doMyThing_win32
        ret.os = 'win32'
    elseif isLinux() then
        ret.specificThing = doMyThing_linux
        ret.os = 'linux'
    else
        ret.specificThing = ret.generalThing
    end
end

return ret

-- main.lua
local Thing = require('Thing')
local current_os = Thing.os
local specificResult_of_3 = Thing.specificThing(3)
...

Vexu commented 7 months ago

Related https://github.com/ziglang/zig/issues/7881#issuecomment-767069673

expikr commented 7 months ago

It just occurred to me that yet another advantage of (2) is the use case of defining foreign opaque types in a standalone file right alongside their associated extern functions, e.g.

//! HBITMAP.zig
const HBITMAP = *@This();
const win = @import("std").os.windows;

extern "gdi32" fn SelectObject(HDC, HBITMAP) callconv(WINAPI) HBITMAP;
pub fn select(h: HBITMAP, hdc: HDC) HBITMAP {
    return @ptrCast(SelectObject(hdc, h));
}

pub extern "gdi32" fn CreateBitmap(c_int, c_int, UINT, UINT, *const anyopaque) callconv(WINAPI) HBITMAP;
pub extern "gdi32" fn CreateBitmapIndirect(*const BITMAP) callconv(WINAPI) HBITMAP;
pub extern "gdi32" fn CreateCompatibleBitmap(HDC, c_int, c_int) callconv(WINAPI) HBITMAP;
...

vadim-za commented 7 months ago

Related #7881 (comment)

I was thinking about another option:

//! SomeStruct.zig
extern struct =>

field: u32,
.......

The => at this position means "everything till the end of the file is implicitly enclosed in { }s".

This can be generalized further to reduce indentation in files containing a single generic (compared to the proposal here to use comptime blocks, which adds 2 extra levels of indentation to the 2 already existing). E.g.

//! SomeGeneric.zig
pub fn SomeGeneric(comptime T: type) type =>
const size = @sizeOf(T);
return struct =>

data: [size]u8,
pub fn method(self: @This()) void {
   ......
}  

There are obvious readability concerns, though. Besides, in the latter case one would also wish not to have to write the generic's name explicitly and to use the filename instead, much like in the first example.

Edit: the latter problem probably could be addressed as

//! SomeGeneric.zig
fn (comptime T: type) type =>
const size = @sizeOf(T);
return struct =>

data: [size]u8,
pub fn method(self: @This()) void {
   ......
}  

however that implicitly suggests the introduction of a second syntax for function definition in Zig:

const SomeGeneric = fn(comptime T: type) type { ....... };

paperdev-code commented 7 months ago

I think if you really wanted to be able to make enums, or anything else, at file scope, something special like the following could perhaps do the trick:

@This = enum(u8) {
    A = 'a',
    ...
};

// or
// @This = error { }
// etc

The idea is that the following is implied by default:

@This = struct {
// <file contents>
};

This would be considered very special and unique syntax however, and I think the following rule would apply;

Additionally, this could allow for some silly trickery like:

@This = @Type({ ... });

But I don't think it's a big deal. I still don't like that this is new syntax and that it implies something about the @This() builtin. But it also kind of makes sense, considering it is also the return value of @This(), which has always been kind of recursive in nature.

paperdev-code commented 7 months ago

Related #7881 (comment)

Realized I suggested nearly the exact same thing. Edit; Upon further inspection, I definitely expanded upon it.

expikr commented 7 months ago

Related #7881 (comment)

Realized I suggested nearly the exact same thing. Edit; Upon further inspection, I definitely expanded upon it.

If prefixing with @This = is always required, then it adds no syntactic information, so why include it at all?

| If file is a valid... | expression | @This = <rhs>; statement | body |
| --- | --- | --- | --- |
| This proposal: @import() == | <file_content> | invalid | opaque{ <file_content> } |
| Your proposal: @import() == | invalid | <file_content>.ast().rhs | struct{ <file_content> } |
| Status quo: @import() == | invalid | invalid | struct{ <file_content> } |

eastmancr commented 5 months ago

I like this proposal. I'd like to see files truly become abstractions for all Zig containers, not just structs. I also prefer the proposed brace wrapping when a file contains fields; I don't think a new syntax is warranted just to keep the indent level at 0. As someone who browses the standard library source code more than the autodocs, explicit file typing would help me differentiate between module entrypoints and instantiable types that are just broken out as files.