Open straight-shoota opened 1 year ago
The types like LibC::Int
will stay, right?
Ay, that's a good point. I hadn't intended that. But it certainly would make sense to have basic C types easily available.
In fact, you need them in any lib type. Thus it's common to have aliases like these for convenience: https://github.com/crystal-lang/crystal/blob/b2498a37e4c9588baca0a927e8a69f1136742c18/src/crystal/lib_iconv.cr#L11-L13
I'm wondering if there could be alternatives to make this more convenient. Maybe have basic C types implicitly defined in any lib type? Or put them in a C
namespace that you can include into a lib type?
There is an added complication here: SizeT
is technically part of c/stddef
, not lib_c
, but I definitely don't want to see third-party code write alias SizeT = Crystal::System::LibC::SizeT
.
I think that is just a technicality. size_t
is the type of the sizeof
operation, which is part of the language. So I'd argue that by extension, the type size_t
is also part of the language. Its definition just happens to be in a header file.
So I think we can consider it a basic type in C.
The other standard types are the integer and float type for which we have aliases in lib_c.cr
, as well as Void
, Pointer
, Proc
and StaticArray
.
My idea would be to put these standard C types in a namespace called C
(as the language, not the library).
And this namespace would be implicitly included in all lib
namespaces.
Will the compiler disallow things other than alias definitions in module C
?
Probably not, why?
I mean it should be discouraged to use it for anything else. But I don't think the compiler needs to enforce it.
Because most likely we'd start with normal lib
s using the normal semantics of include
to include a normal module
. Something in this process must be special to forbid things like def
s inside a lib
Ah yes. That should happen when including C
into a lib, not when defining it.
However, maybe C
should actually be a lib
type? This obviously requires compiler support, but so would including a module into a lib. There's not much reusability for that, so I figure this could be entirely encapsulated in the compiler.
I am also worried that C
might be too common a name im the top-level namespace (snippets using A
, B
, C
etc. instead of Foo
, Bar
etc.). Maybe CTypes
is better.
Perhaps including lib
s into other lib
s is a better choice here, both implicitly in the compiler, and explicitly in the code, although we would then also need compiler support for documenting the base lib.
Also the platform-specific directory selection in lib_c
is hardcoded in the compiler. Do we continue using src/lib_c
for Crystal::System::LibC
?
I suppose so. There should be no issue with that except that it's in the namespace Crystal
but not in src/crystal
, right?
Maybe symlinking src/lib_c
to src/crystal/lib_c
could be an option to improve on that...?
I think as a first step we should put those types in a new module CTypes
, then make LibC
's types aliases to them (or Crystal::System::LibC
's types too once we have that):
# src/c_types.cr
module CTypes
alias Int = Int32
end
# src/lib_c.cr
require "c_types"
module LibC
alias Int = CTypes::Int
end
Being able to include CTypes
in libs is a nice convenience but not necessary; many bindings only need a handful of those types, e.g. PCRE2 only needs Int
and SizeT
, and writing out just the needed aliases doesn't hurt much.
Is there really a problem to fix?
What's wrong with src/lib_c
and LibC
in general? What's the point of introducing yet another module for holding C types when we already have LibC
that's already used and expected?
I understand that the bindings are incomplete, that it can break if we're removing symbols the stdlib doesn't need anymore, but LibC
won't go away anytime soon (and IMO never). Looking at the problem from another point of view, we could introduce a libc
shard that would aim to be complete (and ideally autogenerated), and there wouldn't be an issue. Heck, crystal could eventually depend on it :shrug:
I'm wondering if there could be alternatives to make this more convenient
Yes, drop the LibXYZ libs and put every extern C libs into LibC directly! No more artificial LibXML2 or LibPCRE that push us to alias every single symbol when we can use the actual symbol names, no need to alias C types: they're already available.
The C runtime may not be present even though the C language and calling conventions etc. are there, e.g. when writing pure Win32 code and needing to avoid linking against ucrt
/ libucrt
. Thus there is a valid scenario in which src/lib_c.cr
and src/lib_c/*/stddef.cr
won't be required at all (though I guess CTypes
somewhat blurs the line, because it refers to size_t
from the C standard library whereas the Win32 SIZE_T
is technically a different type).
But another advantage of CTypes
is public documentation. Publishing those types here means we avoid the hard problem of deciding whether the docs generator should support lib types.
Yes, drop the LibXYZ libs and put every extern C libs into LibC directly!
That would mean passing all the linker flags from every single lib as soon as the libs are required, even when the source code doesn't call any of those lib functions, because under normal circumstances you can't avoid calling into the C runtime.
I agree 100% with @HertzDevil that separatation into individual lib
types is a useful feature of Crystal to avoid implicitly linking libraries which are never called.
I think it makes a lot of sense to define the basic C data types outside of LibC
because they are also used outside of it: Other lib
bindings, syscalls, FFI, binary protocols.
I think there is a confusion around the extern C world: IMO LibC
is that whole extern C world (types, syscalls, general ABI), not just a binding to the C library. I wish we kept using the C
namespace for it instead of the restrictive LibC
which forces us to create even more modules just for holding a dozen type aliases (just to alias them again everywhere) :disappointed:
binary protocols
We don't want arch-dependent sizes for binary protocols (different targets would fail to communicate). Even ELF files use explicit integer sizes for example :grin:
That would mean passing all the linker flags from every single lib as soon as the libs are required
Good point. Still, that's only really a problem for PCRE in practice, because it's always required by the prelude. The GC is always used & linked, and other bindings (LLVM, OpenSSL, XML2, ...) must be explicitly required, in which case I don't see a problem to link against the library; it won't get linked if unused anyway.
The easiest implementation to retain the API would probably be
alias LibC = Crystal::System::LibC
This ain't backward compatible: we can't reopen LibC
anymore and would have to reopen Crystal::System::LibC
(sic) or use another name (sic) to have a libc
shard or quickly declare a libc function (e.g. mknod
): Error: LibC is not a lib, it's a alias
.
Another advantage of CTypes is public documentation
I think that's a problem: the types are target dependent but the docs will declare them as being an alias to a specific type, that depends on which target was passed to the docs generator. This can be confusing, if not misleading.
The C type names could instead be manually documented in the guide about extern C bindings, with actual examples (not hardcoded aliases), or maybe even the list per target; that would be more useful.
That would mean passing all the linker flags from every single lib as soon as the libs are required
Good point. Still, that's only really a problem for PCRE in practice, because it's always required by the prelude. The GC is always used & linked, and other bindings (LLVM, OpenSSL, XML2, ...) must be explicitly required, in which case I don't see a problem to link against the library; it won't get linked if unused anyway.
PCRE may be the only practical application currently in the core library. But we have an entire ecosystem to take into account. It's not that uncommon for libraries to define C bindings which may or may not be actually linked, depending on which parts of the library are used.
Due to the way the Crystal compiler autogenerates linker flags, this would mean the linker may need libraries to be available, despite not being used.
It's hard to gauge the effect of this, but I'd certainly expect that changing this would introduce a number of breakages. So we wouldn't be able to implement it easily.
And I really don't understand why merging all C bindings into a single lib type should even be considered a good idea. Sure, in C there are no such namespaces or a notion about which library defines a symbol. But I think it's a great enhancement that Crystal has a means to provide more structure.
Yes, drop the LibXYZ libs and put every extern C libs into LibC directly! No more artificial LibXML2 or LibPCRE that push us to alias every single symbol when we can use the actual symbol names
I don't think there's anything preventing us from using the actual symbol names. It's just conventional to drop the lib namespace prefixes, so that the Crystal name for __gmpz_init
is LibGMP.init
. You could just keep the original name if you wanted. But I don't see a lot of benefit from that. It just makes calls more complex.
Putting individual libraries' bindings into separate namespaces is also just a convention, but it has some practical consequences for generating linker arguments.
I prefer using actual symbols because it's the actual name and we should always properly name things. Creating fun aliases or renaming struct names, we create a virtual interface, which requires one more hoop to go from Crystal to C and vice versa.
For example when reading the shard's code, you must reach the lib definition to know the actual symbol to start searching its original documentation. Same when trying to understand a segfault stacktrace, or why the linker fails with missing symbols. It's one more hoop, but a boring one, then repeat with how many symbols you want to reach :rage:
In the end we spend time on an abstraction that nobody will use in the end, and that only makes things harder to follow & decipher :confused:
I prefer to cram everything together because this is the reality: everything's crammed together in C, and that avoids to repeat the same aliases to C types in each and every lib definition, without any protection, because I can create a new LibFoo::SizeT that aliases UInt64 and that would work... until you compile for ARM32 and get segfaults.
Though that's only for types and constants, because fun
definitions are shared across all lib definitions and global definitions: they must have the same signature, otherwise it's a compilation error. So the namespacing is both real (types) and fake (funs), protected (funs) and unprotected (types).
Cramming everything together avoids the above pitfalls, and we don't document lib definitions (on purpose: they're not meant to be used), so we don't care that everything's crammed together.
Now it's under par because of the linker annotations, but maybe we could think differently? For example instead of merging the annotations each time we reopen lib X
, maybe they could only apply to their immediately defined symbols only?
Or maybe we don't need to do anything, or each lib could always include the C types by default, we could recommend to always use the actual symbols (don't waste your time on that intermediate layer, better: autogenerate it).
The bindings in
LibC
are incomplete and not intended as public API exposed in the standard library. They only go as far as needed in terms of building blocks for stlib implementations.Moving the bindings into
Crystal::System
namespace explicitly communicates that they are an internal tool and not a public API.This has been mentioned before in https://github.com/crystal-lang/crystal/issues/7339#issuecomment-612413093 and https://github.com/crystal-lang/crystal/pull/11955#issuecomment-1105559259
It's to be expected that
LibC
bindings from stdlib are called from user code. It's not intended for that, but this was not explicitly communicated in the past.LibC
is not mentioned in the API docs because the code generator doesn't produce docs for lib types. So that could be understood as a technical limitation, not a intentional choice. I think it would be too hard to entirely removeLibC
from stdlib and break existing code. I'd rather keep such a breaking change for the next major release, even though we might not critically bound to it. However, there's also not a huge cost for keeping the status quo around.The easiest implementation to retain the API would probably be
alias LibC = Crystal::System::LibC
. We could also consider duplicating the bindings so we can move independently, but I fear this would be too much overhead and complexity.