microsoft / hlsl-specs

HLSL Specifications
MIT License

[Feature Request] Define language features in the specs #102

Closed RedSkittleFox closed 1 year ago

RedSkittleFox commented 1 year ago

Is your feature request related to a problem? Please describe.

The best grammar documentation available for HLSL is MSDN documentation. However, currently it is incomplete and outdated.

Describe the solution you'd like

A complete or partial grammatical definition of the language, with a comprehensive list of keywords, identifiers with special meaning, built-in types, and functions. As similar as HLSL is to C++, it is crucial to highlight the differences. The grammar for the most important statements, primarily the expanded declaration syntax, should be included in the standard.

Keywords, intrinsics, and identifiers with special meaning

It's important to formalize this because it determines whether stuff like this is allowed:

struct module : import
{
    ::override override() override
    {
        return {};
    }

    ::final final() final
    {
        return {};
    }
};

Identifiers with special meaning and built-in types act like regular identifiers; keywords are reserved and can only appear in their specific contexts.
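For comparison, here is a minimal C++ sketch of how identifiers with special meaning behave there. It is C++, not HLSL, and it makes no claim about what DXC accepts; the type and function names (`Base`, `Derived`, `final_as_variable`, `check_contextual_identifiers`) are made up for illustration.

```cpp
#include <cassert>

// In C++, `override` and `final` are not reserved words: they act as
// specifiers only in particular grammatical positions and are ordinary
// names everywhere else.
struct Base {
    virtual ~Base() = default;
    virtual int override() { return 1; }      // a virtual function named `override`
};

struct Derived final : Base {                 // `final` used as a class specifier
    int override() override { return 2; }     // first the name, then the specifier
};

int final_as_variable() {
    int final = 3;                            // `final` used as a plain variable name
    return final;
}

void check_contextual_identifiers() {
    Derived d;
    Base& b = d;
    assert(b.override() == 2);                // dispatches to Derived::override
    assert(final_as_variable() == 3);
}
```

A formal classification in the HLSL spec would settle which of these uses, if any, are meant to be legal in HLSL.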

I've noticed an inconsistency (according to MSDN) on what is considered a keyword, what is considered a builtin-type/intrinsic, and what is considered an "identifier with special meaning" (official C++ term [tab:lex.name.special]). The general trend is that snake_case identifiers are keywords, and CamelCase identifiers are built-in types. This is just a rule of thumb though. Here is a simple test program on Compiler Explorer that checks the kind of built-in type (identifier with special meaning).

I propose the following formalization of identifier/builtin-type/keyword classifications:

Distinguish between keywords and identifiers with special meaning. Consider built-ins a subcategory of the latter.

Keywords:

Reserved C++ keywords:

Identifiers with special meaning

Built-in types

I propose that precise be made a keyword, just like the rest of the storage class specifiers. That way, the "identifiers with special meaning" category does not have to exist.

The unsigned keyword is redundant and an outlier. HLSL has built-in sign-qualified types, so applying a signedness modifier to an int type yields an already existing built-in type.

Additionally, for the sake of consistency, vector, matrix, string, min16float, min10float, min16int, min12int, min16uint, uint64_t, int64_t, float16_t, uint16_t, and int16_t should also be made keywords.

I acknowledge that these proposed changes do not make much sense from a practical point of view and might as well be left unimplemented. However, I believe it is important to formalize this in the language and to do it consistently, rather than rely on the existing DXC grammar implementation.

Deprecated keywords

Deprecated special meaning identifiers:

Deprecated keywords and identifiers should be marked as such in the specs and removed in subsequent HLSL releases.

Preprocessor

HLSL's current preprocessor is somewhat inconsistent with C++'s.

// Compiles on Clang 17.0.1
// Fails to compile on DXC 1.7.2207

#define FUNC(X, Y) \
    void X##0x##Y () {} 

FUNC(a, b);

This is a compiler bug, but the preprocessor's behavior should be clarified in the standard.
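As a reference point, this C++ sketch shows what a conforming C/C++ preprocessor does with the same pasting pattern: `X##0x##Y` with `a` and `b` produces the single identifier `a0xb`. The helper macros and function names (`MAKE_FUNC`, `PASTE_NAME`, `STRINGIZE`, `EXPAND_AND_STRINGIZE`, `pasted_name`, `check_token_pasting`) are illustrative, not taken from the issue.

```cpp
#include <cassert>
#include <string>

// Same shape as the FUNC macro above, but returning a value so the
// generated function can be checked.
#define MAKE_FUNC(X, Y) int X##0x##Y() { return 42; }

// Helpers that expose the pasted token as text.
#define PASTE_NAME(X, Y) X##0x##Y
#define STRINGIZE(x) #x
#define EXPAND_AND_STRINGIZE(x) STRINGIZE(x)

MAKE_FUNC(a, b)   // expands to: int a0xb() { return 42; }

std::string pasted_name() {
    // The argument is fully expanded before STRINGIZE sees it.
    return EXPAND_AND_STRINGIZE(PASTE_NAME(a, b));
}

void check_token_pasting() {
    assert(a0xb() == 42);
    assert(pasted_name() == "a0xb");
}
```

Either pasting order gives the same result: `a ## 0x` forms the identifier `a0x`, and `0x ## b` forms the preprocessing number `0xb`, so both intermediate tokens are valid.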

String and char-literals

The grammar of string and char literals differs from the C++ grammar. HLSL does not allow encoding prefixes, multichar literals, and raw string literals.
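To make the difference concrete, here is a small C++ sketch of two of the literal forms listed above as unsupported in HLSL: encoding prefixes and raw strings. Multichar literals are left out because their value is implementation-defined in C++. The variable and function names are illustrative.

```cpp
#include <cassert>

// Literal forms that C++ accepts and that, per the text above, HLSL does not.
const char     raw[]   = R"(a\nb)";   // raw string: the backslash is not an escape
const wchar_t  wide[]  = L"hi";       // encoding prefix L
const char16_t utf16[] = u"hi";       // encoding prefix u

void check_literals() {
    assert(sizeof(raw) == 5);                        // 'a', '\\', 'n', 'b', '\0'
    assert(raw[1] == '\\');
    assert(sizeof(wide) / sizeof(wide[0]) == 3);     // 'h', 'i', terminator
    assert(sizeof(utf16) / sizeof(utf16[0]) == 3);   // 'h', 'i', terminator
}
```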

Declarations grammar

Currently, modifiers inherited from C (static, extern, unsigned) can appear after the type in a variable declaration, while other modifiers cannot. I propose mandating that all modifiers appear before the type.

static unsigned int v; // valid
int unsigned static v; // currently valid, should be invalid 
unorm float v; // valid
float unorm v; // invalid
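The permissive ordering of the C-inherited modifiers matches what C and C++ already allow. A minimal C++ sketch, with made-up variable names, shows that shuffling the specifiers does not change the declared type:

```cpp
#include <cassert>
#include <type_traits>

static unsigned int a = 1;   // conventional order
int unsigned static b = 2;   // same specifiers, shuffled; still valid C++

// Both declarations have type unsigned int.
static_assert(std::is_same<decltype(a), decltype(b)>::value,
              "specifier order does not change the declared type");

void check_specifier_order() {
    assert(a + b == 3u);
}
```

Requiring modifiers before the type in HLSL would therefore be a deliberate departure from the C declaration grammar.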

Propose a grammar notation convention and include the grammar for the HLSL declarations.

Example: the grammar notation used by the C++ standard (a BNF-like notation):

variable-decl:
    modifier-seq_{opt} type declarator array-declarator_{opt} hlsl-semantic_{opt} initializer_{opt}

modifier-seq:
    storage-class modifier-seq_{opt}
    type-modifier modifier-seq_{opt}

storage-class: one of
    extern-keyword, precise-keyword, shared-keyword, groupshared-keyword, static-keyword, uniform-keyword, volatile-keyword

type-modifier: one of
    const, row_major, column_major, snorm, unorm

type:
    built-in-type
    user-defined-type 
    type-alias 

declarator:
    identifier

array-declarator:
    '[' positive-number ']'

hlsl-semantic:
    semantic-decl
    packoffset-decl
    register-decl

semantic-decl:
    ':' identifier

... and so on ...
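A rough C++ sketch of a checker for a simplified subset of this grammar, `modifier-seq_{opt} type declarator`, is below. It is a hypothetical illustration only: it splits on whitespace, treats the type as a single token (so it uses `uint` rather than `unsigned int`), and ignores array declarators, semantics, and initializers. The function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Storage classes and type modifiers from the grammar sketch above.
bool is_modifier(const std::string& tok) {
    static const std::set<std::string> modifiers = {
        "extern", "precise", "shared", "groupshared", "static", "uniform",
        "volatile", "const", "row_major", "column_major", "snorm", "unorm"};
    return modifiers.count(tok) != 0;
}

// Accepts: modifier-seq_opt type declarator (whitespace-separated tokens).
bool modifiers_precede_type(const std::string& decl) {
    std::istringstream in(decl);
    std::vector<std::string> toks;
    for (std::string t; in >> t;) toks.push_back(t);

    std::size_t i = 0;
    while (i < toks.size() && is_modifier(toks[i])) ++i;   // consume modifier-seq_opt

    // Exactly two tokens must remain (type, declarator), neither a modifier.
    if (toks.size() - i != 2) return false;
    return !is_modifier(toks[i]) && !is_modifier(toks[i + 1]);
}

void check_modifier_order() {
    assert(modifiers_precede_type("static uint v"));
    assert(!modifiers_precede_type("uint static v"));
    assert(modifiers_precede_type("unorm float v"));
    assert(!modifiers_precede_type("float unorm v"));
}
```

A real implementation would work on the compiler's token stream and handle the full declarator syntax; this only shows the "all modifiers first" rule in isolation.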

Example: the existing MSDN documentation.

[Storage_Class] [Type_Modifier] Type Name[Index] [: Semantic] [: Packoffset] [: Register]; [Annotations] [= Initial_Value]

Built-in types/functions

Provide a category-sorted documentation of all available intrinsic types and functions. The comprehensive and complete list of currently available intrinsics can be found here in the DXC implementation.

llvm-beanz commented 1 year ago

I'm going to try to walk through all your points here, but this is a bit tricky because it isn't really a single issue, but a whole host of issues that you're walking through.

We need a proper spec

HLSL has never had a proper specification. We have two reference implementations (FXC & DXC), and we're working on a third (Clang). That is made much more difficult by not having a specification, which is why I started writing one: https://github.com/microsoft/hlsl-specs/tree/main/specs/language

It is going to be a lot of work and take a lot of time to get that to be what you're asking for here, so you'll just have to be patient.

We should document our grammar

I've also been trying to write grammar annotations for new features as we add them so that we can get this right for features moving forward:

https://github.com/microsoft/hlsl-specs/pull/65

The MSDN documentation is woefully out of date, and that's seriously unfortunate.

Keywords

Precise

precise is a keyword in HLSL. It is not a storage class, though, because it doesn't actually apply to the variable it is defined on; it applies to the operations that feed into that variable. Ultimately I'd like to deprecate precise in favor of explicit math modes.

C/C++ Keywords

We're not going to deprecate unsigned, our goal is to be more compatible with C/C++, not less.

We do have a bunch of undocumented but reserved keywords in HLSL (__is_signed, declspec, forceinline, auto, catch, const_cast, delete, dynamic_cast, explicit, friend, goto, mutable, new, operator, protected, private, public, reinterpret_cast, static_cast, throw, try, union, and virtual).

Additionally there are C/C++ reserved words and keywords that have leaked into HLSL through the implementation of DXC which we don't have well scoped or defined.

Built-in types

The long-term goal for built-in types is to not really make them built-in. We want to be able to represent the HLSL data types in HLSL, and make them proper well-formed data types. Due to bugs in the compiler things get really wonky if you try to define new types with matching names (#5738). We do have a proposal to move built-in types into the hlsl namespace so that they don't conflict with other type names. We also have a bunch of bugs that we're working on related to name lookup and template argument resolution that come into play here, but those are just bugs.
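As a loose analogy for why a namespace removes the conflict, consider the following C++ sketch. The namespace `hlsl_like` and both `float4` types are hypothetical stand-ins and do not reflect the actual proposal or the real HLSL library types.

```cpp
#include <cassert>
#include <type_traits>

namespace hlsl_like {                 // hypothetical stand-in for a library namespace
struct float4 { float x, y, z, w; };
}

struct float4 { double v[4]; };       // user type reusing the same unqualified name

// The two types are distinct and can coexist without ambiguity.
static_assert(!std::is_same<hlsl_like::float4, ::float4>::value,
              "library and user types are different types");

void check_namespaced_types() {
    hlsl_like::float4 lib{1.0f, 2.0f, 3.0f, 4.0f};
    float4 user{{1.0, 2.0, 3.0, 4.0}};
    assert(lib.w == 4.0f);
    assert(user.v[3] == 4.0);
}
```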

Deprecating Old Stuff

I agree that we need to be better about deprecating old keywords and I appreciate your list here. That's a good list to hit. We've also talked about deprecating the cbuffer/tbuffer syntax.

Sooner on the chopping block are:

HLSL Preprocessor

In the future I expect HLSL to adopt the C preprocessor explicitly with only a few additions and no differing behavior. Right now HLSL is in an odd space. FXC's preprocessor wasn't built to the C standard, and DXC's adopted some of the behaviors of the older MSVC preprocessor and some of FXC's behavior. Sans compiler bugs, our goal is to adopt the C specification's preprocessor support, but that won't be codified for a while.

String and character types

Yeah, this is super messy in HLSL, and we need to do something better. We don't really have an answer here yet because DXIL doesn't really support strings either.

Declarations

I don't think we want to deviate from the C grammar for declarations. I get that the C grammar is dated and subjectively unappealing, but our stated goal for HLSL is to become more C/C++-like, not less.

Documentation

A few times you've made a point about how bad our documentation is. You're right, it is not great. We have a lot of work to do.

Summary

I appreciate all the feedback here and the time you took to put this together. My high-level takeaways here are: 1) We need better documentation (both user and technical docs). 2) We should be more aggressive about deprecating and issuing deprecation warnings in DXC.

Besides these points I don't think there is a specific and actionable request here that we can work on other than continuing to work on the things we're already doing. Since I don't think there is anything specific to track here, I'm going to close this issue as not planned.

If I missed something specific please let me know.