KhronosGroup / OpenGL-Registry

OpenGL, OpenGL ES, and OpenGL ES-SC API and Extension Registry
677 stars 274 forks source link

Commands lack "tidy" names #597

Closed Rytis-Stan closed 8 months ago

Rytis-Stan commented 9 months ago

Commands contain various character sequences in their names that are used to distinguish between different versions of the same commands, with different sets or arguments. While they do serve a purpose, they make command names rather messy. When generating bindings for OpenGL, it is often desirable to be able to generate "nice" looking names, without the extra characters, especially if the target programming language allows for function/procedure/method overloads.

I propose to introduce these "nice" names as additional information, which could be done by simply adding an extra attribute to <proto> tags. Not sure about the exact name of the attribute, but it could be something along the lines of simpleName, canonicalName, normalizedName, or something similar (naming suggestions would be greatly appreciated).

Some examples of how the currently existing command names could be transformed:

glEnable  --> Enable
glEnablei --> Enable

glGetMultisamplefv --> GetMultisample

glGetNamedBufferParameteri64v --> GetNamedBufferParameter
glGetNamedBufferParameteriv   --> GetNamedBufferParameter

glTexCoord2d  --> TexCoord2
glTexCoord2dv --> TexCoord2
glTexCoord2f  --> TexCoord2

glGetTexEnvfv --> GetTexEnv
glGetTexEnviv --> GetTexEnv
glGetTexEnvxv --> GetTexEnv

glGetTexEnvxvOES --> GetTexEnvOes

glTranslated --> Translate
glTranslatef --> Translate
glTranslatex --> Translate

glTexBumpParameterfvATI --> TexBumpParameterAti
glTexBumpParameterivATI --> TexBumpParameterAti

Note that the examples include the removal of the gl prefix, mainly due to it acting like a package/namespace mechanism, which might be redundant for the target language. And things like acronyms get changed to fit a chosen naming convention (ATI --> Ati), which could be any naming convention: PascalCase, camelCase, snake_case (though I would prefer PascalCase). The most important thing is that it would be possible to unambiguously distinguish each word in the name, so that it would be easy to convert from one naming convention to another.

Perksey commented 9 months ago

I think this is something that: a) can be inferred based on the information already available in the registry b) basically all of the bindings that want this info have logic to determine this anyway c) can easily be interpreted as opinionated (e.g. what if the bindings want to strip off the vendor as well, or just strip off the vendor but keep the datatype, etc etc - it's all personal preference)

Given that new functions being added to the spec is of great infrequency, any logic written already against this registry to determine the names you propose is extremely unlikely to break in the future. And, given that this code already exists and can easily be retrieved from the open-source projects that implement it, the benefits of adding this to the registry now are likely very niche & fringe.

I do not think this is a necessary addition to the registry given that it does not add new info that would be of benefit to many downstream consumers.

SunSerega commented 9 months ago

a) can be inferred based on the information already available in the registry

Can it though? Separating the vendor suffixes is already a journey of its own... It's easy enough to write some general rules, but making this robust is a different story.

Also, any info can be copied between bindings, but I think the main benefit of having the XML files is being able to use it as a common store for things that multiple different parties can use and maintain in a machine-readable way.

That said, ATI -> Ati really is an arbitrary choice. For XML to be equally usable for any party, it's better to avoid stuff like this. How about something like this instead:

        <command>
            <proto tsuffix="fv">void <name>glTexBumpParameterfvATI</name></proto>
            <param group="TexBumpParameterATI"><ptype>GLenum</ptype> <name>pname</name></param>
            <param len="COMPSIZE(pname)">const <ptype>GLfloat</ptype> *<name>param</name></param>
        </command>

The gl prefix is in most cases equal to the group namespace (file name without extension), but is missing for some functions in wgl.xml (IIRC that's because they are GDI rather than WGL functions). There are no cases where the prefix exists but isn't equal to the namespace. And so the prefix can already be parsed robustly.

tsuffix here uniquely specifies which part of the name (between gl and the last fv) is the core of the name.

As a bonus, since tsuffix is always the end of the name, excluding the always capitalized vendor suffix - this adds a point of easily noticeable failure, in case the consumer didn't expect some vendor suffix - which might have also been missed in other parts of XML. Or in case the kind and len attributes don't match the suffix.

I'd also suggest tsuffix for glTexCoord2dv being 2dv. This is still easy to robustly parse but gives separation between the core (specifying texture coordinate) and the specifics of input types (2 doubles in the form of a vector).

NogginBops commented 9 months ago

While I do appreciate having more info for "mangling" the OpenGL entry points for languages that have overloads I do think this might be in the realm of "could be nice to have, but the effort to introduce this over the entire database would not be worth it".

I also suspect that there will be entry points that don't follow the naming convention and that we will have some sort of disagreement somewhere about what should be part and what should not be part of the name.

But if someone is willing to do the work I think I agree with @SunSerega that tsuffix="fv" does seem to be a good idea, though it doesn't really help with removing vendor postfixes as some entry points do not have any type suffixes to separate vendor names from command name.

SunSerega commented 9 months ago

though it doesn't really help with removing vendor postfixes as some entry points do not have any type suffixes to separate vendor names from command name.

Yeah, I didn't mean it as a main way to find vendor suffixes, I only meant it as another sanity check.

Rytis-Stan commented 9 months ago

a) can be inferred based on the information already available in the registry

Inferring works only for some cases, unfortunately. For example, for glNormal3fv it's relatively easy to infer that fv are the parameter type characters due to there being a digit, after which all characters are lowercase. But what about names like glLightf, for example. How would one reliably determine that f means a parameter type?

b) basically all of the bindings that want this info have logic to determine this anyway

The more different naming patterns there are, the more logic there needs to be for each case. And for names like the glLightf I see no other way than to make the code "know" that certain names need to be treated in a special way. This would basically hard-code some of the names in the logic, to allow a correct conversion. But that goes against the whole point of having some kind of a specification for an API as a single source of truth (as the "gl.xml" file, in this case). And also, "gl.xml", "glx.xml", "wgl.xml" files are supposed to have the same kind of structure, without adding special processing rules for each file separately.

c) can easily be interpreted as opinionated (e.g. what if the bindings want to strip off the vendor as well, or just strip off the vendor but keep the datatype, etc etc - it's all personal preference)

The specific "tidy" name mechanism can be changed to accommodate for different naming schemes. For example, vendor suffixes could be specified as a separate attribute.

Given that new functions being added to the spec is of great infrequency, any logic written already against this registry to determine the names you propose is extremely unlikely to break in the future. And, given that this code already exists and can easily be retrieved from the open-source projects that implement it, the benefits of adding this to the registry now are likely very niche & fringe.

I do not think this is a necessary addition to the registry given that it does not add new info that would be of benefit to many downstream consumers.

The whole reason for a file like "gl.xml" to exist is so that anyone can grab it anytime and generate any bindings code they care about. It's an official specification of the API and should be accessible to everyone who is interested. It should be simple, informative, and unambiguous, as much as possible. If the user needs to do additional manual work to make the generated code work, that goes against the whole idea of such a specification.

Rytis-Stan commented 9 months ago

tsuffix here uniquely specifies which part of the name (between gl and the last fv) is the core of the name.

Excuse me, but what does the t mean in tsuffix?

And I don't think that this attribute would work in it's current form, as there are more complicated names like glReplacementCodeuiColor3fVertex3fvSUN, for example.

Rytis-Stan commented 9 months ago

While I do appreciate having more info for "mangling" the OpenGL entry points for languages that have overloads I do think this might be in the realm of "could be nice to have, but the effort to introduce this over the entire database would not be worth it".

If by "worth it" you mean that it would require a lot of effort for some person to perform the changes, then I would be willing to try to do the work. Of course, it only makes sense if we can reach some satisfying agreement on how the changes should look like.

I also suspect that there will be entry points that don't follow the naming convention and that we will have some sort of disagreement somewhere about what should be part and what should not be part of the name.

There will be edge cases, of course. There is probably no other way than to just try to come to an agreement on what to do about them.

In fact, there was one situation that I had to deal with. It was one of those cases where a pattern-based name fix could be applied to remove the parameter type indicating characters:

glGetQueryBufferObjecti64v  --> GetQueryBufferObject
glGetQueryBufferObjectui64v --> GetQueryBufferObject

Everything looked good, until I tried to compile the generate code. It failed due to both functions having the same exact signature, making function overloading useless. But that was due to both commands actually having the same exact parameters:

<command>
    <proto>void <name>glGetQueryBufferObjecti64v</name></proto>
    <param class="query"><ptype>GLuint</ptype> <name>id</name></param>
    <param class="buffer"><ptype>GLuint</ptype> <name>buffer</name></param>
    <param group="QueryObjectParameterName"><ptype>GLenum</ptype> <name>pname</name></param>
    <param><ptype>GLintptr</ptype> <name>offset</name></param>
</command>
<command>
    <proto>void <name>glGetQueryBufferObjectui64v</name></proto>
    <param class="query"><ptype>GLuint</ptype> <name>id</name></param>
    <param class="buffer"><ptype>GLuint</ptype> <name>buffer</name></param>
    <param group="QueryObjectParameterName"><ptype>GLenum</ptype> <name>pname</name></param>
    <param><ptype>GLintptr</ptype> <name>offset</name></param>
</command>

This looks like a flaw in naming the commands properly (and the same issue exists for the glGetQueryBufferObjectiv command).

SunSerega commented 9 months ago

Excuse me, but what does the t mean in tsuffix?

Type. I was mimicking the <ptype> naming style. Feel free to rename it to something better.

names like glReplacementCodeuiColor3fVertex3fvSUN

Right... That one extension...

Honestly, after making this and all of this, I feel like I've fallen for the sunk cost (effort) fallacy. This extension was made as a tiny speed improvement for the glBegin-glEnd pattern, which is completely obsolete in this day and age.

But it does still show the limitations of my idea. And I wouldn't try to stop you if you want to handle all the functions from that extension - I'd still admire that.

Well, we cannot have the same attribute multiple times on the same tag. And if we make a single tidy="glTexCoordColorVertex" attribute, extracting the type suffixes will require some kind of string diff algorithm, even if simplistic. So how about this then:

        <command>
            <proto>void <name>glTexCoord2fColor3fVertex3fSUN</name></proto>
            <param kind="Coord" tsuffix="2f"><ptype>GLfloat</ptype> <name>s</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>t</name></param>
            <param kind="Color" tsuffix="3f"><ptype>GLfloat</ptype> <name>r</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>g</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>b</name></param>
            <param kind="Coord" tsuffix="3f"><ptype>GLfloat</ptype> <name>x</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>y</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>z</name></param>
        </command>

Or, if lack of uniformity bothers you as much as it does me, maybe something more complicated:

        <command>
            <proto>void <name>glTexCoord2fColor3fVertex3fSUN</name></proto>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>s</name><decor tsuffix="2f"/></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>t</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>r</name><decor tsuffix="3f"/></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>g</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>b</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>x</name><decor tsuffix="3f"/></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>y</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>z</name></param>
        </command>

With <decor> tags possibly being used for more things in the future.

This looks like a flaw in naming the commands properly

Wait, are those 2 functions doing the exact same thing then?

NogginBops commented 9 months ago

Yeah I feel like this is getting very complicated, much more than it's worth. The old functions like glReplacementCodeuiColor3fVertex3fvSUN, glNormal3f, etc. should not and do not matter for anyone in the year 2023 and making a significantly more complicated solution for this arguably very small gain does not sit right with me.

Wait, are those 2 functions doing the exact same thing then?

glGetQueryBufferObjecti64v and glGetQueryBufferObjectui64v are not identical, they work by writing the buffer query result to a buffer directly. And the choice of function determines if the result is a signed integer or unsigned integer. Because it writes the result to the specified buffer it will not differ in it's function signature.

This is for me also another reason why this type of name mangling should be left to bindings libraries as they can decide what is an appropriate solution for their language and library.

Rytis-Stan commented 9 months ago

names like glReplacementCodeuiColor3fVertex3fvSUN

Right... That one extension...

In most cases, it shouldn't matter if it's an old extension, command, etc. The purpose of "gl.xml" and the other XML files is to specify all available versions of some API. If something is old an useless, then it shouldn't be specified. If it's left in, then it should be part of any XML improvements/changes.

Well, we cannot have the same attribute multiple times on the same tag. And if we make a single tidy="glTexCoordColorVertex" attribute, extracting the type suffixes will require some kind of string diff algorithm, even if simplistic. So how about this then:

        <command>
            <proto>void <name>glTexCoord2fColor3fVertex3fSUN</name></proto>
            <param kind="Coord" tsuffix="2f"><ptype>GLfloat</ptype> <name>s</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>t</name></param>
            <param kind="Color" tsuffix="3f"><ptype>GLfloat</ptype> <name>r</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>g</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>b</name></param>
            <param kind="Coord" tsuffix="3f"><ptype>GLfloat</ptype> <name>x</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>y</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>z</name></param>
        </command>

Or, if lack of uniformity bothers you as much as it does me, maybe something more complicated:

        <command>
            <proto>void <name>glTexCoord2fColor3fVertex3fSUN</name></proto>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>s</name><decor tsuffix="2f"/></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>t</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>r</name><decor tsuffix="3f"/></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>g</name></param>
            <param kind="Color"><ptype>GLfloat</ptype> <name>b</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>x</name><decor tsuffix="3f"/></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>y</name></param>
            <param kind="Coord"><ptype>GLfloat</ptype> <name>z</name></param>
        </command>

The first version of the specification looks better, I think. But both versions can cause issues in detecting the position of the specified tsuffix values inside the full command name. For example, in the name glMateriali, the tsuffix="i" might refer to one of two i characters. One way to handle this might be to go from the last character to the first, or something like that. But it's likely that there might be exceptions to that.

SunSerega commented 9 months ago

If something is old an useless, then it shouldn't be specified

That would be perfect. But backward compatibility is highly valued in this repo. And besides, this would require reaching multiple important people and having them confirm. So such an ideal cleanliness is most likely unreachable.

One way to handle this might be to go from the last character to the first, or something like that

Yeah, that's what I implied, but also you're right, it's not fully future-proof.

Rytis-Stan commented 9 months ago

Yeah I feel like this is getting very complicated, much more than it's worth. The old functions like glReplacementCodeuiColor3fVertex3fvSUN, glNormal3f, etc. should not and do not matter for anyone in the year 2023 and making a significantly more complicated solution for this arguably very small gain does not sit right with me.

I would also like to avoid making complicated solutions. But the fact that some old names are involved should be irrelevant. The specification XML file should be consistent for all of it's data as much as possible.

Wait, are those 2 functions doing the exact same thing then?

glGetQueryBufferObjecti64v and glGetQueryBufferObjectui64v are not identical, they work by writing the buffer query result to a buffer directly. And the choice of function determines if the result is a signed integer or unsigned integer. Because it writes the

Except that these commands are named incorrectly. All those i, f, i64v indicate the types of command parameters and not the semantics of how the functions themselves work. That should be part of the "main" name. The names should have been chosen in the same manner as glGetInteger64v. So glGetQueryBufferObjecti64v would be become glGetQueryBufferObjectInteger64, for example.

But, of course, that complicates matters when trying to choose a proper "nice" name for each command.

NogginBops commented 9 months ago

I would also like to avoid making complicated solutions. But the fact that some old names are involved should be irrelevant. The specification XML file should be consistent for all of it's data as much as possible.

This is too idealistic, the OpenGL api is very messy and has a ton of legacy. You could probably count the number of people using glReplacementCodeuiColor3fVertex3fvSUN on a single hand (if the number isn't 0). And there is zero new code that will use these functions. Wasting engineering time of every gl.xml consumer just because we needed to complicate the postfix stuff in gl.xml because of these kinds of functions is a bad idea.

I get the ideal, I really do, but I also think we need to see the reality of the situation and think about the issues of actual people using the specification. And no one, I repeat, no one is using these functions.

NogginBops commented 9 months ago

Except that these commands are named incorrectly. All those i, f, i64v indicate the types of command parameters and not the semantics of how the functions themselves work. That should be part of the "main" name. The names should have been chosen in the same manner as glGetInteger64v. So glGetQueryBufferObjecti64v would be become glGetQueryBufferObjectInteger64, for example.

Where does it say that the last part of the name is only for indicating the types of the command parameters? This design makes perfect sense from a c perspective. I agree that to be more consistent that it could be part of the "main" name, but they aren't. And because of backwards compatibility these names are not going to disappear. So we have to live with that.

oddhack commented 9 months ago

The XML schema does what we need it to do. OpenGL is not going to evolve in other than very minor, incremental ways at this point, and there is no upside for us in changing the schema for needs that are not what it was created for, with consequent disruption to the existing tooling and artifacts.

So if you require a different schema, please follow the open source process, fork the existing stuff, and do whatever you need to. It probably isn't hard to write a script to convert the C declarations, run the type-suffixed function names through a conversion function, etc. and run it over gl.xml every time it changes. If someone wrote such a script and that proved to have considerable interest, we would likely be willing to put it in a contrib/ directory. But we're not going to change the gl.xml that Khronos maintains in these ways.

I'm not going to close the issue myself, but I don't see a lot of point in your keeping it open, either.