kcat / alure

Alure is a utility library for OpenAL, providing a C++ API and managing common tasks that include file loading, caching, and streaming
zlib License
70 stars 20 forks source link

What is going on with the pimpl implementation ? #15

Closed Cazadorro closed 5 years ago

Cazadorro commented 6 years ago

I'm quite confused as to why pimpl is implemented as it is, ideally implementation classes of your interface class are declared privately to the interface class, and defined in the same cpp file as the interface class, or defined in separate h/cpp pairs. If we look for the corresponding Buffer for BufferImpl, I see it used, but it gets hard to find where it is actually declared in the repo. Often I see Impls defined with completely different objects which aren't even pimpl, defeating the purpose of using this pattern. Something more along the lines of this is what I was expecting. I do see some impls being used directly, I assume for performance reasons, in internal implementation, but I would expect to see double the number of source files in a typical implementation if one wanted to keep performance access to the implementation, and still doesn't explain most of the patterns I'm seeing.

What is the rational for this?

kcat commented 6 years ago

If we look for the corresponding Buffer for BufferImpl, I see it used, but it gets hard to find where it is actually declared in the repo.

The implementation class is declared in buffer.h, with the methods defined in buffer.cpp. Isn't this the typical pattern, a class declared in xyz.h that others can use and the non-inlined methods defined in xyz.cpp?

It's implemented as it is because various classes interact with each other, so need to know about how they work (e.g. Sources need to know the Buffer and AuxiliaryEffectSlot IDs, while SourceGroups and Sources need to know details about each other, and of course the Context needs to know how to build and manage them).

I do see some impls being used directly, I assume for performance reasons, in internal implementation, but I would expect to see double the number of source files in a typical implementation if one wanted to keep performance access to the implementation, and still doesn't explain most of the patterns I'm seeing.

I'm not sure what's confusing. The library's public classes are declared in the library header (alure2.h), while the implementation class is declared in its own internal header (buffer.h for BufferImpl, source.h for SourceImpl, etc) with the non-inline functions defined in the corresponding source file. The only real exception currently is the listener, which is declared and defined with the context due to historical reasons from before pImpl was used.

Cazadorro commented 6 years ago

The implementation class is declared in buffer.h, with the methods defined in buffer.cpp. Isn't this the typical pattern, a class declared in xyz.h that others can use and the non-inlined methods defined in xyz.cpp?

I tried to use Githubs repo search functionality and it failed, when searching Buffer it didn't look through alure2.h for some reason.... That being said, I do not believe the buffer interface should be fully defined in alure2.h, in fact I believe too much is going on in alure2.h. Following the currently set up pattern (entirely split interface and impl, not just impl defined in source) normally you would have a buffer.h, buffer.cpp, and then a bufferimpl.h and bufferimpl.cpp. If you want to find, or edit the interface and some one else is editing something completely different but in the same file, you are just going to run into a mess of merge conflicts.

The library's public classes are declared in the library header (alure2.h),

Normally, to keep the functionality of alure2.h, all alure2.h would be is a file of includes (and maybe some other light definitions like type aliases or forward declarations). GLM uses this pattern for example, as do many other projects, they don't have actual interfaces, implementation, or any dependencies for implementation defined there though.

From an outsiders perspective, where things are, and what to import is not obvious if you are developing for this project. Of course a lot of this stuff is fine to have from a single import if you are simply developing with the project.

The only real exception currently is the listener, which is declared and defined with the context due to historical reasons from before pImpl was used.

Ok, that makes sense, and I see batcher actually has to do with context so that makes sense, however things like #define F_PI and enum class AL seem a bit odd, I think there is already a macro for floating point pi and from the name of the enum class, I have no idea what it is supposed to be.

Of course, I'm not just complaining, I'm willing to do the kinds of refactors I'm proposing :D

kcat commented 6 years ago

Following the currently set up pattern (entirely split interface and impl, not just impl defined in source) normally you would have a buffer.h, buffer.cpp, and then a bufferimpl.h and bufferimpl.cpp.

I don't believe it's worth separating the definitions of the interface thunks from the implementation. The majority of the thunks are literally just two instructions (mov (%rdi),%rdi then jmp ... to the implementation function's address), so its best to be right before the address it jumps to to avoid icache pollution. Smaller implementation functions will actually inline into the thunk, despite not normally being a candidate for inlining, simply because the compiler can see the function body it would call to. Unfortunately neither GCC or Clang (at least versions I've tried of them) will reorder function bodies for efficient cache use in these cases, and link-time optimization is still in its infancy, so the functions need to be explicitly defined in the desired order for that to work.

Putting the interface declarations into their own header files is something more worth considering, though. Just be aware that they normally install into the include/AL directory with OpenAL headers, so they should be prefixed like alure2-whatever.h to make it clear what they belong to.

For the implementations, naming the files like bufferimpl.h and bufferimpl.cpp looks kind of ugly to me, especially since every source and private header file (except the decoders) would get the impl which makes it redundant.

Ok, that makes sense, and I see batcher actually has to do with context so that makes sense, however things like #define F_PI and enum class AL seem a bit odd, I think there is already a macro for floating point pi and from the name of the enum class, I have no idea what it is supposed to be.

There actually isn't a macro for pi as a single-precision float. There's M_PI from C, but that's double-precision (which means using it would promote the calculated value to a double). It would need to be casted to make it a float, which the F_PI macro implicitly does. I'm actually kind of surprised C++ doesn't seem to have its own STL-based method to specifying pi like it does for other values (like in numeric_limits), which would allow for getting it as a float, double, or long double type.

enum class AL is certainly a non-obvious name for what it is. I just wanted to keep it concise so that when the code checks for an extension, it's something like context->hasExtension(AL::SOFT_source_resampler); to check for the AL_SOFT_source_resampler extension, rather than a more verbose context->hasExtension(ALExtension::SOFT_source_resampler); call.

Cazadorro commented 6 years ago

I don't believe it's worth separating the definitions of the interface thunks from the implementation. ...

Oops, the interface shouldn't have implementation at all honestly, I agree with not separating the interface thunks from implementation, though I'm not convinced it would actually have a performance regardless.

Putting the interface declarations into their own header files is something more worth considering, though. Just be aware that they normally install into the include/AL directory with OpenAL headers, so they should be prefixed like alure2-whatever.h to make it clear what they belong to.

Huh, I didn't know that I'll have to keep that in mind.

For the implementations, naming the files like bufferimpl.h and bufferimpl.cpp looks kind of ugly to me, especially since every source and private header file (except the decoders) would get the impl which makes it redundant.

I just want there to be a way to differentiate between pimpl implementation sources and the actual interface file. If I follow the "alure_[interface_name].h" model then it might not matter, since presumably they won't have a need to have the alure prefix. It is slightly confusing though to go to buffer.h and not find Buffer.

It would need to be casted to make it a float, which the F_PI macro implicitly does

CUDA's math.h has this so I just assumed normal C++ had it, but yeah I can't find it anywhere. However, I still feel that this definition belongs somewhere else (util constants, or just in some util header) as it seems kind of ubiquitous (it may have a useful context outside of that file).

enum class AL is certainly a non-obvious name for what it is. I just wanted to keep it concise so that when the code checks for an extension, it's something like context->hasExtension(AL::SOFT_source_resampler); to check for the AL_SOFT_source_resampler extension, rather than a more verbose context->hasExtension(ALExtension::SOFT_source_resampler); call.

Ok, that makes sense however there are some extra options in between AL and ALExtensions.

I would personally rather see just ALExtension be used, because there are many other OpenAL values that could be enumerated the same way. It makes it clear what it is, when you program you don't have to do translation in your head what it stands for.

However since there are more patterns like this with OpenAL usage, I would just define a namespace AL, then put enum class Extension in that namespace. I see this same technique was used for a group of ALC enums, and you use the macros directly for AL and ALC errors, ALC_APIENTRY, and a tonne of other macros specifically from al or alc headers.

you would do AL::Extension::..... for AL and ALC::Extension for ALC extensions. You might have to make enumerations longer, but now you can have so much more attached to the name AL and it still be clear to understand.

kcat commented 6 years ago

Oops, the interface shouldn't have implementation at all honestly, I agree with not separating the interface thunks from implementation, though I'm not convinced it would actually have a performance regardless.

I really wish C++ could properly separate interface from implementation, without requiring the use of thunks and wrapper classes. But this pimpl implementation seems to be the way to do it with the least amount of overhead.

However, I still feel that this definition belongs somewhere else (util constants, or just in some util header) as it seems kind of ubiquitous (it may have a useful context outside of that file).

If there's enough related stuff to have a util header, that would make sense to to. I'd just prefer to avoid creating a header for something that's just one or two lines. Currently, F_PI is only used in source.cpp so it could be defined there. Otherwise, it would probably make sense to move into main.h as a miscellaneous thing.

Cazadorro commented 5 years ago

issues resolved in comments