Open R-Abbasi opened 9 months ago
1) Interoperability
#ifdef _WIN32
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT
#endif
#ifdef __cplusplus
extern "C" {
#endif
is needed to have a single .h file that can be used on both Windows and Linux.
2) #define UADI_DEFAULT_CHUNK_SIZE 128 * 1024
This should not be removed, since it is more a form of documenting what the default chunk size should be. This can be changed by someone writing their own library, but it stays for documentation reasons
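One caveat on the macro as written: its expansion is not parenthesized, so it can combine with neighbouring operators in unintended ways. A minimal sketch of the effect follows. Both macro names and both helper functions are hypothetical and exist only so the two forms can be compared side by side; they are not part of the UaDI header.

```c
#include <stddef.h>

/* Same expansion as the header's UADI_DEFAULT_CHUNK_SIZE (hypothetical name). */
#define UADI_EXAMPLE_CHUNK_SIZE_RAW 128 * 1024
/* Parenthesized variant (hypothetical name). */
#define UADI_EXAMPLE_CHUNK_SIZE_PAREN (128 * 1024)

/* Expands to total / 128 * 1024, which parses as (total / 128) * 1024. */
static size_t example_chunks_raw(size_t total) {
    return total / UADI_EXAMPLE_CHUNK_SIZE_RAW;
}

/* Expands to total / (128 * 1024), the intended number of chunks. */
static size_t example_chunks_paren(size_t total) {
    return total / UADI_EXAMPLE_CHUNK_SIZE_PAREN;
}
```

If the define stays for documentation, wrapping the value in parentheses would keep it safe to use in expressions as well.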
3) name mangling and extern "C"
Using extern "C" in a DLL header file, especially when the DLL is intended to be used with different programming languages, is crucial for ensuring compatibility and ease of integration. The primary purpose of extern "C" is to disable name-mangling.
Name-Mangling in C++: C++ compilers perform name-mangling, which is the modification of function names into unique names so that the linker can distinguish between different functions (especially those with the same name but different parameters). This process allows C++ to support features like function overloading and namespaces. However, the mangled names are often unreadable and vary between different compilers.
Consistency Across Compilers: Name-mangling is not standardized and can differ between compilers. This means a function named foo in your source code might be mangled differently by different C++ compilers, leading to inconsistencies when trying to link with the DLL.
Compatibility with C and Other Languages: Most other programming languages, including C, do not understand C++ mangled names. When a DLL function is exported with a mangled name, it becomes difficult or impossible for these languages to call it directly. By using extern "C", the function names are exported in a C-style (non-mangled) format, making it easier for other languages to interface with the DLL.
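Putting points 1) and 3) together, a header along these lines should compile as both C and C++ and export an unmangled symbol. This is only a sketch: `uadi_example_add` is a hypothetical function used to illustrate the pattern, not part of the UaDI interface, and its definition would normally live in a separate .c or .cpp file.

```c
#include <stdint.h>

#ifdef _WIN32
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT
#endif

#ifdef __cplusplus
extern "C" { /* C interface - do not mangle the symbols */
#endif

/* Hypothetical exported function, declared with C linkage. */
DLL_EXPORT int32_t uadi_example_add(int32_t a, int32_t b);

#ifdef __cplusplus
} /* extern "C" */
#endif

/* Definition (normally in the implementation file). */
DLL_EXPORT int32_t uadi_example_add(int32_t a, int32_t b) {
    return a + b;
}
```

Note that the opening `extern "C" {` needs the matching closing brace, guarded by the same `#ifdef __cplusplus`, at the end of the declarations.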
4) #define vs enum
Basically related to answer 3). When creating a DLL intended for use across multiple platforms and with different programming languages like Python, C, C++, Rust, and Fortran, it's often advisable to use #define for integer constants instead of enum.
Cross-Language Compatibility: Different programming languages have different ways of handling enumerations (enum). For instance, Python and Fortran do not natively understand C-style enum. When you use #define to create integer constants, these constants are easily mapped across languages since they are essentially integers, a universal data type.
Compiler and Platform Variability: The representation of enum types is not strictly defined in C and can vary between compilers and platforms. This variability can lead to inconsistencies in the size and underlying type of the enum, which can cause issues when a DLL is used in environments with different compilers or on different platforms. On the other hand, integer constants defined with #define are more predictable and consistent across platforms and compilers.
Binary Compatibility: When exchanging data between different modules, such as a DLL and an application written in another language, having a consistent, predictable data type is crucial. #define integer constants ensure that the size and format of the data remain consistent, which is not always guaranteed with enum due to potential variations in size and representation.
Interfacing with Standard Libraries: Standard libraries in various languages, including C and C++, often use integer constants for error codes and similar purposes. Aligning our DLL's interface with this convention can make it easier to integrate with existing codebases and libraries.
While enum can be a clean and type-safe way to define constants in a single-language environment, the cross-language, cross-platform nature of DLLs used in our ecosystems makes #define integer constants a more practical and robust choice. This approach maximizes compatibility, minimizes potential issues related to compiler and language differences, and provides a straightforward interface for developers across various programming languages, especially Python and Fortran.
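As a rough illustration of the approach in 4), the status type can be pinned to a fixed-width integer and the codes defined as plain constants. All names and values below are hypothetical placeholders; the real UaDI status codes may differ.

```c
#include <stdint.h>

/* Hypothetical status type: fixed at 32 bits on every compiler and platform. */
typedef int32_t uadi_example_status;

/* Hypothetical status codes as plain integer constants. */
#define UADI_EXAMPLE_OK      0
#define UADI_EXAMPLE_ERROR   1
#define UADI_EXAMPLE_TIMEOUT 2

/* For comparison: the width of this type is chosen by the compiler. */
enum uadi_example_status_enum { EXAMPLE_ENUM_OK, EXAMPLE_ENUM_ERROR };

/* Returns 1 for any code other than UADI_EXAMPLE_OK, otherwise 0. */
static int uadi_example_is_error(uadi_example_status s) {
    return s != UADI_EXAMPLE_OK;
}
```

A binding in another language can then declare the status as a 32-bit signed integer without having to know how a particular C compiler sizes its enums.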
Any additions @odinthenerd?
Thanks for the comments.
1) That's right - not sure why I forgot that.
2) It's possible to implement the new version in C++ (e.g., UaDI.cpp), create a shared library for it, then load and use its exported symbols in a cross-platform and cross-language manner, I guess. I thought this was/is the purpose.
However, it's also possible to offer the interface for implementation and usage not only in C++ but in other languages as well, in which case the current version makes more sense. Is that the purpose?
3) I wrote extern "C" { as a required part, with a comment on what it does: // C interface - Do not mangle the symbols. Sorry that this has apparently caused a misunderstanding.
C++ does not have a standardized ABI, and few languages can consume C++ mangled names or data layouts, because the standard leaves them "implementation defined". Interoperation therefore only works if both languages use the same underlying compiler (Rust or D can talk to C++ directly, but only when the same compiler compiled both). The entire uadi interface should therefore be based on the C ABI; otherwise it would be quite hard to interface with Python, for example.
One final change I would suggest is adding a data_length member to uadi_receive_struct. This should make implementing higher-level layers of communication simpler, as they would otherwise each have to implement their own length field.
struct uadi_receive_struct {
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    size_t data_length;
    uadi_status status;
};
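To show how a higher layer might use the extra member, here is a small sketch. It rests on several assumptions that the thread does not confirm: `uadi_chunk_ptr` is taken to be a byte pointer, `uadi_status` a 32-bit integer, and `data_length` the number of valid bytes behind `datapack_ptr`. `example_payload_sum` is a hypothetical consumer, not part of the proposed interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the real UaDI typedefs. */
typedef uint8_t *uadi_chunk_ptr;
typedef int32_t uadi_status;

struct uadi_receive_struct {
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    size_t data_length; /* assumed: valid bytes behind datapack_ptr */
    uadi_status status;
};

/* Hypothetical consumer: sums the first data_length payload bytes. */
static uint32_t example_payload_sum(const struct uadi_receive_struct *r) {
    uint32_t sum = 0;
    if (r == NULL || r->datapack_ptr == NULL) {
        return 0;
    }
    for (size_t i = 0; i < r->data_length; ++i) {
        sum += r->datapack_ptr[i];
    }
    return sum;
}
```

With the length carried in the struct, the consumer does not need a separate framing convention to know where the payload ends.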
In 2), by "... to implement the new version in C++ ...", I meant implementation in the sense of making definitions for the declarations (UaDI.h) in a source file (CUaDI.cpp) in C++, that is, using C++ to make definitions for the functions declared in the interface.
A rather simplified version of template.h. Do you see any clear problem that I don't, please?