Suggestion for UaDI template.h

R-Abbasi commented 9 months ago

A rather simplified version of template.h. Do you see any clear problem that I don't, please?

/**
 * @file UaDI_template.h
 * @brief This header file defines the API for interacting with various types of data producers.
 * @author Stephan Bökelmann
 * @email sboekelmann@ep1.rub.de

 * This file defines the API for interacting with various types of data
 * producers. It includes functions for initializing the library, enumerating
 * available data producers, claiming and releasing devices, managing data 
 * chunks, and waiting for data. Detailed error codes and data management 
 * policies are provided for robust integration.
 */

#pragma once

#include <stddef.h>  /* size_t */

/**
 * @brief Handle for the connection-id to the library instance.
 * @see uadi_init(...)
 * @see uadi_deinit(...)
 *
 * This is a void pointer to an instance that implements the DLL interface.
 * The consumer is responsible for allocating and deallocating memory for this
 * handle. A valid handle can be obtained by calling uadi_init(...) with a 
 * pointer to this memory location.
 */

typedef void* uadi_lib_handle;

/** 
 * @brief Handle for a device instance.
 * @see uadi_claim_device(...)
 * @see uadi_release_device(...)
 *
 * This pointer points to an instance that implements the interface of an 
 * abstract data producer. The library is responsible for managing the 
 * lifecycle of this handle. It gets created by claiming a device and gets 
 * destroyed after a device has been released by the consumer. If the device 
 * still holds data when it is being released, it will trigger the receive 
 * callback, until the data is consumed.
 */

typedef void* uadi_device_handle;

/** 
 * @brief Pointer to a chunk of memory.
 * @see uadi_init(...)
 * @see uadi_push_chunks(...)
 *
 * A chunk of memory in the terminology of UaDI is an already allocated piece of 
 * memory that gets created and destroyed by the consumer. It is meant to be 
 * the container of any information that is larger than a status code. It gets 
 * passed to the library by handing over a pointer to the already allocated 
 * piece as well as the number of chunks that are contiguously allocated after 
 * the pointer. The library is supposed to handle chunks in such a manner that 
 * not all chunks have to be allocated contiguously, but multiple chunks can 
 * only be given to the library at once if they are allocated contiguously. 
 *
 * Each UaD-Library is allowed to define its own chunk size; thus an 
 * initialization protocol is needed. The consumer is responsible for 
 * allocating and deallocating these chunks. These chunks are passed to the 
 * library via the uadi_push_chunks(...) function.
 */

typedef unsigned char* uadi_chunk_ptr;
#define UADI_DEFAULT_CHUNK_SIZE 128 * 1024  // can be removed too

/**
 * @brief Status code for uadi_receive_callback.
 * @see uadi_receive_struct
 * This status code is part of the uadi_receive_struct.
 * The consumer is responsible for checking the status code before handling the 
 * pointers to the received data.
 * Status Codes:
 * - UADI_SUCCESS: 1
 * - UADI_ERROR: -1
 */

typedef int uadi_status;

/** 
 * @brief Structure to receive data from the library.
 * @see uadi_receive_callback(...)
 *
 * This structure is used when receiving chunks of data from the library.
 * It contains pointers to information and data packets. The format of data 
 * packets is an array of floats. Information packets are JSON strings.
 */

struct uadi_receive_struct{
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    uadi_status status;
};

// Macro error codes? Where are they used, in "template.cpp"? Then why not use an "enum" there? Simply tell me, please. 

/**
 * @brief Callback function for receiving data from the library.
 * @see uadi_receive_struct(...)
 * @see uadi_register_receive_callback(...)
 * This function is defined by the consumer, and called from the library when 
 * data is available. The function needs to be implemented by the consumer in a 
 * way, that it can handle a pointer to the received data. The void pointer is 
 * used by the consumer to provide context for the function. It might be a
 * pointer to a queue for example.
 */

typedef void(*uadi_receive_callback)(struct uadi_receive_struct*, void*);

/**
* @brief Callback function for recycling unused chunks.
* @see uadi_release_device(...)
* This function is defined by the consumer, and called from the library in 
* order to recycle unused chunks back to the consumer. The function needs to
* know the context, therefore it also takes a void* to the consumer's context.
* Even though the context may be the same as the pointer for the 
* receive_callback, it can be used separately. 
*/

typedef void(*uadi_recycle_unused_chunk_callback)(uadi_chunk_ptr, size_t, void*);

/**
 * @brief Initializes the library and fills a preallocated empty handle with an actual library handle.
 * @param lib_handle Pointer to the preallocated library handle.
 * @return uadi_status Status code of the operation.
 * @see uadi_deinit(...)
 * The library handle is used internally by the library to keep track of the 
 * connection. This way, the library can handle multiple connections from 
 * different consumers. 
 * The consumer needs to keep the library handle and use it with other calls, 
 * as long as there hasn't been a device claimed. As soon as a device has been 
 * claimed, the device handle implicitly also holds the library handle.
 * The library handle must remain valid after a device has been released, 
 * until the consumer calls uadi_deinit(...). This is required in order to 
 * keep RAII intact.
 */

  extern "C" {  // C interface - Do not mangle the symbols 

  DLL_EXPORT uadi_status uadi_init(uadi_lib_handle* lib_handle);

/**
 * @brief This function fills a preallocated chunk of memory with JSON-formatted meta-data from the library itself. 
 * @param lib_handle Pointer to the library handle.
 * @param meta_data Pointer to the preallocated memory for the meta-string.
 * @param meta_data_size Size of the preallocated memory for the meta-string.
 * @return uadi_status Status code of the operation.
 * Meta-data can include all kinds of data, such as device information, version 
 * information, etc.
 * It shall not exceed 128KB in size, even though it is not enforced by the 
 * library. One could potentially have a longer JSON string than this and the 
 * call would fail with UADI_BUFFER_TOO_SMALL. In that case, the consumer would 
 * have to call the function again with a larger chunk of memory.
 * A consumer is not required to call this function.
 */

  DLL_EXPORT uadi_status uadi_get_meta_data(uadi_lib_handle lib_handle, char* meta_data,
                                            size_t meta_data_size);
/**
 * @brief This function enumerates all available data producer devices.
 * @param lib_handle Pointer to the library handle.
 * @param device_list Pointer to preallocated charbuffer where device list shall be stored.
 * @param device_list_size Size of the preallocated charbuffer.
 * @return uadi_status Status code of the operation.
 * @see uadi_claim_device(...)
 * @see uadi_release_device(...)
 * The library is viewed as the producer; however, the producer may include 
 * several devices. The consumer needs to be aware of these devices and claim 
 * one to receive its data. A device is claimed exclusively, meaning that only 
 * one consumer at a time can claim it. The received device list is a 
 * JSON-formatted string, containing all available devices.
 */

  DLL_EXPORT uadi_status uadi_enumerate(uadi_lib_handle lib_handle, char* device_list,
                                        size_t device_list_size);
/**
 * @brief This function claims a data producer device.
 * @param lib_handle Pointer to the library handle.
 * @param device_handle Pointer to the preallocated empty device handle.
 * @param device_key Pointer to a zero-terminated array of characters containing the device key.
 * @param receive_callback Pointer to the callback function.
 * @param receive_context Pointer to the consumer's context for the receive callback.
 * @param recycle_callback Pointer to the recycle callback function.
 * @param recycle_context Pointer to the consumer's context for the recycle callback.
 * @param chunk_array Pointer to the preallocated chunk array.
 * @param chunk_count Number of chunks in the chunk array.
 * @return uadi_status Status code of the operation.
 * @see uadi_enumerate(...)
 * @see uadi_release_device(...)
 * @see uadi_push_chunks(...)
 * This function is the heart of the measurement process.
 * It is used by the consumer to properly claim and set up a device.
 * In order for the device to function, it needs memory to store received data 
 * from the device, as well as a routine from the consumer, that is called when 
 * new data is available.
 * The consumer needs to keep the device handle and use it with other calls. 
 * The device handle implicitly also holds the library handle.
 * The device handle is an exclusive handle, meaning, that only one consumer at 
 * a time can claim it. Leaking the handle will result in a loss of the claimed 
 * device.
 * The callback function is called whenever a new chunk from the device is 
 * available. A device can't be released as long as there is available data 
 * from the device. The release function will stop the new acquisition of data, 
 * but will make sure, that the callback function is called with all available 
 * data.
 * A device may also give back unused chunks to the consumer, this is done
 * by using the recycle function.
 * The user data pointer is used by the consumer to provide context for the 
 * function. It might be a pointer to a queue for example.
 */

  DLL_EXPORT uadi_status uadi_claim_device(uadi_lib_handle lib_handle, uadi_device_handle* device_handle, 
                                           char const* device_key, 
                                           uadi_receive_callback receive_callback, 
                                           void* receive_context,
                                           uadi_recycle_unused_chunk_callback recycle_callback,
                                           void* recycle_context,
                                           uadi_chunk_ptr* chunk_array, 
                                           size_t chunk_count);
/**
 * @brief This function is used to push chunks of memory to a device.
 * @param device_handle Pointer to the device handle.
 * @param chunk_array Pointer to the preallocated chunk array.
 * @param chunk_count Number of chunks in the chunk array.
 * @return uadi_status Status code of the operation.
 * The push chunks function will hand over chunks of memory to a device inside 
 * the library. Any data that is stored in the chunk will be overwritten by the 
 * device.
 */
  DLL_EXPORT uadi_status uadi_push_chunks(uadi_device_handle device_handle, 
                                          uadi_chunk_ptr* chunk_array, 
                                          size_t chunk_count);
/**
 * @brief This function sends a JSON-formatted string to a device.
 * @param device_handle the device handle.
 * @param chunk_ptr Pointer to a JSON-filled chunk of memory.
 * @return uadi_status Status code of the operation.
 * This function can be used to send control data to a device. 
 * Which control data is allowed is not part of the generic interface.
 * If a device is attached that doesn't support any control data, this function
 * will return UADI_NOT_SUPPORTED.
 */

  DLL_EXPORT uadi_status uadi_send_json(uadi_device_handle device_handle, uadi_chunk_ptr chunk_ptr);

/**
 * @brief This function releases a device.
 * @param device_handle Pointer to the device handle.
 * @return uadi_status Status code of the operation.
 * @see uadi_register_receive_callback(...)
 * @see uadi_claim_device(...)
 * After a consumer is done with the device, it has to release it. The release 
 * function will stop the acquisition of new data from the device and will make 
 * sure that the callback function is called with all remaining chunks in the 
 * device's queue. Empty chunks will be propagated back to the consumer as 
 * infopacks, containing nothing but a terminating zero.
 */

  DLL_EXPORT uadi_status uadi_release_device(uadi_device_handle device_handle);

/**
 * @brief This function deinitializes the library.
 * @param lib_handle Pointer to the library handle.
 * @return uadi_status Status code of the operation.
 * @see uadi_init(...)
 * After the library is deinitialized, it is no longer usable.
 */

  DLL_EXPORT uadi_status uadi_deinit(uadi_lib_handle lib_handle);
} // extern "C"
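
For orientation, the call order described in the comments above (init, claim, release with a final flush through the receive callback, deinit) could look roughly like this from the consumer's side. This is only a sketch under assumptions: the function bodies are hypothetical stubs so that the example links without a real UaDI library, and the device key "demo-device" and CHUNK_SIZE are made up for illustration. The status values 1 and -1 are the ones listed in the uadi_status comment.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void* uadi_lib_handle;
typedef void* uadi_device_handle;
typedef unsigned char* uadi_chunk_ptr;
typedef int uadi_status;
#define UADI_SUCCESS 1
#define UADI_ERROR (-1)
#define CHUNK_SIZE (128 * 1024) /* assumed chunk size for this sketch */

struct uadi_receive_struct {
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    uadi_status status;
};
typedef void (*uadi_receive_callback)(struct uadi_receive_struct*, void*);
typedef void (*uadi_recycle_unused_chunk_callback)(uadi_chunk_ptr, size_t, void*);

/* ---- Hypothetical stubs standing in for a real library ---- */
static int stub_lib, stub_dev;          /* dummy objects the handles point at */
static uadi_receive_callback stub_cb;   /* callback remembered at claim time */
static void* stub_ctx;
static uadi_chunk_ptr stub_chunk;

uadi_status uadi_init(uadi_lib_handle* lib_handle) {
    *lib_handle = &stub_lib;
    return UADI_SUCCESS;
}

uadi_status uadi_claim_device(uadi_lib_handle lib_handle, uadi_device_handle* device_handle,
                              char const* device_key,
                              uadi_receive_callback receive_callback, void* receive_context,
                              uadi_recycle_unused_chunk_callback recycle_callback,
                              void* recycle_context,
                              uadi_chunk_ptr* chunk_array, size_t chunk_count) {
    (void)lib_handle; (void)device_key; (void)recycle_callback; (void)recycle_context;
    if (chunk_count == 0) return UADI_ERROR;
    *device_handle = &stub_dev;
    stub_cb = receive_callback;
    stub_ctx = receive_context;
    stub_chunk = chunk_array[0];
    return UADI_SUCCESS;
}

uadi_status uadi_release_device(uadi_device_handle device_handle) {
    /* The stub pretends exactly one packet is still pending and flushes it. */
    struct uadi_receive_struct pending = { NULL, stub_chunk, UADI_SUCCESS };
    (void)device_handle;
    stub_cb(&pending, stub_ctx);
    return UADI_SUCCESS;
}

uadi_status uadi_deinit(uadi_lib_handle lib_handle) {
    (void)lib_handle;
    return UADI_SUCCESS;
}

/* ---- Consumer side ---- */
static void on_receive(struct uadi_receive_struct* received, void* context) {
    /* Check the status before touching the pointers, then count the packet. */
    if (received->status == UADI_SUCCESS) ++*(int*)context;
}

static void on_recycle(uadi_chunk_ptr chunk, size_t count, void* context) {
    (void)chunk; (void)count; (void)context;
}

/* Runs the whole lifecycle once; returns the number of packets received. */
int run_lifecycle(void) {
    uadi_lib_handle lib = NULL;
    uadi_device_handle dev = NULL;
    int received = 0;
    uadi_chunk_ptr chunk = (uadi_chunk_ptr)malloc(CHUNK_SIZE);
    if (chunk == NULL) return -1;
    if (uadi_init(&lib) != UADI_SUCCESS) { free(chunk); return -1; }
    if (uadi_claim_device(lib, &dev, "demo-device", on_receive, &received,
                          on_recycle, NULL, &chunk, 1) == UADI_SUCCESS) {
        uadi_release_device(dev); /* stops acquisition, flushes pending data */
    }
    uadi_deinit(lib);
    free(chunk); /* the consumer owns the chunk memory */
    return received;
}
```

With these stubs, run_lifecycle() sees exactly one packet, delivered during release.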

MaxClerkwell commented 8 months ago

1) Interoperability

#ifdef _WIN32
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT
#endif

#ifdef __cplusplus
extern "C" {
#endif

This is needed to have a single .h file that can be used on both Windows and Linux.
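
As a sketch of how the guards fit together: the conditional opening of extern "C" needs an equally conditional closing brace at the end of the header, otherwise a plain C compiler sees a stray `}`. The two declarations are taken from the header above; the stub definitions at the bottom are hypothetical and only there so that the example links on its own.

```c
#include <stddef.h>

#ifdef _WIN32
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT
#endif

#ifdef __cplusplus
extern "C" {
#endif

typedef void* uadi_lib_handle;
typedef int uadi_status;

DLL_EXPORT uadi_status uadi_init(uadi_lib_handle* lib_handle);
DLL_EXPORT uadi_status uadi_deinit(uadi_lib_handle lib_handle);

#ifdef __cplusplus
} /* extern "C" */
#endif

/* Hypothetical stub definitions; a real library would provide these. */
static int stub_instance;

uadi_status uadi_init(uadi_lib_handle* lib_handle) {
    *lib_handle = &stub_instance;
    return 1; /* UADI_SUCCESS as documented in the header */
}

uadi_status uadi_deinit(uadi_lib_handle lib_handle) {
    (void)lib_handle;
    return 1;
}
```

The same text compiles as C and as C++, since the extern "C" lines only exist when __cplusplus is defined.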

2) #define UADI_DEFAULT_CHUNK_SIZE 128 * 1024: This should not be removed, since it documents what the default chunk size should be. Someone writing their own library can change it, but it stays for documentation reasons.
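
One small thing to consider if the macro stays: without parentheses the expansion can bind unexpectedly in expressions such as a division. The sketch below uses two hypothetical macro names (CHUNK_SIZE_BARE and CHUNK_SIZE_PAREN, not part of the header) to show the difference.

```c
#include <stddef.h>

#define CHUNK_SIZE_BARE 128 * 1024     /* as currently written in the header */
#define CHUNK_SIZE_PAREN (128 * 1024)  /* parenthesized variant */

/* Expands to bytes / 128 * 1024, i.e. (bytes / 128) * 1024. */
size_t chunks_bare(size_t bytes) {
    return bytes / CHUNK_SIZE_BARE;
}

/* Expands to bytes / (128 * 1024), the intended chunk count. */
size_t chunks_paren(size_t bytes) {
    return bytes / CHUNK_SIZE_PAREN;
}
```

For a 256 KiB buffer, chunks_paren gives 2, while chunks_bare gives 2097152.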

3) Name mangling and extern "C": Using extern "C" in a DLL header file, especially when the DLL is intended to be used with different programming languages, is crucial for ensuring compatibility and ease of integration. The primary purpose of extern "C" is to disable name-mangling.

Name-Mangling in C++: C++ compilers perform name-mangling, which is the modification of function names into unique names so that the linker can distinguish between different functions (especially those with the same name but different parameters). This process allows C++ to support features like function overloading and namespaces. However, the mangled names are often unreadable and vary between different compilers.
Consistency Across Compilers: Name-mangling is not standardized and can differ between compilers. This means a function named foo in your source code might be mangled differently by different C++ compilers, leading to inconsistencies when trying to link with the DLL.
Compatibility with C and Other Languages: Most other programming languages, including C, do not support name-mangling. When a DLL function is exported with mangled names, it becomes difficult or impossible for these languages to call these functions directly. By using extern "C", the function names are exported in a C-style (non-mangled) format, making it easier for other languages to interface with the DLL.

4) #define vs enum: Basically related to answer 3). When creating a DLL intended for use across multiple platforms and with different programming languages like Python, C, C++, Rust, and Fortran, it's often advisable to use #define for integer constants instead of enum.

Cross-Language Compatibility: Different programming languages have different ways of handling enumerations (enum). For instance, Python and Fortran do not natively understand C-style enum. When you use #define to create integer constants, these constants are easily mapped across languages since they are essentially integers, a universal data type.
Compiler and Platform Variability: The representation of enum types is not strictly defined in C and can vary between compilers and platforms. This variability can lead to inconsistencies in the size and underlying type of the enum, which can cause issues when a DLL is used in environments with different compilers or on different platforms. On the other hand, integer constants defined with #define are more predictable and consistent across platforms and compilers.
Binary Compatibility: When exchanging data between different modules, such as a DLL and an application written in another language, having a consistent, predictable data type is crucial. #define integer constants ensure that the size and format of the data remain consistent, which is not always guaranteed with enum due to potential variations in size and representation.
Interfacing with Standard Libraries: Standard libraries in various languages, including C and C++, often use integer constants for error codes and similar purposes. Aligning our DLL's interface with this convention can make it easier to integrate with existing codebases and libraries.

While enum can be a clean and type-safe way to define constants in a single-language environment, the cross-language, cross-platform nature of DLLs used in our ecosystems makes #define integer constants a more practical and robust choice. This approach maximizes compatibility, minimizes potential issues related to compiler and language differences, and provides a straightforward interface for developers across various programming languages, especially Python and Fortran.
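
A minimal sketch of that convention, assuming only the two status values documented in the header comment (UADI_SUCCESS = 1, UADI_ERROR = -1). The helper uadi_ok is hypothetical and only illustrates that the status crosses the boundary as a plain int.

```c
#include <stddef.h>

/* The status type is a plain int, so its size is the same as int on every
 * platform, whereas the underlying type of an enum is chosen by the
 * implementation. */
typedef int uadi_status;

#define UADI_SUCCESS 1
#define UADI_ERROR (-1)

/* Hypothetical helper: returns 1 if the status signals success, 0 otherwise. */
int uadi_ok(uadi_status status) {
    return status == UADI_SUCCESS;
}
```

A binding in another language then only has to declare the return type as a C int and compare it against the same integer values.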

Any additions @odinthenerd?

R-Abbasi commented 8 months ago

Thanks for the comments.

1) That's right - not sure why I forgot that.

2) It's possible to implement the new version in C++ (e.g., UaDI.cpp), create a shared library for it, then load and use its exported symbols in a cross-platform and cross-language manner, I guess. I thought this was the purpose. However, it's also possible to offer the interface for implementation and usage not only in C++ but in other languages as well, in which case the current version makes more sense. Is that the purpose, please?

3) I wrote extern "C" { as a required part with the comment on what it does, // C interface - Do not mangle the symbols. Sorry it has caused a misunderstanding, apparently.

odinthenerd commented 8 months ago

C++ does not have a standardized ABI, and few languages can consume C++ mangled names or data layouts, because these are "implementation defined" in the standard. Direct interop therefore only works if both languages use the same underlying compiler (Rust or D can talk to C++ directly, but only when the same compiler compiled both). The entire uadi interface should therefore be based on the C ABI; otherwise it would be quite hard to interface with Python, for example.

odinthenerd commented 8 months ago

One final change I would suggest would be adding a data_length variable to the uadi_receive_struct. This should make implementation of higher level layers of communication simpler as they would otherwise all have to implement a length.

struct uadi_receive_struct{
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    size_t data_length;
    uadi_status status;
};
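
To illustrate how a consumer might use the extra field, here is a sketch of a receive callback that walks the float payload. It assumes data_length counts bytes (the proposal does not say whether it is bytes or elements), and sum_samples is a hypothetical callback name.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char* uadi_chunk_ptr;
typedef int uadi_status;
#define UADI_SUCCESS 1

struct uadi_receive_struct {
    uadi_chunk_ptr infopack_ptr;
    uadi_chunk_ptr datapack_ptr;
    size_t data_length;  /* ASSUMPTION: payload size in bytes */
    uadi_status status;
};

/* Receive callback: adds all float samples of the data packet to the
 * double that the context pointer refers to. */
void sum_samples(struct uadi_receive_struct* received, void* context) {
    double* total = (double*)context;
    size_t count, i;
    if (received == NULL || received->status != UADI_SUCCESS ||
        received->datapack_ptr == NULL) {
        return; /* nothing usable in this packet */
    }
    count = received->data_length / sizeof(float);
    for (i = 0; i < count; ++i) {
        float sample;
        /* memcpy avoids alignment and aliasing problems on a byte buffer */
        memcpy(&sample, received->datapack_ptr + i * sizeof(float), sizeof sample);
        *total += sample;
    }
}
```

Without data_length, the callback would need the length from somewhere else, for example by parsing it out of the infopack.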

R-Abbasi commented 8 months ago

In 2), by "... to implement the new version in C++ ...", I meant the implementation in the sense of making definitions for declarations.

interface (UaDI.h) in C
implementation (UaDI.cpp) in C++ - that is, using C++ to make definitions for the functions declared in the interface

skunkforce / unified-abstract-dataproducer-template

Suggestion for UaDI template.h #10