twinbasic / lang-design

Language Design for twinBASIC

Allow imports of .h files for external declarations #39

Open bclothier opened 2 years ago

bclothier commented 2 years ago

To make tB easier to use with external APIs, especially those that are written in C/C++ and are not COM-enabled, it would be nice to be able to add a .h file as a reference, which can then be translated into the equivalent of a named module with Declare, Enum, and Const statements. Alternatively, there could be an IDE feature to auto-generate a tB module from a given .h file plus added metainformation (e.g. inputs for the Lib clause of the Declare statement).

That would eliminate the tedious translation from C syntax to tB syntax, especially if you want to import a large API from some other C/C++ project. Such support would make it easier for tB codebases to leverage any C/C++ library with minimal maintenance effort. Only the C syntax needs to be supported, since exported functions from a C++ project would have to be wrapped in an extern "C" block anyway.

In cases where multiple translations are possible (a good example would be the RtlCopyMemory Win32 function), the autogeneration should default to the broadest interpretation possible (e.g. provide Any for the 1st and 2nd parameters), since any other translations are usually more specialized and best done by hand rather than autogenerated.
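To make the idea concrete, here is a minimal sketch in Python of what such a generator might do for a single self-contained prototype. Everything in it is an assumption made for illustration: the function names, the tiny type table, the hypothetical `copy_bytes` function and `mylib.dll`, and the exact shape of the emitted Declare line. Pointer parameters default to `ByRef ... As Any`, following the "broadest interpretation" rule above, and anything the table does not know is rejected rather than guessed.

```python
import re

# Hypothetical, deliberately tiny C-to-tB type table; real headers need far more.
BYVAL_TYPES = {
    "short": "Integer",
    "int": "Long",
    "long": "Long",
    "unsigned long": "Long",
    "double": "Double",
    "size_t": "LongPtr",
}


def split_decl(text):
    """Split 'const void* src' into its type ('const void*') and name ('src')."""
    match = re.match(r"^(.*?)(\w+)$", text.strip())
    if match is None or not match.group(1).strip():
        raise ValueError(f"cannot parse declaration: {text!r}")
    return match.group(1).strip(), match.group(2)


def to_tb_type(c_type):
    """Return (tB type, is_pointer); every pointer takes the broadest reading."""
    is_pointer = "*" in c_type
    base = " ".join(t for t in c_type.replace("*", " ").split() if t != "const")
    if is_pointer:
        return "Any", True
    if base not in BYVAL_TYPES:
        raise ValueError(f"no mapping for C type: {base!r}")
    return BYVAL_TYPES[base], False


def translate_prototype(prototype, dll_name):
    """Turn one C prototype plus a user-supplied DLL name into a Declare line."""
    text = prototype.strip().rstrip(";").strip()
    head, _, rest = text.partition("(")
    params_text = rest.rsplit(")", 1)[0].strip()
    ret_type, func_name = split_decl(head)

    params = []
    if params_text not in ("", "void"):
        for c_param in params_text.split(","):
            c_type, name = split_decl(c_param)
            tb_type, is_pointer = to_tb_type(c_type)
            passing = "ByRef" if is_pointer else "ByVal"
            params.append(f"{passing} {name} As {tb_type}")

    signature = f'{func_name} Lib "{dll_name}" ({", ".join(params)})'
    if ret_type == "void":
        return f"Public Declare PtrSafe Sub {signature}"
    tb_ret, is_pointer = to_tb_type(ret_type)
    if is_pointer:
        tb_ret = "LongPtr"  # assumption: hand raw pointer results back as LongPtr
    return f"Public Declare PtrSafe Function {signature} As {tb_ret}"


if __name__ == "__main__":
    c_source = "void copy_bytes(void* dst, const void* src, size_t count);"
    print(translate_prototype(c_source, "mylib.dll"))
```

The DLL name is passed in by the caller because the prototype alone does not carry it. A more specialized translation (for example, a typed pointer parameter) would still be written by hand.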

WaynePhillipsEA commented 2 years ago

The main problem I see is that most C/C++ header files are not self-contained: they often rely on #includes of other header files, many levels deep. The Windows SDK headers also make heavy use of C macros, so properly interpreting them basically requires a full C compiler.

A further issue is that DLL declarations in header files are only symbols... they themselves don't tell you which DLL library they are to be found in (unlike in VBx, which contains the function signature and DLL entry point information on the same line). In C/C++ it is the linked-in .LIB files that contain the information on how a symbol refers to a specific DLL entry point. So not only do you need to have a C compiler, but also something that understands LIB files (i.e. a linker).

I actually do have a C/C++ compiler and linker that I wrote many years ago for another project, and it's capable of fully interpreting the Windows SDK. So whilst this is a difficult proposition, it's actually quite possible we could do something like this.

bclothier commented 2 years ago

I see. TBH, I was envisioning something simpler... treat a .h file as if it were self-contained by asking the user to provide the DLL name for the .h file, since we need it anyway for the Lib clause of the Declare statements. That way, it can be easily represented as a single module, making it easier to map between the external declarations and the source header files. However, I hadn’t considered the macros gumming up the parsing of the symbols.

WaynePhillipsEA commented 2 years ago

Most header files can't be taken in isolation. It's not just about the preprocessor macros... you also need to know about all those fancy typedef's and structs that are defined in those #include'd header files. You really do need a C/C++ compiler to be able to interpret header files properly. Don't take my word for it; have a look inside the Windows SDK header files... if you dare!
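A rough way to see the problem is to list every type a header uses but does not define itself. The Python sketch below is only an illustration under heavy assumptions: it handles one-line prototypes and simple one-line typedefs, ignores structs and the preprocessor entirely, and the header text, `basetypes.h`, `open_thing` and `close_thing` are all made up.

```python
import re

# Words that are part of C itself and therefore never need a definition.
C_BUILTINS = {"void", "char", "short", "int", "long", "float", "double",
              "signed", "unsigned", "const"}


def type_tokens(decl):
    """Identifiers that make up the type part of 'const FOO* name'."""
    words = re.findall(r"\w+", decl)
    return words[:-1]  # the last identifier is the declared name


def unresolved_types(header_text):
    """Names used in prototypes that this header does not define itself."""
    # Only simple one-line typedefs such as 'typedef unsigned long FLAGS;'.
    defined = set(re.findall(r"typedef\s+[^;{}]+?\b(\w+)\s*;", header_text))
    used = set()
    for line in header_text.splitlines():
        line = line.strip()
        if line.startswith(("typedef", "#")) or not line.endswith(");"):
            continue  # not a one-line function prototype
        head, _, rest = line.partition("(")
        params = rest[: rest.rindex(")")]
        for decl in [head] + params.split(","):
            used.update(type_tokens(decl))
    return sorted(used - defined - C_BUILTINS)


# Made-up header: HANDLE, LPCWSTR and the WINAPI macro live in other files.
HEADER = '''
#include "basetypes.h"
typedef unsigned long FLAGS;
HANDLE WINAPI open_thing(LPCWSTR path, FLAGS flags);
int close_thing(HANDLE h);
'''

if __name__ == "__main__":
    print(unresolved_types(HEADER))
```

For this sample the leftovers are `HANDLE`, `LPCWSTR` and `WINAPI`. None of them can be translated without reading the #include'd files, and `WINAPI` is a macro rather than a type, which a scan like this cannot even tell apart.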

DaveInCaz commented 2 years ago

This could make a useful standalone tool, capable of outputting Declare or its equivalents in many languages.

wqweto commented 2 years ago

There is one big obstacle with header files and typelibs as source of metadata about FFI for both VBx and TB and that is support for Unions.

It’s unclear to me whether Unions are unsupported by oleaut, but much of the Win32 API uses them heavily, and the available win32 metadata project on GitHub includes them too.

I was fiddling with a conversion tool from the published win32 metadata to a big fat typelib, but although IDL supports Unions, it's VBx that fails to recognize these as supported types.

Adding support for Unions from oleaut in TB will enhance the FFI story, be it through source .h or .idl files or compiled typelibs.
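For anyone unfamiliar with what a union implies for FFI, here is a small illustration using Python's standard ctypes module (not tB, and not any real Win32 type; `Number` and `Tagged` are made-up names). The point is only that the members of a union share the same bytes, so any translated declaration has to be able to express overlapping fields.

```python
import ctypes


class Number(ctypes.Union):
    # Both members start at offset 0 and occupy the same four bytes.
    _fields_ = [
        ("as_long", ctypes.c_int32),
        ("as_single", ctypes.c_float),
    ]


class Tagged(ctypes.Structure):
    # A typical C pattern: a discriminator followed by a union payload.
    _fields_ = [
        ("kind", ctypes.c_int32),
        ("value", Number),
    ]


item = Tagged()
item.kind = 1
item.value.as_single = 1.0

if __name__ == "__main__":
    print(ctypes.sizeof(Number))     # size of the largest member, not the sum
    print(ctypes.sizeof(Tagged))
    print(hex(item.value.as_long))   # same bytes read back as an integer
```

Writing through `as_single` and reading through `as_long` returns the IEEE 754 bit pattern of 1.0, because there is only one storage location.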

bclothier commented 2 years ago

Regarding resolving the multiple levels of indirection, my concern is that importing a single header file could generate an excessive amount of code to resolve all the macros, nested typedefs, preprocessor definitions and so on. The main rationale here is to make it easy to update the declarations whenever a new version of the C/C++ project is shipped. It’s more likely that those things will come from other header files which may be shared among different APIs, particularly among the Win32 API.

That’s one of the reasons I thought that treating a single header file as if it were self-contained would be a better way of managing the versions of the header file. If it requires manual input to resolve the external dependencies, I would be fine with that.

If we enable twinbasic/lang-design#32, this can complicate version handling, especially if the typedefs are in other header files, whether we imported those directly or not. This implies tracking the definitions rather than the header files, which would increase the complexity of the tool.

Regarding unions, exposing them would not be Automation-compatible, even though COM can handle unions. A compiler warning should be provided if we end up exposing unions via our output library. If we are authoring our library for use in VBA, we will have to come up with a translation scheme so that VBA can use whatever functionality requires the unions.

wqweto commented 2 years ago

I'm not advocating producing unions on external typelibs but consuming these when defined in a typelib, just like VBx can call API functions in a module in a typelib or use string constants in a typelib but cannot produce these in a typelib of its own (in VB6).

mwolfe02 commented 2 years ago

"I actually do have a C/C++ compiler and linker that I wrote many years ago for another project,"

I love that Wayne just happens to have a custom C/C++ compiler and linker lying around...

bclothier commented 2 years ago

Just to post an alternative that I just found: apparently someone else had the same problem and wrote software to do just that. Unfortunately BASIC isn't among the supported languages, but given that it's older software with a long development history, it might be preferable to use SWIG and either provide a VBx/tB conversion mapping to use with it or, at the very least, extract the parse tree as XML, which can then be easily transformed into compatible declarations.

http://www.swig.org/exec.html

fafalone commented 2 years ago

As someone who works with a lot of APIs too this has been a tremendous pain point. I've written simple parsers to at least get typedefs and enums from headers into VB syntax (I even posted one here targeted at the similar ones in IDLs), but it seems every time I'm done accounting for things in the current ones, the next one has something new in the header that breaks the parser.

And don't even get me started about tracking down what DLLs things are in. Though I'm mostly going the other way: starting with a DLL, trying to track it back to its source.

FullValueRider commented 2 years ago

Would this help?

https://www.tangiblesoftwaresolutions.com/product_details/cplusplus_to_vb_converter_details.html

bclothier commented 2 years ago

Not sure... it looks like it's just a code converter, whereas we need to build a Declare and UDTs from the .h file. Also, it's actually a VB.NET converter. Not very helpful.

FullValueRider commented 2 years ago

This is what you get for the SafeArray struct, which might be a set of useful hints.

Public Class tagSAFEARRAY
  Public cDims As UShort
  Public fFeatures As UShort
  Public cbElements As UInteger
  Public cLocks As UInteger
  Public pvData As Object
  Public rgsabound() As SAFEARRAYBOUND = Arrays.InitializeWithDefaultInstances(Of SAFEARRAYBOUND)(1)
End Class

So possibly not totally useless. And there is a free version.