haxscramper / hcparse

High-level nim bindings for parsing C/C++ code
https://haxscramper.github.io/hcparse-doc/src/hcparse/libclang.html
Apache License 2.0

Make wrapper maintainable and updatable as newer versions come #9

Open haxscramper opened 2 years ago

haxscramper commented 2 years ago

If I'm aiming for full automation of the wrapper generation, I also need to figure out a way to account for library updates. My current idea for implementation (open for discussion):

And I can generate additional pragma annotations on the wrapped procs, like

proc needsRefresh(git_commit_graph_file: ptr cgraph, path: cstring) {. 
    interopSince: (1, 2, 0) < libgitVersion, dynlib: libgitSo, importc: "git_commit_graph_needs_refresh"  .}

where libgitVersion is defined as

const
  libgitMajor {.intdefine.}: int = 1
  libgitMinor {.intdefine.}: int = 2
  libgitPatch {.intdefine.}: int = 11
  libgitVersion = (libgitMajor, libgitMinor, libgitPatch)
  libgitSo = getLibgitSoAtCompiletime()
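
A minimal sketch of how such a version tuple could gate a declaration at compile time. It relies only on Nim's lexicographic comparison of tuples. The `hasCommitGraph` constant and the pure-Nim stand-in body are assumptions for illustration, and `libgitSo` is left out so the snippet stays self-contained.

```nim
# Defaults can be overridden at build time, for example:
#   nim c -d:libgitMajor=1 -d:libgitMinor=1 -d:libgitPatch=0 app.nim
const
  libgitMajor {.intdefine.}: int = 1
  libgitMinor {.intdefine.}: int = 2
  libgitPatch {.intdefine.}: int = 11
  libgitVersion = (libgitMajor, libgitMinor, libgitPatch)

# Tuples compare lexicographically, so this reads as "1.2.0 or newer".
# `hasCommitGraph` is a hypothetical helper name.
const hasCommitGraph = (1, 2, 0) <= libgitVersion

when hasCommitGraph:
  # Stand-in body; a generated wrapper would use importc/dynlib instead.
  proc needsRefresh(path: string): bool = false
```

With the default defines the `when` branch is taken; building with an older version passed through `-d:` simply leaves `needsRefresh` undeclared.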

I already have the whole library API in the form of an IR that can be saved as JSON (and is actually intended to be saved). I can annotate everything with interopSince: (0, 0, 0) by default. When the user regenerates the wrappers, I read back the previously generated sources. If something has changed, I add a new interopSince version.

Alternatively, if you already have a generated .nim file and run the generation again, it will update the existing bindings instead of generating everything anew. This might require slightly more verbose annotations, like

proc needsRefresh(git_commit_graph_file: ptr cgraph, path: cstring) {. 
    interopSince: (1, 2, 0) < libgitVersion, dynlib: libgitSo, importc: "git_commit_graph_needs_refresh",
    cgenOf: "git_commit_graph_needs_refresh(ptr[cgraph], ptr[cstring]): bool"
  .}

This way I could update wrappers even if no JSON IR was saved, or if the user wants to add custom code directly to the wrapper.

haxscramper commented 2 years ago

All declared procedures go through a helper macro that conditionally adds the dynlib pragma. It can also conditionally enable certain elements of the API or hide them as needed. (Maybe map them to the .error. pragma, which would show the required version.)
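
A rough sketch of the hiding part, assuming a template used with pragma syntax rather than a full macro. The name `interopSince` follows the annotation proposed above, but this particular definition is only an illustration: it drops the declaration entirely when the configured version is too old, and does not add dynlib or map to an error. `newApi` and `futureApi` are hypothetical procs.

```nim
# Fixed here for the example; a wrapper would build this from intdefines.
const libgitVersion = (1, 2, 11)

# When used as a pragma, the annotated proc definition arrives as `body`.
template interopSince(version: (int, int, int), body: untyped) {.dirty.} =
  when version <= libgitVersion:
    body

# Kept: 1.2.0 <= 1.2.11
proc newApi(): string {.interopSince: (1, 2, 0).} = "available"

# Dropped: 9.0.0 is newer than the configured library version.
proc futureApi(): string {.interopSince: (9, 0, 0).} = "unavailable"
```

Calling `futureApi` would then fail with an undeclared identifier error; producing a message that names the required version would need a macro that rewrites the proc instead of discarding it.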

haxscramper commented 2 years ago

It is the user's responsibility to provide correct version information, either from the build system or from the CLI (in both cases via passed defines). Wrappers have default values for the defines, so a wrapper should work even without versioning data.

haxscramper commented 2 years ago

It is possible to hide API elements from the exported module (which corresponds to a header file), but sometimes modules themselves are renamed or deleted. Module paths from each processed version would then accumulate over time, even though they are no longer available after certain library versions. For example, if file old.h defined a struct that was later moved (just moved, not edited) to new.h, I would have two copies of the same struct: one in old.nim and one in new.nim. The same applies to all functions and procedures moved between files.

But, this would allow wrapper users to support any version they want using

when <one version range>:
  import old # annotated with old "header"
  export old
else:
  import new # annotated with new "header"
  export new

It might be possible to provide predefined helper modules like this: select a specific version and track movements of API elements from it, generating the required version branching as needed. I don't think it is a particularly useful feature, though, and it is certainly not a high priority.

haxscramper commented 2 years ago

Determining what exactly has changed might be tricky, especially because of https://github.com/haxscramper/hcparse/issues/21. It might eventually come down to diffing heterogeneous trees and trying to produce a minimal diff.

This is not a problem for functions: adding new arguments makes it a different function, so I will just set the upper boundary of the allowed version range on the old proc and then create a new proc with the different arguments.
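
One way to picture the function case is to key every proc by its full signature string and record the version range in which that signature exists. The sketch below is an assumption about how this could look, not hcparse's implementation. `Snapshot`, `signatureRanges`, `stillPresent` and the signature strings are all hypothetical.

```nim
import std/tables

type
  Version = (int, int, int)
  # `until` is exclusive; `stillPresent` means "no upper bound yet".
  VersionRange = tuple[since, until: Version]
  # One processed library version and the signatures found in it.
  Snapshot = (Version, seq[string])

const stillPresent: Version = (high(int), 0, 0)

proc signatureRanges(history: openArray[Snapshot]): Table[string, VersionRange] =
  ## `history` must be sorted by version, oldest first.
  ## A signature that disappears and later returns keeps its closed range.
  for (version, signatures) in history:
    # Close the range of every open signature missing from this version.
    for sig, rng in result.mpairs:
      if rng.until == stillPresent and sig notin signatures:
        rng.until = version
    # Open a range for every signature seen for the first time.
    for sig in signatures:
      if sig notin result:
        result[sig] = (since: version, until: stillPresent)

let ranges = signatureRanges([
  ((1, 0, 0), @["open(cstring)"]),
  ((1, 2, 0), @["open(cstring)", "needs_refresh(ptr[cgraph], cstring)"]),
  ((2, 0, 0), @["open(cstring, cint)", "needs_refresh(ptr[cgraph], cstring)"]),
])
```

Here `open(cstring)` ends up with the range 1.0.0 up to (but excluding) 2.0.0, while `open(cstring, cint)` starts a fresh range at 2.0.0, which matches the "different arguments means a different proc" rule.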

This could work for types as well: a different field means a new type. It would create even larger wrappers, since I would copy almost the same definition over and over again, but it is the simplest solution.