Questions about our Runtime and Compiler interfaces

AlexandreEichenberger commented 2 years ago

I have a few general questions about our Runtime and Compiler interface. As I am working on a benchmarking benchmark, it is making me revisit our interfaces as I am trying to see if we can have our benchmarking not tied so tightly to ONNX-MLIR/MLIR/LLVM. This is what the Runtime and the Compiler interface are about.

C vs C++

Runtime Interface
- build in C and C++ library
- a common header file for the basic functionality (OnnxMlirRuntime.h), but there is an additional C++ only header file (OMTensorHelper.h)
- a common implementation file with the additional C++ OMTensor features added in the common implementation file. The extensions are not visible in the official interface.
Compiler interface
- probably meant to be built in C and C++ (functions are strictly C), but currently only built as C++ library
- the common header file would not currently compile in C (has a namespace under the C version)
- it would be nice to have an interface in C++
- no support for execution session

Are there folks that are interested in helping to clean up this, and have a discussion?

gongsu832 commented 2 years ago

Regarding the Runtime Interface, OnnxMlirRuntime.h sits under include, which is the public C/C++ runtime API header to be included by the users. OMTensorHelper.h on the other hand, sits under src/Runtime, which is part of the runtime implementation not visible to the users. So the two .h files don't really have direct relationship to each other. This is how it's designed.

I agree that the Compiler interface is only a first cut and not yet complete. I'm hoping that the original author @MikeHolman can help address the issues and also fix the broken test I mentioned in https://github.com/onnx/onnx-mlir/pull/958

AlexandreEichenberger commented 2 years ago

On my x86 docker, the make check-onnx-backend-compilerlib now works for me, I did some fixes to it when changing options, but I don't recall fixing anything with the libcruntime.

I agree that the two are separated, and only the C interface is exposed. The OMTensorHelper.h being only C++ file could be renamed using the hpp extension, though.

On the Runtime side, I understand the need of having the C interface, and it is constantly implemented, with all of the implementation of the OnnxMlirRuntime.h fully implemented in C (some cosmetic in C++ added).

This is not the case for OnnxMlirCompile.h. First the header does not compile in c

root@1250b049ac2b:~/onnx-mlir/build# gcc -c ../include/OnnxMlirRuntime.h  -I ../include
root@1250b049ac2b:~/onnx-mlir/build# gcc -c ../include/OnnxMlirCompiler.h  -I ../include
In file included from ../include/OnnxMlirCompiler.h:15:
../include/onnx-mlir/Compiler/OMCompilerTypes.h:8:1: error: unknown type name 'namespace'
    8 | namespace onnx_mlir {

This can be very simply fixed. But second, there is no C implementation, as it uses essentially the entire ONNX-MLIR. So while we could have a pure C header file, we cannot link the library implementing this interface using exclusively C runtime.

So this lead me to the next question: do we need the OnnxCompiler.h to be compilable in C? We certainly cannot have a C-only library of that interface.

Second, I think the interface is missing the "ExecuteSession" part of it. If the entire thing can be C++, then its easy, we have it. If we need a C version too, then we probably have to design a few more calls that mimic ExecutionSession.hpp in C.

@MikeHolman (or other MS folks that may use this interface, @sstamenova @NathanielMcVicar ) can you tell us on the usage you may have of this interface?

I would like to use it too for the benchmarking of kernels

MikeHolman commented 2 years ago

It doesn't technically need to be compatible with C, I just wanted a stable interface that didn't bring in dependencies like requiring specific libstdc++ version, etc. and in my experience C interface is the most robust way to go. We use this library to compile models, and we use a separate engine for executing them.

AlexandreEichenberger commented 2 years ago

@MikeHolman Thanks for the answer. What puzzle me is that the dependences basically include the entire LLVM/MLIR/ONNX-MLIR.

add_onnx_mlir_library(OnnxMlirCompiler
  OnnxMlirCompiler.cpp

  EXCLUDE_FROM_OM_LIBS

  LINK_LIBS PRIVATE
  CompilerUtils
  )

and CompilerUtils

add_onnx_mlir_library(CompilerUtils
  Compiler.cpp
  CompilerUtils.cpp

  EXCLUDE_FROM_OM_LIBS

  DEPENDS
  ExternalUtil

  INCLUDE_DIRS PRIVATE
  ${FILE_GENERATE_DIR}

  INCLUDE_DIRS PUBLIC
  ${ONNX_MLIR_SRC_ROOT}/include

  LINK_LIBS PUBLIC
  ${OMLibs}
  MLIRAffineTransforms
  MLIRLinalgTransforms
  MLIRLLVMToLLVMIRTranslation

  # Link LLVM libraries necessary to query which target architectures are configured.
  LINK_COMPONENTS PRIVATE
  AllTargetsAsmParsers
  AllTargetsCodeGens
  AllTargetsDescs
  AllTargetsInfos
  MC
  )

So doesn't that include already all of the C++ standard libs? Sorry if I miss something, or maybe it is because the import of CompilerUtils is in private, so you don't have the C++ in the public space? Just trying to understand. For the Runtime, there I understand well, as the entire OnnxMlirRuntime can be entirely compiled in C (meaning no C++ at all in the entire implementation).

Thanks for helping me understand

MikeHolman commented 2 years ago

Yes, LLVM/MLIR are a dependency of OnnxMlirCompiler, but the embedder of OnnxMlirCompiler doesn't need to know anything about them. As far as C++ compat goes, I'm not an expert (especially with Linux) but most of the incompatibilities I've experienced happen when passing stl containers across dll boundaries, and that was primarily what I was avoiding.

This is somewhat academic since C compat is not actually a need of ours, but I don't think OnnxMlirCompiler needs to be entirely C either. If the exports are all C compatible (which they should be since we are private linking with CompilerUtils), isn't that all that matters?

gongsu832 commented 2 years ago

To be consistent with OnnxMlirRuntime and since we cannot force everyone to use C++, OnnxMlirCompiler should also have a C interface.

AlexandreEichenberger commented 2 years ago

Thanks for your insight

AlexandreEichenberger commented 2 years ago

PR #1167 provides an interface to use onnx-mlir options.

AlexandreEichenberger commented 2 years ago

@MikeHolman please provide feedback if you can, could not add you as a reviewer, since you wrote the interface to begin with, you probably are interested by this changes.

AlexandreEichenberger commented 2 years ago

Next on my list for interfaces: right now, we have no "execution session" interface. I know that we can simply call

  OMTensorList *run_main_graph(OMTensorList *);

   OMTensorList *outputList = run_main_graph(input);

when the lib is statically linked. For dynamically linked, we provide (internally only) the ExecutionSession class. This one is not exported in our interfaces, and is strictly C++

It would not be too hard to make a C interface, which would depend on the dlopen/dlclose/dlsym calls defined in dlfin.h.

Any feedback? @gongsu832 @etiotto @MikeHolman @sstamenova @chentong319

gongsu832 commented 2 years ago

Not sure what you are asking. The model runtime is always generated as a shared lib. You can call run_main_graph directly if you link against model.so (it really should be called libmodel.so to be more conformant) at compile time. If you don't, you can use the dlopen/dlclose/dlsym calls to load the lib and resolve the run_main_graph symbol yourself at runtime and call it.

Are you saying that we should have a function like,

OMTensorList *run_main_graph2(OMTensorList *input) {
  handle = dlopen("libmodel.so");
  func = dlsym(handle, "run_main_graph");
  output = func(input);
  dlclose(handle);
  return output;
}

We can certainly have something like this. But if you are writing C/C++ code, it would be strange to not just link against libmodel.so.

AlexandreEichenberger commented 2 years ago

Yes, something of the sort... but to be more efficient for repeated calls, it is better to load the symbol once, then be "allowed" to run it many times... and the close it when done.

gongsu832 commented 2 years ago

It is loaded once when you link against libmodel.so and just repeatedly call run_main_graph. Pretty much the only difference between linking against libmodel.so and calling dlopen/dlclose/dlsym is whether you let the loader do the job of resolving run_main_graph for you or you do it yourself. So I'm still not sure what exactly you have in mind. Perhaps some example use case may help.

onnx / onnx-mlir

Questions about our Runtime and Compiler interfaces #1155