Open teju85 opened 2 years ago
I think it would make sense to go the same route as with cugraph-ops: if we expose things using cython, and only use the __cuda_array_interface__
for integrating with DL frameworks, then we're able to circumvent this issue entirely, because we won't interact with pytorch at a C++ level at all and so the ABI level won't matter.
The issue only arises if you link against torchlib or some other pytorch backend directly, which you have to do of course if you want to use c10 or other C++ backends from pytorch directly.
I think this issue should have been addressed in PR #24 .
So there is no C10 API used during build. And it can build and run without C10 API.
Moreover, as C10 API may have slightly better performance than python callback bindings. The codes are kept in python/pylibwholegraph/pylibwholegraph/torch_cpp_ext
directory. They are not built at build and packaging time, just packaging the source code. If it is really helpful on performance, user can complie them at runtime on their machine manually as an optimizing option, by calling compile_cpp_extension. compile_cpp_extension
is the only usage of C10 API, it is up to user and run only on user's machine with known PyTorch version. If users don't care about that, all the files under torch_cpp_ext are not used.
@BradReesWork I think we can we can close this issue now.
@BradReesWork JFYI... Currently, WM directly uses C10 API's from pytorch. Eg:
wholegraph/torch/whole_nccl_pytorch_tensor.h
. This means, similar to the pytorch backend of cugraph-ops, we'll also have to think about the topic ofCXX_ABI
during its build! This can especially become complicated when WM moves inside cuGraph.