cuBLASXt API

fbordignon commented 3 years ago

@kunzmi et al., great work with the lib. I got it crunching C# matrices in no time. Now I've got a huge matrix to deal with and came across the cublasXt API. I believe it can decompose huge matrices under the hood for me, so I would like to contribute to the project by adding access to this API in managedCuda. Can you help me with where to start? I believe it uses a different handle, cublasXtHandle_t, and other data types. Would that be a problem (meaning too much time to implement)? I've glanced at the source code and it seems rather repetitive; is there any kind of automatic tool to extract the DLL imports from cublas? Here is a link to the relevant cublas section of the docs: https://docs.nvidia.com/cuda/cublas/index.html#using-the-cublasXt-api

Thanks!

kunzmi commented 3 years ago

Hi!

Yes, the _Xt libraries are separate libraries addressing the multi-GPU part (same for CUFFT etc.) and would need a dedicated wrapper in managedCuda. I have left them out so far, mainly for lack of time, but you’re more than welcome to contribute!

Unfortunately, writing the C# code is more or less entirely hand-crafted work. I usually just copy&paste the repetitive parts with regex-supported find&replace. The main issue is that from the API or header file it is often impossible to deduce whether int* should become int[], ref int or CUdeviceptr in C#. If this information is given at all, one can usually only find it in the PDF/HTML documentation, and I never went as far as parsing that automatically. Sometimes one also has to simply try it out (I even reported some bugs on that to Nvidia in the early days).
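To illustrate the regex-assisted approach and the pointer ambiguity described above, here is a minimal sketch in Python. It is not part of managedCuda and not a real tool; it only shows the idea. The three prototypes follow the cuBLASXt documentation with the calling-convention macros left out. The C# type choices (IntPtr for the handle, int for the status, UIntPtr for size_t) and the CUBLAS_DLL constant are assumptions made for this example. Bare pointers are deliberately not resolved: the script marks them with a TODO so that a human can look up the right marshalling in the documentation.

```python
import re

# C type -> C# type for by-value parameters and return values.
# These C# names are assumptions for this sketch, not managedCuda's own types.
VALUE_TYPES = {
    "int": "int",
    "float": "float",
    "double": "double",
    "size_t": "UIntPtr",
    "cublasStatus_t": "int",
    "cublasXtHandle_t": "IntPtr",
}

# Matches `ret name(params);` as long as the parameter list has no nested ")".
PROTOTYPE = re.compile(r"(\w+)\s+(\w+)\s*\(([^)]*)\)\s*;")
# Matches `[const] type [*] name [[]]`.
PARAM = re.compile(r"(?:const\s+)?(\w+)\s*(\*?)\s*(\w+)\s*(\[\])?")


def convert_param(text):
    """Return the C# form of one C parameter, flagging bare pointers."""
    match = PARAM.fullmatch(text.strip())
    if match is None:
        return f"/* TODO: could not parse '{text.strip()}' */"
    c_type, star, name, brackets = match.groups()
    cs_type = VALUE_TYPES.get(c_type, c_type)
    if brackets:
        # Declared with [] in C, so a managed array is a reasonable guess.
        return f"{cs_type}[] {name}"
    if star:
        # The header alone cannot tell which marshalling is correct.
        return (f"/* TODO {c_type}*: {cs_type}[], ref {cs_type} "
                f"or CUdeviceptr? */ IntPtr {name}")
    return f"{cs_type} {name}"


def convert_header(header):
    """Turn every C prototype found in `header` into a C# DllImport stub."""
    stubs = []
    for ret, name, params in PROTOTYPE.findall(header):
        params = params.strip()
        parts = [] if params in ("", "void") else params.split(",")
        cs_params = ", ".join(convert_param(p) for p in parts)
        cs_ret = VALUE_TYPES.get(ret, ret)
        # CUBLAS_DLL is a hypothetical constant holding the native library name.
        stubs.append(f"[DllImport(CUBLAS_DLL)]\n"
                     f"public static extern {cs_ret} {name}({cs_params});")
    return "\n\n".join(stubs)


HEADER = """
cublasStatus_t cublasXtSetBlockDim(cublasXtHandle_t handle, int blockDim);
cublasStatus_t cublasXtGetBlockDim(cublasXtHandle_t handle, int *blockDim);
cublasStatus_t cublasXtDeviceSelect(cublasXtHandle_t handle, int nbDevices, int deviceId[]);
"""

if __name__ == "__main__":
    print(convert_header(HEADER))
```

For cublasXtGetBlockDim the generated stub keeps the TODO marker on blockDim, which is exactly the case where the documentation has to be consulted, while deviceId in cublasXtDeviceSelect comes out as int[] because the prototype already declares it as an array.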

I hope this won’t discourage you from adding support for cublasXt ;)

fbordignon commented 3 years ago

Hey @kunzmi, thanks for the pointers. I will start with the functions I need for now and work my way through the remainder of the API. No promises on a release date or that it will be complete, but I will try my best. One thing that may help is that the Xt API has more or less the same functions as the regular one, so that should speed things up.

kunzmi / managedCuda

cuBLASXt API #99