NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

[QST] Question about the CUTLASS 3.0 API #1588

Open sleepwalker2017 opened 2 months ago

sleepwalker2017 commented 2 months ago

I see that CUTLASS provides two sets of APIs for running a GEMM. The two are documented at https://github.com/NVIDIA/cutlass/blob/main/media/docs/quickstart.md#launching-a-gemm-kernel-in-cuda and https://github.com/NVIDIA/cutlass/blob/main/media/docs/quickstart.md#launching-a-gemm-kernel-using-cutlass-30-or-newer

I have some questions:

  1. The new API requires users to specify the launch configuration (threads, blocks). Is this mandatory? How can we ensure that the user specifies the optimal configuration?
  2. The original interface did not require specifying the launch configuration. Did it choose an optimal configuration on its own?
  3. The new API seems much more complicated than the old one. What are the advantages of the new interface?

Thank you!
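
As background for questions 1 and 2, one general way a GEMM library can choose a launch grid without user input is to derive it from the problem size and a compile-time tile shape, assigning one thread block per output tile. The sketch below illustrates that idea in plain host-side C++. `TileShape`, `Grid`, `ceil_div`, and `grid_for_gemm` are hypothetical names introduced for illustration only; they are not CUTLASS APIs, and this is not a claim about how either CUTLASS interface actually computes its configuration.

```cpp
#include <cassert>

// Hypothetical helper types for illustration; not part of CUTLASS.
struct TileShape { int m; int n; };  // output tile computed by one thread block
struct Grid { int x; int y; };       // number of thread blocks along M and N

// Round-up integer division: how many tiles are needed to cover `extent`.
constexpr int ceil_div(int extent, int tile) {
    return (extent + tile - 1) / tile;
}

// Assumed mapping: one thread block per tile of the M x N output matrix.
inline Grid grid_for_gemm(int M, int N, TileShape tile) {
    assert(tile.m > 0 && tile.n > 0);
    return Grid{ceil_div(M, tile.m), ceil_div(N, tile.n)};
}
```

Under this assumed mapping, calling `grid_for_gemm(1000, 512, TileShape{128, 128})` gives a grid of 8 by 4 blocks. The last block along M is only partially filled, so a real kernel would also need bounds checks for edge tiles.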
