NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

[QST] How to get Cutlass to run on Windows 11 in Visual Studio? #1594

Closed mrakgr closed 4 weeks ago

mrakgr commented 1 month ago

Link: https://github.com/mrakgr/Cutlass-Tests/blob/632caad823bbb86eddb012488926030fd929f54e/Cutlass%20Tests/kernel.cu

I am trying to go through the intro examples and am stuck on the second one. I am sure that the two include paths are correct, but when I try to build the project there are a ton of errors.

Severity    Code    Description Project File    Line    Suppression State   Details
Warning C26495  Variable '__half2::y' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  316     
Error (active)  E0304   no instance of overloaded function "cutlass::gemm::device::Gemm<ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, LayoutC_, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ThreadblockSwizzle_, Stages, AlignmentA, AlignmentB, SplitKSerial, Operator_, GatherA, GatherB, ScatterD, PermuteDLayout>::operator() [with ElementA_=cutlass::half_t, LayoutA_=cutlass::layout::ColumnMajor, ElementB_=cutlass::half_t, LayoutB_=cutlass::layout::ColumnMajor, ElementC_=cutlass::half_t, LayoutC_=cutlass::layout::ColumnMajor, ElementAccumulator_=float, OperatorClass_=cutlass::arch::OpClassTensorOp, ArchTag_=cutlass::arch::Sm89, ThreadblockShape_=<error-type>, WarpShape_=<error-type>, InstructionShape_=<error-type>, EpilogueOutputOp_=<error-type>, ThreadblockSwizzle_=cutlass::gemm::threadblock::GemmIdentityThreadblockSwizzle<1>, Stages=<error-constant>, AlignmentA=<error-constant>, AlignmentB=<error-constant>, SplitKSerial=false, Operator_=<error-type>, GatherA=false, GatherB=false, ScatterD=false, PermuteDLayout=cutlass::layout::NoPermute]" matches the argument list Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   55      
Warning C26439  This kind of function should not throw. Declare it 'noexcept' (f.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  315     
Warning C26495  Variable '__nv_bfloat162::x' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  315     
Warning C26495  Variable '__nv_bfloat162::y' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  315     
Warning C26478  Don't use std::move on constant variables. (es.56). Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  315     
Warning C26439  This kind of function should not throw. Declare it 'noexcept' (f.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  316     
Warning C26478  Don't use std::move on constant variables. (es.56). Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  316     
Warning C26495  Variable '__nv_bfloat162::x' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  320     
Warning C26495  Variable '__nv_bfloat162::y' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  320     
Warning C26495  Variable '__nv_bfloat162::x' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  322     
Warning C26495  Variable '__nv_bfloat162::y' is uninitialized. Always initialize a member variable (type.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp  322     
Warning C26439  This kind of function should not throw. Declare it 'noexcept' (f.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  309     
Warning C26495  Variable '__half2::x' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  309     
Warning C26495  Variable '__half2::y' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  309     
Warning C26478  Don't use std::move on constant variables. (es.56). Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  309     
Warning C26439  This kind of function should not throw. Declare it 'noexcept' (f.6).    Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  310     
Warning C26478  Don't use std::move on constant variables. (es.56). Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  310     
Warning C26495  Variable '__half2::x' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  314     
Warning C26495  Variable '__half2::y' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  314     
Warning C26495  Variable '__half2::x' is uninitialized. Always initialize a member variable (type.6).   Cutlass Tests   C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp  316     
Error       namespace "std" has no member "invoke_result_t" Cutlass Tests   G:\cutlass-3.5.0\include\cute\util\type_traits.hpp  126     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 41      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 80      
Error       constant "v" is not used in or cannot be deduced from the template argument list of class template "cute::is_integral<cute::C<<error>>>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 80      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 95      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 97      
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, const T>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 97      
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 98      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 99      
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, const T &>"  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 99      
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 100     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 101     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, T &>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 101     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 102     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 103     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, T &&>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 103     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 104     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::C<<error>>>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       constant "v" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::C<<error>>>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 106     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 107     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::integral_constant<T, v>>"  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 107     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 108     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 188     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 189     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 190     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 191     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 192     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 194     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 194     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 195     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 195     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 196     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 196     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 197     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 197     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 198     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 198     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 199     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 199     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 200     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 200     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 201     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 201     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 202     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 202     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 203     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 203     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 205     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 205     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 206     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 206     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 208     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 208     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 209     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 209     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 210     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 210     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 211     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 211     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 212     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 212     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 213     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 213     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 223     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 231     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 239     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 247     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 255     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 263     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 271     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 279     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 287     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 295     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 303     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 340     
Error       no instance of function template "cute::abs" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 340     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 341     
Error       no instance of function template "cute::signum" matches the argument list   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 341     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 342     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       no instance of function template "cute::max" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       no instance of function template "cute::min" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error   MSB3721 The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" -gencode=arch=compute_89,code=\"sm_89,compute_89\" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include"  -G   --keep-dir x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static  -g  -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/Fdx64\Debug\vc143.pdb" -o "C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\x64\Debug\kernel.cu.obj" "C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu"" exited with code 4.  Cutlass Tests   C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.3.targets 799     
mrakgr commented 1 month ago

As an aside, I couldn't even get a Cuda hello world project to run with CMake, so there is no way I could run the tests for Cutlass in the quickstarts. Is the library supposed to be Linux only?

mrakgr commented 1 month ago

Forgot to mention, but I did set the C++ standard to 20.

mrakgr commented 1 month ago

Link: https://github.com/NVIDIA/cutlass/blob/main/media/docs/build/building_in_windows_with_visual_studio.md

Found this via a web search. I'll try following the instructions in it.

mrakgr commented 1 month ago

No, sorry, there is nothing in there relevant to my issue. I can't even run CMake on a Cuda project to begin with.

mrakgr commented 1 month ago

By the way, we're making a ~video on using Cutlass on YouTube~. You can see some of our work here. A few months ago, I anticipated these kinds of issues and didn't want to deal with the pain of getting a C++ project to run, so I started this Cuda matrix multiplication playlist instead. It was a good exercise that allowed me to implement the tensor datatype in Spiral and got me familiar with GPU programming. Even so, I admit it would have made more sense to use Cutlass to begin with, since it took 2 months to make that matrix multiply.

So now that we've actually started work on an ML library and need a wider array of functionality than the single Ampere matrix multiply that we've implemented provides, we are looking into Cutlass again. If we can get it to run, we'll integrate it into the library and have Cutlass be responsible for the matrix multiplication heavy lifting.

We'll be covering that in the video, which of course means we'll also be covering how to use Cutlass in a Cuda kernel directly. Hopefully, others will find that interesting.

thakkarV commented 1 month ago

> Link: https://github.com/mrakgr/Cutlass-Tests/blob/632caad823bbb86eddb012488926030fd929f54e/Cutlass%20Tests/kernel.cu
>
> I am trying to go through the intro examples and am stuck on the second one. I am sure that the two include paths are correct, but when I try to build the project there are a ton of errors.

The first link is a 404 for me. The second link points to a README instead of an example. The code snippets in our README may have drifted out of sync with the repo. I would highly recommend building the actual examples under the cutlass/examples/ directory instead, since they are guarded by CI tests.

The README you found is the documentation we have for Windows builds. It is best to use the latest MSVC you can find. I am curious whether you are able to build any of the examples in the examples directory with our CMake build.

mrakgr commented 1 month ago

Ah, it seems the repo has been made private by accident. Sorry, I didn't expect the IDE to do that. Now it is public.

I'll try building the examples in the directory you pointed me to and get back to you.

mrakgr commented 1 month ago

mkdir build && cd build
cmake .. -DCUTLASS_NVCC_ARCHS=89

Surprisingly, running these two commands does in fact work and generates all the build scripts in the build directory. I expected I'd run into the same issue as when running the Cuda hello world project.

But all the examples in the quickstart README use the Linux make utility, while I am on Windows, so I am not sure how to run them. This is my first time trying to use CMake. I'll try building the individual examples next and see where that gets me.

thakkarV commented 1 month ago

> Linux make utility, while I am on Windows, so I am not sure how to run them.

Make runs downstream of CMake. The quickstart is written with a Linux system as an example, but the source of truth for generating any build system configuration is CMake. We do not support users customizing flags to their build system or compiler outside of those generated by our build system.

mrakgr commented 1 month ago

cmake --build . --parallel

It seems that running this command does build them, which is something. Right now, it's in the process of doing it.

But whether CMake works is beside the point for me; what is really important is that I understand how to use the library. When I copied the code from the one example in 00_basic_gemm into my own project, the first issue was that the #include "helper.h" was not found. That file is in examples/common, so I added it to the include path. But now it is complaining that threadIdx.x is not found, and there is a bevy of errors similar to the ones in the first post.

Severity    Code    Description Project File    Line    Suppression State   Details
Error (active)  E0020   identifier "threadIdx" is undefined Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   161     
Error (active)  E0020   identifier "blockIdx" is undefined  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   161     
Error (active)  E0020   identifier "blockDim" is undefined  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   161     
Error (active)  E0029   expected an expression  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   185     
Error (active)  E0020   identifier "threadIdx" is undefined Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   244     
Error (active)  E0020   identifier "blockIdx" is undefined  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   244     
Error (active)  E0020   identifier "blockDim" is undefined  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   244     
Error (active)  E0029   expected an expression  Cutlass Tests   C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu   278     
Error       namespace "std" has no member "invoke_result_t" Cutlass Tests   G:\cutlass-3.5.0\include\cute\util\type_traits.hpp  126     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 41      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 80      
Error       constant "v" is not used in or cannot be deduced from the template argument list of class template "cute::is_integral<cute::C<<error>>>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 80      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 95      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 97      
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, const T>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 97      
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 98      
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 99      
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, const T &>"  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 99      
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 100     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 101     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, T &>"    Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 101     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 102     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 103     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, T &&>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 103     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 104     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::C<<error>>>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       constant "v" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::C<<error>>>"   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 105     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 106     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 107     
Error       constant "n" is not used in or cannot be deduced from the template argument list of class template "cute::is_constant<<error>, cute::integral_constant<T, v>>"  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 107     
Error       the template argument list of the partial specialization includes a nontype argument whose type depends on a template parameter Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 108     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 188     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 189     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 190     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 191     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 192     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 194     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 194     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 195     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 195     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 196     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 196     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 197     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 197     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 198     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 198     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 199     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 199     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 200     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 200     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 201     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 201     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 202     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 202     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 203     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 203     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 205     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 205     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 206     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 206     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 208     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 208     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 209     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 209     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 210     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 210     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 211     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 211     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 212     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 212     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 213     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 213     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 223     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 231     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 239     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 247     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 255     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 263     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 271     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 279     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 287     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 295     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 303     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 340     
Error       no instance of function template "cute::abs" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 340     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 341     
Error       no instance of function template "cute::signum" matches the argument list   Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 341     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 342     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       no instance of function template "cute::max" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 344     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       no instance of function template "cute::min" matches the argument list  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 345     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 346     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error       "auto" is not allowed here  Cutlass Tests   G:\cutlass-3.5.0\include\cute\numeric\integral_constant.hpp 347     
Error   MSB3721 The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" -gencode=arch=compute_89,code=\"sm_89,compute_89\" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include"  -G   --keep-dir x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static  -g  -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/Fdx64\Debug\vc143.pdb" -o "C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\x64\Debug\kernel.cu.obj" "C:\Users\mrakg\source\repos\Cutlass Tests\Cutlass Tests\kernel.cu"" exited with code 4.  Cutlass Tests   C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.3.targets 799     
mrakgr commented 1 month ago

I am curious, what kind of setup are you using when you are developing these examples yourself? Even if not Visual Studio, how could I edit, play with and run these examples in something like VS Code?

thakkarV commented 1 month ago

Ideally, you import CUTLASS as a dependency via CMake; that gives you seamless integration without having to edit flags manually. Otherwise, copy-paste the entire compiler command line and append it to your project's build.

mrakgr commented 1 month ago

I am not sure where I'd get the latter, but as for the first...

Since it is a header library, am I not supposed to simply add the include directories and then the necessary headers? The quickstart guide examples make it seem that is how it is supposed to be used. Right now, I cannot get even a Cuda hello world project to run with CMake on my own, so figuring out how to add Cutlass as a dependency that way seems very hard.

mrakgr commented 4 weeks ago

Instead of us going back and forth over this, and me asking dumb questions, would it be possible for you to provide some examples that somebody without intricate knowledge of C++'s build system would be able to run, please? Something like Nvidia does here. Just to even get started with Cutlass, I had to study CMake for a few hours and in the end, I couldn't even run a hello world Cuda program with it.

I mean it constructively when I say that you are hugely overestimating how easy it is for somebody to come in and use the Cutlass library right now.

thakkarV commented 4 weeks ago

here is an example: https://github.com/thakkarV/cuASR/blob/master/CMakeLists.txt

It is somewhat out of date at this point, but enough to get you started. The one thing I would change in this example is that, rather than setting the flags manually, I would add the CUTLASS directory via CMake first and then simply append any custom flags to the ones that the CUTLASS CMake file generates.

mrakgr commented 4 weeks ago

I'll give that a try, but my request would be to get a Cutlass example to work as a plain VS project, similar to the ones on the NVidia sample repo. Maybe if we were doing a C++ project that utilizes Cutlass, depending on CMake would be fine, but in the Youtube channel we're doing the ML library project in Spiral which compiles to Python on the host, and Cuda C++ on the device. Including the Cutlass directory as a part of the NVCC compiler options is the limit of how complex the setup can be.

Before that, I need to get it to run in VS, so I can anchor the project to a working state and be capable of debugging it in the IDE.

I presume this should be possible; I just don't understand why I am getting those weird type errors in integral_constant.hpp when I copied the second quickstart example into my project. One assumption is that the C++ standard is out of date, but I did set it to C++ 20 in the project settings. And when I copied the 00_basic_gemm project instead, I still got those type errors, and threadIdx.x was missing as well. That one should come from #include "cuda_runtime.h". I don't understand why that is not getting included anywhere. The test itself builds with CMake, but cuda_runtime.h is not being included in any file referenced by the main project file.

I am sorry for pestering you with this, but I am blocked and not sure how to proceed currently.
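
For readers hitting the same wall: `"auto" is not allowed here` is typically what a compiler reports when it meets a C++17 non-type template parameter declared with `auto` while running in an older language mode. The failing nvcc command at the end of the error list carries no `-std=` flag, so it may be that the C++ 20 project setting never reached nvcc. Below is a minimal sketch of the language feature involved. `Constant` is a hypothetical stand-in that only loosely resembles what integral_constant.hpp does; it is not CUTLASS code.

```cpp
// Requires C++17: `template <auto v>` is rejected in pre-C++17 language modes.
// `Constant` is a hypothetical stand-in for illustration, not the CuTe type.
template <auto v>
struct Constant {
  static constexpr auto value = v;  // the type of `value` is deduced from `v`
};

// Adding two compile-time constants yields another compile-time constant.
template <auto a, auto b>
constexpr Constant<a + b> operator+(Constant<a>, Constant<b>) {
  return {};
}

// Compile-time check that the deduction behaves as described.
static_assert(decltype(Constant<2>{} + Constant<3>{})::value == 5,
              "2 + 3 should be folded at compile time");
```

If this snippet fails with a similar "auto" diagnostic in your own setup, the language standard reaching the compiler is the likely culprit, not the include paths.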

thakkarV commented 4 weeks ago

Did the example I linked not help? You do not have to use cmake yourself - you can just copy the flags that CUTLASS CMake generates on windows and use those in your project. I suspect you are likely missing some include path OR you are not building with NVCC.

Please do not copy code out of the markdown documentation as that is not guarded by CI. Anything from the example or test dir should just work.

thakkarV commented 4 weeks ago

Actually yes, MSVC should never be able to see threadIdx during compilation. That should only exist in device code

thakkarV commented 4 weeks ago

Maybe @mhoemmen can help

mrakgr commented 4 weeks ago

"You do not have to use cmake yourself - you can just copy the flags that CUTLASS CMake generates on windows and use those in your project."

I see, I didn't realize that is what you meant. Hmm, I'll have to figure out how to find those flags.

mrakgr commented 4 weeks ago

Ok, I see them. Not in the project you've linked me to, but in the CMake build of Cutlass that I did yesterday. All the examples have a bunch of .vcxproj files in their directories, and they do run. I mean, the first one that I tested does, but I guess the other executables would as well.

I understand what I should be doing now.

mrakgr commented 4 weeks ago

The only issue that I have is that looking at the .vcxproj files is overwhelming: there are a bunch of them, and each one is at least 10 KB of XML to go through.

I did a hello world C++ project with CMake yesterday, and I realize now that, unlike the built Cutlass examples, that particular project also has the Visual Studio .sln file in its build folder. That allows me to open the project directly in the IDE. It's too bad it's missing in the Cutlass examples, as that would allow me to open and edit them directly in the IDE. Maybe there is a way to create them...

Do you have any idea why the Cutlass CMake build script isn't producing them?

mrakgr commented 4 weeks ago

It turns out you can open the .vcxproj files directly in the IDE. So, the solution file not being there is not a problem.

It is actually possible to build the individual files from inside the IDE. Here are the build options for the first example.

>G:\cutlass-3.5.0\build\examples\00_basic_gemm>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -I"G:\cutlass-3.5.0\include" -I"G:\cutlass-3.5.0\examples\common" -I"G:\cutlass-3.5.0\build\include" -I\include -I\examples -I"G:\cutlass-3.5.0\tools\util\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include"     --keep-dir x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart shared -std=c++17 --generate-code=arch=compute_89,code=[sm_89] --generate-code=arch=compute_89,code=[compute_89] --expt-relaxed-constexpr -Xcompiler="/EHsc /Zc:__cplusplus /bigobj -Zi -Ob0 /wd4819 /fp:strict" -g  -D_WINDOWS -DCUTLASS_VERSIONS_GENERATED -DCUTLASS_ENABLE_TENSOR_CORE_MMA=1 -DCUTLASS_TEST_LEVEL=0 -DCUTLASS_TEST_ENABLE_CACHED_RESULTS=1 -DCUTLASS_CONV_UNIT_TEST_RIGOROUS_SIZE_ENABLED=1 -DCUTLASS_DEBUG_TRACE_LEVEL=0 -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd /GR" -Xcompiler "/Fd00_basic_gemm.dir\Debug\vc143.pdb" -o 00_basic_gemm.dir\Debug\basic_gemm.obj "G:\cutlass-3.5.0\examples\00_basic_gemm\basic_gemm.cu"

The only issue with this is that for some reason the Intellisense isn't working properly for the examples even though they build correctly.
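
A side note on the build options above: unlike the failing command in the first post, they pass `-std=c++17` to nvcc and `/Zc:__cplusplus` to MSVC. MSVC reports `__cplusplus` as 199711L unless that switch is given, and CUTLASS 3.x needs C++17 as far as I know. Here is a small sketch of a guard that could sit at the top of a source file so that a wrong language mode fails with one readable message and not hundreds of template errors. The names `kCppStandard` and `has_cpp17` are made up for this illustration.

```cpp
// MSVC keeps __cplusplus at 199711L unless /Zc:__cplusplus is passed,
// so prefer _MSVC_LANG when the compiler provides it.
#if defined(_MSVC_LANG)
constexpr long kCppStandard = _MSVC_LANG;
#else
constexpr long kCppStandard = __cplusplus;
#endif

// Hypothetical helper: true when this translation unit is built as C++17 or newer.
constexpr bool has_cpp17() { return kCppStandard >= 201703L; }

static_assert(has_cpp17(),
              "compile this file as C++17 or newer (e.g. nvcc -std=c++17)");
```

Whether `_MSVC_LANG` is visible in every nvcc compilation pass is something I have not verified, which is why the sketch falls back to `__cplusplus`.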

mrakgr commented 4 weeks ago

The solution to the Intellisense issue is to just take the directories in Cuda C/C++ -> Additional Include Directories

image

And then paste them into VC++ Directories -> Include Directories like I just did here.

image

The only other problem is that it cannot recognize Cuda built-ins like threadIdx.

image

To resolve that you can add #include "device_launch_parameters.h" at the top of the .cu file.

image

I am surprised it is even compiling without that header. Anyway, it works, and I've met my goal of getting it to work on Windows and Visual Studio. I should be able to actually study the library properly from here on out.
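
On why it compiles without that header: nvcc implicitly includes cuda_runtime.h at the top of every .cu file, so only the editor's parser is missing the declarations. One way to keep the extra include out of the real build is to guard it with `__INTELLISENSE__`, which Visual Studio defines only during an IntelliSense pass. A sketch follows; the `kEditorOnlyIncludes` marker is hypothetical and exists only to make the two branches visible.

```cpp
// Top of a .cu file. __INTELLISENSE__ is defined only while the Visual Studio
// editor parses the file, so real nvcc builds skip the includes below.
#ifdef __INTELLISENSE__
#include "cuda_runtime.h"
#include "device_launch_parameters.h"  // declares threadIdx, blockIdx, blockDim, gridDim
constexpr bool kEditorOnlyIncludes = true;   // hypothetical marker, illustration only
#else
constexpr bool kEditorOnlyIncludes = false;  // ordinary compiler pass
#endif
```

Including device_launch_parameters.h unconditionally, as described above, works just as well; the guard only documents that the include is there for the editor.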

image

The only nitpick that I still have is that for some reason opening the 00_basic_gemm.vcxproj file opens literally every single project in the example folder as you can see in the screenshot above which defeats the point of them having individual project files. Maybe that could be something you should look into in the CMake files?

I'll close this here. Thank you for the help.

mrakgr commented 3 weeks ago

It's probably opening all the project files because, rather than opening the individual project, it is loading the solution file in the outer directory.

image

You can see it right here. This answers the question why the solution file is not being generated for the individual examples. It's because only a single one is being generated for the entire project.

mrakgr commented 1 week ago

The video is out now, though I think that reading through this issue is an easier way to get a sense of what should be done. The sheer amount of stuff CMake produces made my brain freeze, so I did need some help to nudge me in the right direction. Thanks again.

More than just getting started with it, I'll probably make a video on using Cutlass when the 50xx cards come out and I need to upgrade the matrix multiply for the ML library.

mhoemmen commented 19 hours ago

@mrakgr Thank you for your interest and engagement!

https://github.com/NVIDIA/cutlass/blob/main/media/docs/build/building_in_windows_with_visual_studio.md gives instructions for running CMake and building on Windows, both from the command line. (I recommend the git bash shell, but others should work as well.) You can even build a specific target, e.g., by replacing <TARGET_NAME> in the command below with the path to the target's .vcxproj file (not including the .vcxproj extension).

cmake --build . --config Debug -j 4 -t <TARGET_NAME>

If I find myself needing an IDE, I just load up the CMake-generated CUTLASS.sln file in Visual Studio.