cloudcores / CuAssembler

An unofficial cuda assembler, for all generations of SASS, hopefully :)
MIT License

Insufficient basis, try CuAsming more instructions! #12

Closed rindlespot closed 1 year ago

rindlespot commented 1 year ago
2023-01-19 00:42:36,214 -          - Running CuAsmParser.parse...
2023-01-19 00:42:36,215 -    ENTRY -     Parsing file BuildSteps.sm_75.cuasm
2023-01-19 00:42:36,224 -          -     Running CuAsmParser.__preScan...
2023-01-19 00:42:37,001 -  WARNING - Line 19808: Weak symbol found! The implementation is not complete, please be cautious...
2023-01-19 00:42:37,005 -          -     Running CuAsmParser.__gatherTextSectionSizeLabel...
2023-01-19 00:42:37,006 -          -     Running CuAsmParser.__buildInternalTables...
2023-01-19 00:42:37,007 -          -     Running CuAsmParser.__evalFixups...
2023-01-19 00:42:37,009 -          -     Running CuAsmParser.__parseKernels...
2023-01-19 00:42:37,013 -     PROC -         Parsing kernel text of ".text._Z10DoEatStepsILi3EEvi"...
2023-01-19 00:42:37,057 -    ERROR - Assertion failed in:
    File BuildSteps.sm_75.cuasm:18452 :
        [B------:R-:W-:-:S02]         /*0d80*/              @!P0 IADD3.X R2, P1, R2, R15, RZ, P1, !PT ;
    Error when assembling instruction "[B------:R-:W-:-:S02] @!P0 IADD3.X R2, P1, R2, R15, RZ, P1, !PT ;":
        Assembling failed (NewVals): Insufficient basis, try CuAsming more instructions!
    Known Records:
        IADD3.X R4, P1, R17, R4, RZ, P1, !PT ;
        IADD3.X R16, P0, R5, R4, RZ, P0, !PT ;
        IADD3.X R2, P0, RZ, R0, RZ, P0, !PT ;
        IADD3.X R4, P0, RZ, R4, RZ, P0, !PT ;
        IADD3.X RZ, P0, R4, R4, RZ, P0, !PT ;
        IADD3.X R2, P0, R3.reuse, R3, RZ, P0, !PT ;
        IADD3.X R13, P1, RZ, R4, RZ, P0, !PT ;
        IADD3.X R44, P5, R5, ~R28, RZ, P5, !PT ;
        IADD3.X R24, P3, ~R0, R37, RZ, P0, !PT ;

How do I "CuAsm more instructions"?

BuildSteps.sm_75.cuasm is the unmodified output from cuasm.cmd BuildSteps.sm_75.cubin.

FYI: The 'weak' symbol is .weak $_Z11DoInitFirstj$__cuda_sm20_rem_u64.

rindlespot commented 1 year ago

SOLVED

From the error message, I assumed that something about my code needed to process more instructions at a time in order to be able to compute the correct cubin output. But that's not the problem at all.

Since there is no official reference for the SASS language, cuasm builds its tables for converting SASS text to bytecode by looking at disassembled SASS files. The author has combined everything he's found into InsAsmRepos/DefaultInsAsmRepos.sm_??.txt, but (obviously) anything he hasn't encountered yet isn't there. If your code uses SASS instructions or modifiers that aren't in that file, this is one of the errors you get.
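To see what is "new" about a failing instruction, it can help to compare it against the Known Records from the error message. The snippet below is only a rough sketch of that comparison: `shape` and `split_guard` are hypothetical helpers I made up for illustration, not part of CuAssembler, and this is not how CuAssembler itself decides whether its basis is sufficient. The instructions are copied from the log above.

```python
import re

# Copied from the error message in this issue.
FAILING = "@!P0 IADD3.X R2, P1, R2, R15, RZ, P1, !PT ;"
KNOWN = [
    "IADD3.X R4, P1, R17, R4, RZ, P1, !PT ;",
    "IADD3.X R16, P0, R5, R4, RZ, P0, !PT ;",
    "IADD3.X R2, P0, RZ, R0, RZ, P0, !PT ;",
    "IADD3.X R4, P0, RZ, R4, RZ, P0, !PT ;",
    "IADD3.X RZ, P0, R4, R4, RZ, P0, !PT ;",
    "IADD3.X R2, P0, R3.reuse, R3, RZ, P0, !PT ;",
    "IADD3.X R13, P1, RZ, R4, RZ, P0, !PT ;",
    "IADD3.X R44, P5, R5, ~R28, RZ, P5, !PT ;",
    "IADD3.X R24, P3, ~R0, R37, RZ, P0, !PT ;",
]


def split_guard(ins):
    """Split a leading guard predicate (e.g. '@!P0') from the instruction body.

    Hypothetical helper: returns (guard, body), with guard None if absent.
    """
    m = re.match(r"(@!?\w+)\s+(.*)", ins)
    return (m.group(1), m.group(2)) if m else (None, ins)


def shape(ins):
    """Reduce an instruction to a coarse 'shape' by hiding register numbers.

    Hypothetical heuristic: R2 -> R#, P1 -> P#; RZ, PT and modifiers stay.
    """
    ins = ins.rstrip(" ;")
    ins = re.sub(r"\bR\d+\b", "R#", ins)
    ins = re.sub(r"\bP\d+\b", "P#", ins)
    return ins


known_shapes = {shape(k) for k in KNOWN}
guard, body = split_guard(FAILING)

print("guard on failing instruction:", guard)
print("body shape seen in known records:", shape(body) in known_shapes)
print("known records with a guard:", sum(split_guard(k)[0] is not None for k in KNOWN))
```

With the records from this issue, the body of the failing instruction matches shapes that are already known, while none of the known records carries a guard predicate. That suggests, but does not prove, that the guarded form is what the tables are missing.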

But you can use cuasm to add more instructions to those tables (that's what the message means). There are directions here that explain how to use cuobjdump and CuInsAssemblerRepos to read the SASS instructions exported from your own code and update the repository with them.

To be clear: while the link talks about getting information from cublas64_11, if you're getting this message you'll want to run it against a cuobjdump of your own code, since that's where the unmapped instructions are.
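For anyone landing here later, a minimal sketch of those two steps follows. The `cuobjdump -sass` invocation is the standard way to print SASS from a cubin. The CuAsm names in `update_repos` (`CuInsFeeder`, `CuInsAssemblerRepos`, `update`, `save2file`) are written from my memory of the project's directions and should be treated as assumptions: check them against your checkout before relying on them. The output file names are placeholders.

```python
import shutil
import subprocess


def dump_sass_cmd(cubin):
    """Build the cuobjdump command line that prints the SASS of a cubin."""
    return ["cuobjdump", "-sass", cubin]


def dump_sass(cubin, sass_out):
    """Run cuobjdump on your own cubin and write the SASS listing to sass_out."""
    if shutil.which("cuobjdump") is None:
        raise RuntimeError("cuobjdump not found on PATH")
    with open(sass_out, "w") as f:
        subprocess.run(dump_sass_cmd(cubin), stdout=f, check=True)


def update_repos(sass_file, repos_out):
    """Feed the dumped SASS to a CuAssembler instruction repository.

    ASSUMPTION: module, class and method names below are recalled from the
    CuAssembler directions and are not verified here. This starts from an
    empty repository; extending the shipped DefaultInsAsmRepos file instead
    is left to the linked directions.
    """
    # Imported lazily so the rest of the sketch runs without CuAsm installed.
    from CuAsm.CuInsAssemblerRepos import CuInsAssemblerRepos
    from CuAsm.CuInsFeeder import CuInsFeeder

    feeder = CuInsFeeder(sass_file)   # reads instructions from the SASS dump
    repos = CuInsAssemblerRepos()     # empty repository (assumed default)
    repos.update(feeder)              # learn the encodings found in the dump
    repos.save2file(repos_out)        # write the resulting tables to disk
```

In my case that would be `dump_sass("BuildSteps.sm_75.cubin", "BuildSteps.sm_75.sass")` followed by `update_repos("BuildSteps.sm_75.sass", "MyRepos.sm_75.txt")`, where both output names are made up for the example.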