daadaada / turingas

Assembler for NVIDIA Volta and Turing GPUs
MIT License
201 stars, 40 forks

Is there any plan to develop an instruction scheduler for turingas? #4

Closed: xiaocenxiaocen closed this issue 4 years ago

xiaocenxiaocen commented 4 years ago

As in maxas, the user could insert a code snippet between <SCHEDULE_BLOCK> and </SCHEDULE_BLOCK>, and the scheduler would return an optimized instruction sequence. The user would then not need to care about the stall counts of the instructions inside schedule blocks. This could be useful when writing SASS code.

daadaada commented 4 years ago

While the algorithm to set stall cycles is not complex, I do not have the time and motivation to add support for "scheduler_block". I'd rather leave this task to high-level compilers.

You may set the stall cycles to the instruction latency to guarantee the program's correctness. In many cases, the latency can be hidden by switching to other warps.
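
The conservative approach described above could be sketched roughly as follows. This is only an illustration, not part of turingas: the function `conservative_stalls`, the latency table, and the fallback value are all hypothetical, and the latency numbers are placeholders rather than measured figures. The cap of 15 assumes a 4-bit stall field in the control code. It also only makes sense for fixed-latency instructions; variable-latency instructions such as memory loads rely on dependency barriers rather than stall counts.

```python
# Hypothetical fixed latencies in cycles. Placeholder values for illustration,
# not measurements of any particular GPU.
ASSUMED_LATENCY = {"MOV": 4, "IADD3": 4, "FFMA": 4, "IMAD": 5}
DEFAULT_LATENCY = 6   # assumed fallback for opcodes missing from the table
MAX_STALL = 15        # assumes the stall field is 4 bits wide


def conservative_stalls(instructions):
    """Pair each instruction with a stall count equal to its assumed latency.

    Every instruction waits out its full latency before the next one issues,
    so no dependency analysis is needed. This trades performance for safety.
    """
    result = []
    for text in instructions:
        # Skip a leading predicate such as "@P0" or "@!P1" to find the opcode.
        tokens = [t for t in text.split() if not t.startswith("@")]
        # Drop modifiers such as ".FTZ" so "FFMA.FTZ" looks up as "FFMA".
        opcode = tokens[0].split(".")[0]
        latency = ASSUMED_LATENCY.get(opcode, DEFAULT_LATENCY)
        result.append((min(latency, MAX_STALL), text))
    return result


program = [
    "MOV R1, R2;",
    "IMAD R3, R1, R1, R4;",
    "FFMA.FTZ R5, R3, R3, R3;",
]
scheduled = conservative_stalls(program)
for stall, text in scheduled:
    print(f"stall={stall:2d}  {text}")
```

A real scheduler would lower the stall count wherever the next instruction does not depend on the result; this sketch skips that analysis and leaves the latency hiding to warp switching, as suggested in the comment above.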