Closed apuaaChen closed 3 years ago
A bit of context: in maxas I remapped the barrier values to a 1 based index to make the control code notation a bit easier to read and work with.
@scott-gray Thanks for the context :p @apuaaChen Does this solve your problem?
@scott-gray Thanks for the context :p @apuaaChen Does this solve your problem?
Hi! Thank you for your reply. However, in the example there is
--:-:1:-:2 LDG.E tmp, [input_0];
02:-:-:-:2 STG.E [output_0], tmp;
It seems that barrier index 1 is associated with the second bit in the wait barrier mask. So if the index were 1-based, this would be a little counter-intuitive. By the way, I also tried
--:-:0:-:2 LDG.E tmp, [input_0];
01:-:-:-:2 STG.E [output_0], tmp;
after the modification, which provides the correct result.
I think your suggestion makes sense. I have made the barrier index 0-based. Thanks for the feedback.
I think your suggestion makes sense. I have made the barrier index 0-based. Thanks for the feedback.
Thank you!
Hi. I find that there might be a small typo in the grammar.py related to the read & write barrier index.
At line 389, the original code is
ctrl_re = r'(?P<ctrl>[0-9a-fA-F\-]{2}:[1-6\-]:[1-6\-]:[\-yY]:[0-9a-fA-F\-])'
, which suggests that the barriers are indexed by 1~6. However, I found that there is no corresponding Wait barrier mask for barrier index 6.After checking some sass code returned by the cuobjdump, I observed that barrier index 0 is usually used. So maybe its better to replace line 389 with
ctrl_re = r'(?P<ctrl>[0-9a-fA-F\-]{2}:[0-5\-]:[0-5\-]:[\-yY]:[0-9a-fA-F\-])'
.