Closed masa-laboratory closed 2 months ago
nice feature
lol.
- Why not just grep the first warp from the trace file? Is there any difference?
- The address looks weird. Shouldn't it be 64-bit?
- Is there only 1 warp in the example you show? What happens when there are multiple warps? I don't see warp id referenced in your code.
- There is too much duplicate code. Please reuse the original code and create functions if possible.
Grepping the traces of the first warp will produce some duplicate instructions. An example:
In this simple case, the kernel _Z7mat_mulPfS_S_ has 4 warps. If we grep the traces of the first warp (ctaid_x=0, ctaid_y=0, ctaid_z=0, warp_id=0), we will get 2 sequences for the PC 0x00f0.
If we only want the instruction corresponding to a single PC, instrset.csv can be used as a lookup table, because it has only one sequence for each unique PC.
But there is a problem. The addresses in the LD/ST instructions executed by each warp may differ, and so may the immediate values of other instructions. My suggestion is that users consult instrset.csv for the instruction opcode, register numbers, etc., and not rely on the address or immediate value. For example, I may want the opcode for a certain PC value to determine which execution unit it should be issued to, or the register numbers of an instruction to calculate the bank IDs.
There are 4 warps in the above case. As described in 2, instrset.csv is only a lookup table, so there are no warp_id references.
Indeed, the code needs to be improved. The address output format also has some issues.
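To make the lookup-table idea above concrete, here is a minimal sketch of keeping one entry per unique (kernel_id, PC). It is not the code in this PR: the TraceRecord layout, the InstrSet alias, and build_instr_set are hypothetical names used only for illustration, and the SASS strings are made-up sample data.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one dynamically executed instruction.
// The real trace format carries more fields (CTA id, warp id, addresses, ...).
struct TraceRecord {
    int kernel_id;
    uint32_t pc;
    std::string sass;  // instruction text, e.g. opcode and register operands
};

// Lookup table keyed by (kernel_id, pc), one entry per unique pair.
using InstrSet = std::map<std::pair<int, uint32_t>, std::string>;

// Keep only the first record seen for each (kernel_id, pc).
// Later records with the same key (other warps, loop iterations) are dropped,
// so addresses or immediates in the stored text reflect that first record only.
InstrSet build_instr_set(const std::vector<TraceRecord>& trace) {
    InstrSet set;
    for (const TraceRecord& r : trace) {
        // emplace does nothing if the key is already present
        set.emplace(std::make_pair(r.kernel_id, r.pc), r.sass);
    }
    return set;
}
```

Because only the first occurrence is stored, the table is suitable for looking up opcodes and register numbers, which matches the caveat about addresses and immediates above.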
Okay, now I understand what you want to do. You can get the PC with cuobjdump, but not the registers. So I guess this can be helpful, even though I would probably just grep the instruction.
But I cannot accept it as it is.
Hi,
I have a similar system to the one proposed here (but with different purposes), and I think I can provide useful insights.
I think it is better to record the instruction information inside instrument_function_if_needed with instr->getSass() rather than later in recv_thread_fun. With the approach I propose, more information is recorded (such as immediates or other kinds of registers). Moreover, it will be agnostic to the MREF changes.
Regarding the PC, I agree with JRPan. Inside a kernel (.traceg), there can be equal PCs belonging to different functions. If you want to do it properly, you need to add (int)instr->getOffset() to (uint64_t)nvbit_get_func_addr(f). However, you may end up with some big numbers that look weird. To solve that problem, you can keep a map from each function address (addr_func) to a unique_function_id number and use it for conversions. Later, in the output file, you print that association along with the (unique_function_id, vpc) pair.
By the way, if you also want the ICache to avoid false hits during simulation, you will also have to build the access address as (int)instr->getOffset() + (uint64_t)nvbit_get_func_addr(f).
Hopefully, I will someday finish what I'm working on in my private repo and try to merge it.
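The function-address idea above could be sketched roughly as follows. This is only an illustration under assumptions: global_pc and FuncIdTable are hypothetical names, and the inputs are plain integers standing in for the values the comment obtains from nvbit_get_func_addr(f) and instr->getOffset(), so the sketch does not call NVBit.

```cpp
#include <cstdint>
#include <unordered_map>

// Global PC: function base address plus the instruction's offset in that function.
// In a tracer tool the two inputs would come from NVBit; here they are plain integers.
inline uint64_t global_pc(uint64_t func_addr, int offset) {
    return func_addr + static_cast<uint64_t>(offset);
}

// Hypothetical helper that assigns small consecutive IDs to function base
// addresses, so the output can show (unique_function_id, vpc) instead of a
// large 64-bit address.
class FuncIdTable {
public:
    // Return the ID for func_addr, allocating the next free ID on first use.
    uint32_t id_for(uint64_t func_addr) {
        auto it = ids_.find(func_addr);
        if (it != ids_.end()) {
            return it->second;
        }
        uint32_t id = static_cast<uint32_t>(ids_.size());
        ids_.emplace(func_addr, id);
        return id;
    }

private:
    std::unordered_map<uint64_t, uint32_t> ids_;
};
```

The (unique_function_id, vpc) pair stays short in the output, while global_pc gives the kind of distinct address mentioned for the ICache case.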
Add a function to generate SASS instruction set sequences with unique kernel_id and PC values, since someone may not want to see all instructions executed by all warps, but only the sequence of instructions with unique PC values within a single kernel. As long as the DUMP_INTERSET switch is turned on, the instruction set instrset.csv can be obtained in the traces/ folder. An example of traces/instrset.csv: