Open s-jovic opened 1 month ago
@ttmtrajkovic @pavlepopovic There are two branches (both based on ppopovic/rebased_didt_tests); the first one is relevant, the second one is probably obsolete now that I added the option to zero out both inputs on any core. I will leave the info here just in case.
sjovic/rebased_didt_tests_zero_cores
This branch has a WIP implementation of forcing zero inputs (both activations and weights) on certain cores. It works only for the LM head matmul, and in its current state every other row of cores is set to receive zeros for activations and weights.
The idea is to zero out the CB (circular buffer) each time one block of the input is received. For matmul 1D with in0 mcast, 3 kernels needed to be updated:
1. reader_bmm_tile_layout_in0_sender_padding.cpp - kernel that reads in0 from L1 interleaved and mcasts to all other cores; executes only on core (0, 0)
2. reader_bmm_tile_layout_in0_receiver.cpp - kernel that receives in0 from core (0, 0); goes on all other cores
3. reader_bmm_tile_layout_in1_sender_writer_padding.cpp - kernel that reads in1 from DRAM; goes on all cores

In kernels 2 and 3, we check if we are on the right row, and if so we zero out the CB each time. Kernel 1 currently has the update commented out, as we don't target the first row.
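The row check described above can be modelled in a few lines of Python. This is only a sketch of the selection logic, not the kernel code: it assumes that "every other row" means the odd rows (which is consistent with the first row not being targeted), and the helper names and the grid size are hypothetical.

```python
def rows_receiving_zeros(grid_rows):
    """Rows of cores that get zeroed inputs.

    Assumption: every other row starting from row 1, so row 0
    (which contains the in0 sender core (0, 0)) is never targeted.
    """
    return [y for y in range(grid_rows) if y % 2 == 1]


def should_zero_cb(core_y, zero_rows):
    """Model of the per-core check: zero the CB only on targeted rows."""
    return core_y in zero_rows


# Hypothetical grid with 4 rows of cores.
zero_rows = rows_receiving_zeros(grid_rows=4)
for y in range(4):
    print(f"row {y}: zero CB = {should_zero_cb(y, zero_rows)}")
```

In the branch itself this choice is made inside the kernels, so changing the targeted rows means editing the kernel code.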
The solution is super-hacky and super-specific to the LM head repro example and Bfp8. For this to work for our matmul 2D examples, the kernels used for matmul 2D need to be updated accordingly. Also, the choice of which rows/columns/cores to zero out currently has to be made directly in the code.
sjovic/rebased_didt_tests_zero_inputs
There are 4 options added to the python tests:
--activations-zero-percentage [integer] - percentage of zeros to be generated for the activations tensor
--weights-zero-percentage [integer] - percentage of zeros to be generated for the weights tensor
--zero-columns [list of integers separated by comma] - columns of cores whose output should be zero; part of the activations/weights is zeroed out to achieve that
--zero-rows [list of integers separated by comma] - same, only for rows
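As a rough illustration of what the options above imply, here is a minimal sketch of parsing a comma-separated list and of zeroing a given percentage of a flat list of values. The function names are hypothetical, and picking the zeroed positions at random is an assumption; the branch may choose them differently.

```python
import random


def parse_int_list(value):
    """Parse a value such as "1,2" into [1, 2]."""
    return [int(item) for item in value.split(",") if item.strip()]


def zero_out_percentage(values, percentage, seed=0):
    """Return a copy of values with `percentage` percent of entries set to 0.

    Assumption: positions are chosen at random (seeded for repeatability)
    and the number of zeroed entries is rounded down.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    rng = random.Random(seed)
    count = len(values) * percentage // 100
    result = list(values)
    for idx in rng.sample(range(len(values)), count):
        result[idx] = 0
    return result


print(parse_int_list("1,2"))
print(zero_out_percentage([1] * 10, 30))
```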
Matmul 1D - to achieve both zero columns and zero rows, we need to zero out a part (or multiple non-consecutive parts) of the weights tensor
Matmul 2D - to achieve zero columns, we need to zero out parts of weights; conversely, to get zero rows, we need to zero out parts of activations
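The matmul 2D case can be checked on a toy example. The sketch below assumes the output is split evenly over a grid of cores, with rows of cores owning slices of the output height and columns of cores owning slices of the output width; all helper names and sizes are hypothetical and plain Python lists stand in for tensors. The matmul 1D mapping depends on how cores are ordered along the output width, so it is not sketched here.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def zero_weight_columns(weights, core_columns, grid_cols):
    """Zero the weight columns that feed the given columns of cores."""
    per_core = len(weights[0]) // grid_cols  # assumes an even split
    zeroed = [row[:] for row in weights]
    for c in core_columns:
        for row in zeroed:
            for j in range(c * per_core, (c + 1) * per_core):
                row[j] = 0
    return zeroed


def zero_activation_rows(activations, core_rows, grid_rows):
    """Zero the activation rows that feed the given rows of cores."""
    per_core = len(activations) // grid_rows  # assumes an even split
    zeroed = [row[:] for row in activations]
    for r in core_rows:
        for i in range(r * per_core, (r + 1) * per_core):
            zeroed[i] = [0] * len(zeroed[i])
    return zeroed


def core_block(out, r, c, grid_rows, grid_cols):
    """Output block owned by the core in row r, column c of the grid."""
    bm = len(out) // grid_rows
    bn = len(out[0]) // grid_cols
    return [row[c * bn:(c + 1) * bn] for row in out[r * bm:(r + 1) * bm]]


activations = [[1, 2], [3, 4], [5, 6], [7, 8]]  # 4 x 2
weights = [[1, 2, 3, 4], [5, 6, 7, 8]]          # 2 x 4

# Zero column 1 of a 2 x 2 grid by zeroing part of the weights.
out_cols = matmul(activations, zero_weight_columns(weights, [1], 2))
# Zero row 0 of the grid by zeroing part of the activations.
out_rows = matmul(zero_activation_rows(activations, [0], 2), weights)

print(core_block(out_cols, 0, 1, 2, 2))
print(core_block(out_rows, 0, 0, 2, 2))
```

Every core in the zeroed column (or row) ends up with an all-zero output block, while the remaining cores keep their original results.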
example usage:
WH_ARCH_YAML=wormhole_b0_80_arch_eth_dispatch.yaml pytest tests/didt/test_sharded_ff1.py -k "test_determinism and 2chips" --zero-columns 1,2