Closed dereklai1 closed 3 months ago
@dereklai1
Hi, we found a bug in the yml files that stopped CI from running on forked repos. It has now been fixed and merged upstream. Could you merge the upstream main branch into this PR? That should trigger the CI properly.
Thanks,
Hardware Normalization (WIP)
Note: There are a bunch of unrelated changes, mainly in `mase_components/matmul/` (duplicate of #53), due to our rebase onto the HEAD of the private repo `mase-tools`. Every relevant change to the coursework is detailed below.

Software:
- Added `BatchNorm2d`, `LayerNorm`, `GroupNorm`, `InstanceNorm2d`, and `RMSNorm` into `QUANTIZABLE_OP`.
- `BatchNorm2dInteger`, `LayerNormInteger`, `GroupNormInteger`, `InstanceNorm2dInteger`, `RMSNormInteger`.
- Updated `emit_internal`, `emit_logicnets`, etc. to delete the redundant `INTERNAL_RTL_DEPENDENCIES` dictionary and replace it with `INTERNAL_COMP`.
- Updated the `emit_tb` pass to use string templating instead of dill'ing a testbench class.
- `test_emit_verilog_norm.py`, which tests all types of normalization layer hardware generation starting from PyTorch NNs. (`RMSNorm` is currently not working due to a circular import problem.)

Hardware:
- `fixed_isqrt`, which uses Newton-Raphson iteration to compute inverse square roots.
- `batch_norm_2d`, which implements batch normalization including affine transformations.
- `group_norm_2d`, which can be specialized to implement group norm, layer norm, and instance norm without any affine transforms.
- `norm` for all normalization layers. The type of normalization can be chosen through parameters.
- `ErrorThresholdStreamMonitor` stream monitor, which checks that the output value is within a certain threshold of last-bit errors.
- `norm_synth_impl.tcl` Vivado script and `alveo-u250-norm.xdc` to run synthesis and implementation automatically and generate resource usage and timing reports for various parameterizations.
- Added `simulate_pass()` into `mase_cocotb.runner` to support running MASE-generated hardware projects.
- Behavior models for `group_norm_2d` and `fixed_isqrt`, which can be used to check hardware errors. Note that the quantized software layers are used in the respective layers and not the hardware behavior models.

Minor Changes:
- `matrix_fifo`, which combines `matrix_flatten`, `fifo` and `matrix_unflatten`.
- `fifo` and `repeat_circular_buffer`.
- Updated `fixed_signed_cast_tb` to extract out the software model.
- `mase_cocotb.monitor` and `mase_cocotb.driver`.
- `mase_cocotb.runner` and fixed compile options for Verilator.
- `mase_cocotb.utils`.
- `quantizers.integer`.

Further Work:
- `fixed_accumulator` module.
- `RMSNorm` integration in metadata passes.
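
For reviewers unfamiliar with the Newton-Raphson approach that `fixed_isqrt` is described as using, here is a minimal software sketch of the iteration in fixed-point arithmetic. It is not the RTL in this PR: the function name `isqrt_newton`, the fractional width, the power-of-two initial guess, and the iteration count are all assumptions chosen for illustration.

```python
def isqrt_newton(x_fixed: int, frac_bits: int = 16, iterations: int = 8) -> int:
    """Approximate 1/sqrt(x) for a positive fixed-point input.

    x_fixed holds x * 2**frac_bits. The result holds y * 2**frac_bits
    with y approximately equal to 1/sqrt(x).
    (Hypothetical helper; widths and iteration count are illustrative.)
    """
    if x_fixed <= 0:
        raise ValueError("input must be positive")

    one = 1 << frac_bits
    three = 3 * one

    # floor(log2(x)) from the position of the most significant bit.
    exp = (x_fixed.bit_length() - 1) - frac_bits

    # Initial guess y0 = 2**-k with k = ceil((exp + 1) / 2), which keeps
    # x * y0**2 in [0.25, 1) so the iteration converges from below.
    k = (exp + 2) // 2
    y = one >> k if k >= 0 else one << -k

    for _ in range(iterations):
        # Newton-Raphson step for f(y) = 1/y**2 - x:
        #   y_next = y * (3 - x * y**2) / 2
        xy2 = (x_fixed * y * y) >> (2 * frac_bits)
        y = (y * (three - xy2)) >> (frac_bits + 1)
    return y


# Usage: 1/sqrt(4.0) in Q16, converted back to a float (close to 0.5).
print(isqrt_newton(4 << 16) / (1 << 16))
```

The step needs only multiplies, a subtract, and shifts, which is presumably why this style of iteration suits hardware. Truncating shifts mean the result can differ from the exact value in the last bit or so.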
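
The claim that `group_norm_2d` can be specialized into layer norm and instance norm can be illustrated in software. The sketch below is pure Python and is not the behavior model from this PR. It assumes the specialization is done through the group count (one group for layer norm over all channel and spatial elements, one group per channel for instance norm), which is the usual relationship but is not confirmed by the description above. All function names are hypothetical.

```python
from statistics import fmean


def group_norm(x, num_groups, eps=1e-5):
    """Normalize one sample without affine parameters.

    x is a list of channels, each a flat list of spatial values.
    Channels are split into contiguous groups, and each group is
    normalized with its own mean and biased variance.
    """
    channels = len(x)
    if channels % num_groups != 0:
        raise ValueError("channels must be divisible by num_groups")
    per_group = channels // num_groups

    out = []
    for g in range(num_groups):
        rows = x[g * per_group:(g + 1) * per_group]
        flat = [v for row in rows for v in row]
        mean = fmean(flat)
        var = fmean([(v - mean) ** 2 for v in flat])
        inv_std = (var + eps) ** -0.5  # the inverse square root step
        out.extend([[(v - mean) * inv_std for v in row] for row in rows])
    return out


def layer_norm(x, eps=1e-5):
    # Assumption: a single group spanning every channel.
    return group_norm(x, num_groups=1, eps=eps)


def instance_norm(x, eps=1e-5):
    # Assumption: one group per channel.
    return group_norm(x, num_groups=len(x), eps=eps)


sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
print(instance_norm(sample)[0])  # first channel, roughly [-1, 1]
```

Under this assumption a single datapath covers all three layer types, with only the group size changing between them.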