This PR does two things:

- Fix the number-of-parameters calculation in `create_config_mamba.py`.
- Add `model_flops_per_s` to the `get_flops_per_sec` method of the `MambaModel` class.

Here is a simple comparison between the formulae used in `get_flops_per_sec` (estimated FLOPs) and the PyTorch `flop_counter` utility (exact FLOPs) for different model sizes. The x-axis is `d_model` and the y-axis is TFLOPS.
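For reviewers who want to sanity-check the parameter count by hand, here is a minimal sketch of how it could be computed. It is not the code in `create_config_mamba.py`: it assumes the layer layout of the reference Mamba block (bias-free `in_proj`, `x_proj` and `out_proj`, a depthwise `conv1d` with bias, `dt_rank = ceil(d_model / 16)`), one norm weight per layer plus a final norm, and tied input/output embeddings. The function names and defaults are hypothetical.

```python
import math


def mamba_block_params(d_model, d_state=16, d_conv=4, expand=2, dt_rank=None):
    """Parameters in one Mamba mixer block (assumed reference-style layout)."""
    d_inner = expand * d_model
    if dt_rank is None:
        dt_rank = math.ceil(d_model / 16)  # assumed default, as in the reference block

    in_proj = d_model * 2 * d_inner             # projects to x and z branches, no bias
    conv1d = d_inner * d_conv + d_inner         # depthwise conv weight + bias
    x_proj = d_inner * (dt_rank + 2 * d_state)  # produces dt, B and C, no bias
    dt_proj = dt_rank * d_inner + d_inner       # weight + bias
    a_log = d_inner * d_state                   # state matrix A (stored as log)
    d_skip = d_inner                            # skip-connection vector D
    out_proj = d_inner * d_model                # no bias
    return in_proj + conv1d + x_proj + dt_proj + a_log + d_skip + out_proj


def mamba_model_params(d_model, n_layers, vocab_size, **block_kwargs):
    """Total parameters, assuming tied embeddings and one norm weight per layer."""
    per_layer = mamba_block_params(d_model, **block_kwargs) + d_model  # block + norm
    embedding = vocab_size * d_model  # shared with the output head (assumption)
    final_norm = d_model
    return n_layers * per_layer + embedding + final_norm


if __name__ == "__main__":
    for d_model, n_layers in [(768, 24), (1024, 48)]:
        total = mamba_model_params(d_model, n_layers, vocab_size=50280)
        print(f"d_model={d_model} n_layers={n_layers} params={total / 1e6:.1f}M")
```

If the config in this repository uses biases, untied embeddings, or a different `dt_rank`, the individual terms need to be adjusted accordingly.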