Add FP16 and MMA instructions as fault injection sites.

Added two new fault injection sites: FP16 and MMA instructions now have their own groups (G_FP16 and G_MMA) instead of being covered by G_OTHERS.

To support these two instruction types, I modified the following files:

common/arch.h -> Added the enum values that correspond to G_FP16 and G_MMA.
common/globals.h -> Added const arrays that contain the identification of fp16 and mmaInst, and removed the corresponding fp16 and mma entries from otherInst.
injector/Makefile and profiler/Makefile -> Upgraded the default SM version to sm_70, as newer NVCC compilers no longer support Kepler. Older architectures can still be used by setting the SM version explicitly.
profiler/inject_funcs.cu -> Added support for __ballot_sync on architectures with __CUDA_ARCH__ >= 700.
scripts/params.py -> Updated the parameters file to reflect the changes in arch.h and globals.h.
test-apps/simple_cublas -> Added a toy example to test FP16 and MMA fault injection using cuBLAS.
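
For reviewers, the idea behind the arch.h and globals.h changes can be sketched roughly as below. This is only an illustration, not the code in this PR: the opcode lists, the array names kFp16Inst and kMmaInst, and the helpers contains and classify are hypothetical. Only the group names G_FP16, G_MMA, and G_OTHERS come from the change itself.

```cpp
#include <cstddef>
#include <cstring>

// Group identifiers. Only these three names appear in the PR description;
// the real enum in common/arch.h contains more groups.
enum InstGroup { G_FP16, G_MMA, G_OTHERS };

// Illustrative opcode lists (assumption): a few SASS half-precision and
// matrix-multiply-accumulate opcodes, not the arrays from common/globals.h.
static const char *const kFp16Inst[] = {"HADD2", "HMUL2", "HFMA2"};
static const char *const kMmaInst[] = {"HMMA", "IMMA"};

// Linear search for an opcode name in a fixed-size list.
template <std::size_t N>
static bool contains(const char *const (&list)[N], const char *opcode) {
    for (std::size_t i = 0; i < N; ++i) {
        if (std::strcmp(list[i], opcode) == 0) return true;
    }
    return false;
}

// Map an opcode to its group. Opcodes that are in neither list
// fall back to G_OTHERS.
static InstGroup classify(const char *opcode) {
    if (contains(kFp16Inst, opcode)) return G_FP16;
    if (contains(kMmaInst, opcode)) return G_MMA;
    return G_OTHERS;
}
```

With this split, an opcode such as HFMA2 resolves to G_FP16 rather than G_OTHERS, so it can be selected as a fault injection site on its own.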
