When calculating the workspace size for a given prim func, the lanes of the data type were not being considered, so the sizes calculated for vector dtypes such as "float32x4" were smaller than they should be. This commit includes the lanes in the calculation.
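
To illustrate the effect of including lanes, here is a minimal standalone sketch of the arithmetic. It is not the code touched by this commit: the helper names `parse_dtype` and `buffer_size_bytes` are hypothetical, and rounding each element up to whole bytes is an assumption made for the example.

```python
import math
import re


def parse_dtype(dtype):
    """Split a dtype string such as "float32x4" into (bits, lanes).

    Hypothetical helper for illustration; a missing "xN" suffix means 1 lane.
    """
    match = re.fullmatch(r"[a-z]+(\d+)(?:x(\d+))?", dtype)
    if match is None:
        raise ValueError(f"unrecognised dtype: {dtype!r}")
    bits = int(match.group(1))
    lanes = int(match.group(2)) if match.group(2) else 1
    return bits, lanes


def buffer_size_bytes(shape, dtype):
    """Bytes needed for a buffer of `shape` whose elements have type `dtype`."""
    bits, lanes = parse_dtype(dtype)
    # Each element holds `lanes` scalars of `bits` bits, rounded up to whole bytes.
    bytes_per_element = (bits * lanes + 7) // 8
    return math.prod(shape) * bytes_per_element


scalar_size = buffer_size_bytes((16,), "float32")    # 16 elements * 4 bytes
vector_size = buffer_size_bytes((16,), "float32x4")  # 16 elements * 16 bytes
```

Ignoring the lanes would give the "float32x4" buffer the same size as the "float32" one, which is a quarter of what it needs.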