For '_mm256_blendv_ps' and '_mm256_blendv_pd', the selector is
the most significant (sign) bit of each element, while the mask
is passed as a float or double SIMD register. For '-0.0', which
is stored as 0x8000..00 in the register, the sign bit is set, so
the element from the 'b' input should be selected.
The previous implementation used 'vcgeq_f32' and 'vcgeq_f64' to
derive the selector flags. These float comparisons do not treat
'-0.0' as less than '0.0', so the element from 'a' was selected
and an incorrect result was returned.
This fix switches the comparison from float intrinsics to
integer ones, so the '-0.0' case is handled correctly.
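
The difference can be illustrated with a scalar sketch of a single
float lane. This is not the code from this change: it is portable C
rather than NEON, and the helper names 'selects_b_float_compare',
'selects_b_sign_bit' and 'blend_lane' are hypothetical, introduced
only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the old approach: a float ">= 0" comparison.
 * IEEE 754 treats -0.0 as equal to 0.0, so -0.0 counts as ">= 0"
 * and 'b' is NOT selected. */
static int selects_b_float_compare(float mask)
{
    return !(mask >= 0.0f);
}

/* Hypothetical model of the integer approach: reinterpret the lane
 * as a signed 32-bit integer and test its sign. This looks only at
 * the most significant bit, which is set for -0.0 (0x80000000). */
static int selects_b_sign_bit(float mask)
{
    int32_t bits;
    memcpy(&bits, &mask, sizeof bits); /* bit copy, no value conversion */
    return bits < 0;
}

/* Scalar model of one blendv lane: take 'b' when the sign bit of
 * the mask is set, otherwise take 'a'. */
static float blend_lane(float a, float b, float mask)
{
    return selects_b_sign_bit(mask) ? b : a;
}
```

With a mask of '-0.0f', 'selects_b_float_compare' reports that 'b' is
not selected, while 'selects_b_sign_bit' reports that it is, which is
the behaviour the x86 intrinsics define.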