UNM_NUM and UNM_VEC were both implemented assuming SSE-style restrictions (2-argument form), but using AVX that doesn't have them. There's no need to copy source to destination separately - we can just vxorpd into destination.
Most occurrences of UNM_NUM/UNM_VEC followed the self-xor path, but this saves a couple instructions in trig benchmark and makes it execute ~0.1% fewer instructions (the actual runtime delta is within the noise).
UNM_NUM and UNM_VEC were both implemented assuming SSE-style restrictions (2-argument form), but using AVX that doesn't have them. There's no need to copy source to destination separately - we can just vxorpd into destination.
Most occurrences of UNM_NUM/UNM_VEC followed the self-xor path, but this saves a couple instructions in trig benchmark and makes it execute ~0.1% fewer instructions (the actual runtime delta is within the noise).