Reimplements x86 bitscan and popcnt

The previous implementation naively tested every single bit. This produced a giant chain of Ite expressions, which is probably hard on SMT solvers and, at the very least, is not very readable to me.
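
For illustration only, here is a rough Python sketch of the difference. It is not the lifter code from this change: `naive_tzcnt_ite`, `popcnt32`, and `tzcnt32` are hypothetical names, the Ite chain is built as plain text rather than with a real expression library, and the arithmetic is one well-known branchless formulation on 32-bit values that may differ in detail from what the lifter emits.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit operands, for illustration only


def naive_tzcnt_ite(width: int) -> str:
    """Render the nested if-then-else that a per-bit test produces (as text)."""
    expr = str(width)  # result when no bit is set
    for i in reversed(range(width)):
        expr = f"Ite(x[{i}], {i}, {expr})"
    return expr


def popcnt32(x: int) -> int:
    """Branchless population count using the classic parallel-sum trick."""
    x &= MASK32
    x = x - ((x >> 1) & 0x55555555)                 # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)  # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                 # 8-bit partial sums
    return ((x * 0x01010101) & MASK32) >> 24        # add the four bytes


def tzcnt32(x: int) -> int:
    """Trailing-zero count: popcount of the bits below the lowest set bit."""
    x &= MASK32
    return popcnt32(~x & (x - 1) & MASK32)


print(naive_tzcnt_ite(2))
print(popcnt32(0xF0F0), tzcnt32(8), tzcnt32(0))
```

The per-bit version nests one Ite per bit of the operand, while the branchless version is a fixed, short sequence of shifts, masks, and adds regardless of the input.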

This reimplements the lifting of bsr, bsf, lzcnt, tzcnt, and popcnt instructions based on the branchless algorithms described in Hacker's Delight, chapters 5-3 and 5-4: