I'm not sure whether clocking the tx/rx data improves anything, but it feels better to have registers connected directly to the GTX instead of lots of async logic.
The change in the almost_full check is supposed to make the tools check only whether the two MSBs are set, instead of comparing the whole level vector. It looks like at least Vivado gets it right, but I think an even better fix would be to do this check explicitly (perhaps subclass Buffer and do it there?).
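To illustrate why the two forms of the almost_full check are interchangeable, here is a small Python sketch (not the actual HDL). It assumes the threshold is three quarters of a power-of-two depth, which is the case where "both MSBs set" and "level >= threshold" coincide. The width `WIDTH` and both function names are made up for the example.

```python
# Behavioral sketch only; the real check lives in the HDL.
# Assumption: level is an unsigned vector of `width` bits and the
# almost_full threshold is 0b11 followed by zeros (3/4 of full scale).

def almost_full_compare(level: int, width: int) -> bool:
    """Full-width magnitude comparison against the threshold."""
    threshold = 0b11 << (width - 2)
    return level >= threshold


def almost_full_msb(level: int, width: int) -> bool:
    """Look only at the two most significant bits of the level."""
    return (level >> (width - 2)) == 0b11


WIDTH = 10  # hypothetical level width, not taken from the design

# Exhaustively check that both forms agree for every possible level.
mismatches = [
    level
    for level in range(1 << WIDTH)
    if almost_full_compare(level, WIDTH) != almost_full_msb(level, WIDTH)
]
print("mismatches:", len(mismatches))
```

For any other threshold the two checks are not equivalent, which is one reason an explicit MSB check would be more robust than relying on the tools to simplify the comparison.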
We have seen with ChipScope that tx_startup_fsm jumped directly from RELEASE_MMCM to WAIT_ALIGN, probably because mmcm_locked wasn't properly synchronized. We haven't confirmed that the change to cplllock does anything, but it's a bit scary that it goes unsynchronized into an FSM.
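For reference, the kind of synchronization meant here is the usual two-flop synchronizer in front of the FSM input. The Python model below is only a behavioral sketch of that idea: it cannot model metastability, and the class name `TwoFlopSync` is invented for the example and is not something from the design.

```python
# Behavioral model of a two-flop synchronizer (illustrative only).
# The FSM would sample `ff2`, never the raw asynchronous input.

class TwoFlopSync:
    def __init__(self, reset_value: int = 0):
        self.ff1 = reset_value  # first stage, may go metastable in hardware
        self.ff2 = reset_value  # second stage, the value the FSM sees

    def tick(self, async_in: int) -> int:
        """Advance one clock edge and return the synchronized output."""
        # Both flops update on the same edge, so ff2 takes the OLD ff1.
        self.ff2 = self.ff1
        self.ff1 = async_in
        return self.ff2


sync = TwoFlopSync()
# A lock-like signal that rises, stays high for three cycles, then drops.
outputs = [sync.tick(v) for v in [1, 1, 1, 0, 0, 0]]
print(outputs)
```

The output follows the input one clock edge later than a single register would, which is the price paid for giving the first stage a full cycle to settle before the FSM acts on the value.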
Some notes