Various bugfixes that end up changing the results somewhat. The biggest change to the conclusion is that the MicroPrice falls slightly outside the bid-ask spread for the smallest-spread states.
Please see each commit, but in summary:
Fixed incorrect imbalance x-points used for plotting.
Imbalance is now discretized with buckets symmetric around 0.5. This is essential because of how buckets are generated for the symmetrized data.
Fixed the Spread series being set as an attribute of the df instead of as a column as intended, which incorrectly caused the 5k data points with the largest spread to be filtered out.
The way estimation() uses slicing and pivot_table could silently misalign states and indices in the np.ndarray.
WMID in the plot was only valid when the spread is 1 tick. The y-axis is now normalized by the spread so that WMID is valid for any spread.
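
For reviewers unfamiliar with the attribute-vs-column pitfall behind the Spread fix, here is a minimal sketch of the general pandas behaviour. The toy frame and the `bid`/`ask` column names are assumptions for illustration only and are not taken from the repository code.

```python
import warnings

import pandas as pd

# Hypothetical toy quotes; column names are illustrative assumptions.
df = pd.DataFrame({"bid": [100.0, 100.0, 99.0], "ask": [101.0, 102.0, 104.0]})

with warnings.catch_warnings():
    # pandas emits a UserWarning for this pattern; silenced here for brevity.
    warnings.simplefilter("ignore")
    # Attribute assignment to a new name sets a plain Python attribute.
    # It does NOT create a column.
    df.Spread = df["ask"] - df["bid"]

attr_is_column = "Spread" in df.columns

# Item assignment creates the column as intended.
df["spread"] = df["ask"] - df["bid"]
item_is_column = "spread" in df.columns

print(attr_is_column, item_is_column)
```

Any later filter or groupby that looks up the column by name never sees the attribute-assigned series, so item assignment (`df["..."] = ...`) is the safer habit when creating new columns.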