CommonLit Readability Prize

Background

Summary

The final result has an RMSE of 0.4607133200425228, obtained by blending the predictions as 0.7 * ensemble + 0.3 * stacking.
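
The blend can be sketched as below. This is a minimal, dependency-free illustration: the helper names `rmse` and `blend` and the toy numbers are assumptions made for the example, not the competition code or data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def blend(ensemble_pred, stacking_pred, w_ensemble=0.7):
    """Element-wise weighted average: w * ensemble + (1 - w) * stacking."""
    return [w_ensemble * e + (1.0 - w_ensemble) * s
            for e, s in zip(ensemble_pred, stacking_pred)]

# Toy values (hypothetical, for illustration only).
y_true = [0.0, -1.0, -2.0]
ensemble_pred = [0.2, -1.1, -1.8]
stacking_pred = [-0.1, -0.8, -2.3]

final_pred = blend(ensemble_pred, stacking_pred)
print(rmse(y_true, final_pred))
```

The weight is left as a parameter so that other splits can be compared against held-out data.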

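import numpy as np
import pandas as pd
num_bins = int(np.floor(1 + np.log2(len(data))))  # Sturges' rule for the bin count
target_bins = pd.cut(data["target"], bins=num_bins, labels=False)  # integer bin index per row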
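
The two lines above choose a bin count with Sturges' rule and then place each target value in an equal-width bin. A dependency-free sketch of the same idea follows. `sturges_bins` and `equal_width_bin` are hypothetical helpers, and the binning only approximates `pd.cut`, which additionally pads the range slightly so that the minimum falls inside the first bin.

```python
import math

def sturges_bins(n):
    """Sturges' rule: floor(1 + log2(n)) bins for n samples."""
    return int(math.floor(1 + math.log2(n)))

def equal_width_bin(values, num_bins):
    """Assign each value an integer bin index in [0, num_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    if width == 0:
        return [0 for _ in values]  # all values identical
    out = []
    for v in values:
        idx = math.ceil((v - lo) / width) - 1       # right-closed intervals
        out.append(min(max(idx, 0), num_bins - 1))  # clamp the two endpoints
    return out

targets = [-3.0, -2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 1.7]  # toy values, not real data
num_bins = sturges_bins(len(targets))
print(num_bins, equal_width_bin(targets, num_bins))
```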

| Model | RMSE |
| --- | --- |
| roberta-base + attention head + layer norm | 0.473694 |
| roberta-base + attention head | 0.470910 |
| roberta-base-squad2 + attention head | 0.477740 |
| roberta-large + attention head | 0.473006 |
| roberta-large-squad2 + attention head | 0.471116 |
| roberta-large + mean pool head | 0.474779 |
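
The head names in the table describe how per-token hidden states are pooled into a single vector before the regression layer. As a rough, framework-free sketch, mean pooling and attention pooling differ only in the weight given to each token. The helper names are hypothetical, and the fixed `scores` argument stands in for a learned scoring layer, which is an assumption of this example rather than the notebooks' actual code.

```python
import math

def mean_pool(hidden, mask):
    """Average token vectors where mask == 1 (padding is ignored)."""
    n = sum(mask)
    dim = len(hidden[0])
    return [sum(h[d] * m for h, m in zip(hidden, mask)) / n for d in range(dim)]

def attention_pool(hidden, mask, scores):
    """Softmax-weighted sum of token vectors; padded tokens get weight 0.

    `scores` stands in for the output of a learned scoring layer.
    """
    top = max(s for s, m in zip(scores, mask) if m)  # for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden[0])
    return [sum(h[d] * w for h, w in zip(hidden, weights)) for d in range(dim)]

hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # 3 tokens, hidden size 2 (toy)
mask = [1, 1, 0]                               # last token is padding
print(mean_pool(hidden, mask))
print(attention_pool(hidden, mask, scores=[0.0, 0.0, 5.0]))
```

With equal scores on the real tokens, attention pooling reduces to mean pooling; a trained scoring layer lets the model weight some tokens more than others.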

Averaging the predictions of all of the above gives an RMSE of 0.46214926662874833

Averaging the predictions of all of the above gives an RMSE of 0.46127848800757043

Text features were created based on this notebook.

The above features were selected using stepwise selection. I removed some features to reduce overfitting.

Some custom heads
- LSTM head
- GRU head
- 1DCNN head
- roberta-base + mean pool head
Concatenating the last 2 hidden state layers
SWA (stochastic weight averaging)
Weight initialization of custom heads and regression layer
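
For reference, SWA keeps a running average of model weights collected at several points late in training and predicts with the averaged weights. A minimal sketch of the averaging step, with weights held as plain dicts of floats, is shown below. `SWAAverager` is a hypothetical helper written for this example; PyTorch provides its own utilities in `torch.optim.swa_utils`.

```python
class SWAAverager:
    """Running equal-weight average of parameter snapshots."""

    def __init__(self):
        self.avg = None
        self.count = 0

    def update(self, params):
        """Fold one snapshot (name -> float) into the running average."""
        if self.avg is None:
            self.avg = dict(params)
        else:
            for name, value in params.items():
                # Incremental mean: avg += (x - avg) / (n + 1)
                self.avg[name] += (value - self.avg[name]) / (self.count + 1)
        self.count += 1

swa = SWAAverager()
# Toy snapshots (hypothetical), e.g. one per epoch late in training.
for snapshot in ({"w": 1.0, "b": 0.0}, {"w": 2.0, "b": 0.5}, {"w": 3.0, "b": 1.0}):
    swa.update(snapshot)
print(swa.avg)
```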