statsu1990 closed 4 years ago
Model_v1_5_0 (linear head): cv 0.54651 (positive and negative only)
3 hidden layers, averaged, not learnable:
- Model_v1_8_0 (multi linear head, 768-128-2): cv 0.54918 (positive and negative only)
- Model_v1_8_1 (conv head, k=3, n_conv=1, 768-768): cv 0.55192 (positive and negative only), lb 0.709
12 hidden layers, averaged, not learnable:
- Model_v1_8_2 (multi linear head, 768-128-2): cv 0.55677 (positive and negative only), lb 0.709
- Model_v1_8_3 (conv head, k=3, n_conv=1, 768-768): cv 0.55003 (positive and negative only)
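For reference, a rough sketch of what "N hidden layers, average, not learnable" with a 768-128-2 head could look like in PyTorch. This is not the actual code of these models. Assumptions: the averaged layers are the last N, the activation is ReLU, dropout sits in front of the head, and the 2 outputs are per-token start/end logits. `average_last_layers` and `MultiLinearHead` are made-up names, and random tensors stand in for the encoder output.

```python
import torch
import torch.nn as nn


def average_last_layers(hidden_states, n_layers):
    """Unweighted (not learnable) mean of the last n_layers hidden states.

    hidden_states: sequence of tensors, each (batch, seq_len, hidden).
    Taking the *last* layers is an assumption.
    """
    stacked = torch.stack(list(hidden_states[-n_layers:]), dim=0)
    return stacked.mean(dim=0)


class MultiLinearHead(nn.Module):
    """768 -> 128 -> 2 head. ReLU and dropout placement are assumptions."""

    def __init__(self, hidden=768, mid=128, n_out=2, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(hidden, mid),
            nn.ReLU(),
            nn.Linear(mid, n_out),
        )

    def forward(self, x):
        logits = self.net(x)  # (batch, seq_len, 2)
        # Assumed meaning of the 2 outputs: start and end logits per token.
        start_logits, end_logits = logits.unbind(dim=-1)
        return start_logits, end_logits


# Stand-in for encoder output: embeddings + 12 layers = 13 tensors.
hidden_states = tuple(torch.randn(4, 16, 768) for _ in range(13))
pooled = average_last_layers(hidden_states, n_layers=3)
head = MultiLinearHead()
start_logits, end_logits = head(pooled)
```

Switching between the 3-layer and 12-layer variants would only change `n_layers` in this sketch.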
Other conditions:
- dropout=0.1
- consideration of text_areas implemented
- remove_excessive_padding implemented
- trained only on positive and negative
- label smoothing 0.05
- lr 1e-5
- different learning rate (x30)
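A minimal sketch of two of the conditions above, label smoothing 0.05 and the x30 learning rate, assuming PyTorch. Which parameters get the x30 rate is not stated here, so applying it to the head is an assumption. AdamW, the smoothing variant, and the name `smoothed_cross_entropy` are also assumptions, and the `nn.Linear` modules are only stand-ins for the real encoder and head.

```python
import torch
import torch.nn as nn

# Stand-ins for the pretrained encoder and the task head (hypothetical).
encoder = nn.Linear(768, 768)
head = nn.Linear(768, 2)

base_lr = 1e-5
head_lr_scale = 30  # "different learning rate (x30)", assumed to apply to the head

# One parameter group per learning rate.
optimizer = torch.optim.AdamW([
    {"params": encoder.parameters(), "lr": base_lr},
    {"params": head.parameters(), "lr": base_lr * head_lr_scale},
])


def smoothed_cross_entropy(logits, target, smoothing=0.05):
    """Cross entropy against a smoothed one-hot target.

    logits: (batch, n_classes), target: (batch,) class indices.
    The true class gets 1 - smoothing; the rest share smoothing equally.
    """
    n_classes = logits.size(-1)
    log_probs = torch.log_softmax(logits, dim=-1)
    true_dist = torch.full_like(log_probs, smoothing / (n_classes - 1))
    true_dist.scatter_(1, target.unsqueeze(1), 1.0 - smoothing)
    return -(true_dist * log_probs).sum(dim=-1).mean()


# Example: start-position logits over 16 tokens for a batch of 4.
logits = torch.randn(4, 16)
target = torch.tensor([0, 3, 5, 15])
loss = smoothed_cross_entropy(logits, target)
```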
Purpose: to take the influence of the surrounding tokens into account.
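A sketch of how a conv head with k=3, n_conv=1, 768-768 lets each token see its neighbours, assuming PyTorch. With kernel size 3 and padding 1, each output position depends on the token itself and one token on each side. The ReLU and the final 768-to-2 linear layer are assumptions, and `ConvHead` is a made-up name rather than the class used in these models.

```python
import torch
import torch.nn as nn


class ConvHead(nn.Module):
    """Single Conv1d (768 -> 768, k=3) followed by an assumed 768 -> 2 projection."""

    def __init__(self, hidden=768, kernel_size=3, n_out=2):
        super().__init__()
        # padding = kernel_size // 2 keeps the sequence length unchanged.
        self.conv = nn.Conv1d(hidden, hidden, kernel_size, padding=kernel_size // 2)
        self.act = nn.ReLU()  # activation is an assumption
        self.out = nn.Linear(hidden, n_out)  # output projection is an assumption

    def forward(self, x):
        # x: (batch, seq_len, hidden). Conv1d expects (batch, channels, seq_len).
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return self.out(self.act(h))  # (batch, seq_len, n_out)


head = ConvHead()
x = torch.randn(2, 16, 768)
logits = head(x)
```

In this sketch, changing one token only changes the logits at that position and its two direct neighbours, which is the "surrounding" effect a plain linear head does not have.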