castorini / DeeBERT

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Apache License 2.0

Replication results #4

Closed wongalvis14 closed 3 years ago

wongalvis14 commented 4 years ago

Ubuntu 18.10, Python 3.7.7, CUDA 10.1

train.sh (time 0.09) acc = 0.8676470588235294 f1 = 0.906896551724138

train_highway.sh (time 0.13) acc = 0.8676470588235294 f1 = 0.9072164948453608
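For context on the eval_entropy.sh sweep below: DeeBERT attaches a classifier ("off-ramp") to each transformer layer and exits at the first layer whose prediction entropy falls below the threshold in the left column. A minimal sketch of that criterion (function names here are illustrative, not the repo's API):

```python
import numpy as np

def prediction_entropy(logits):
    """Shannon entropy of the softmax distribution over logits."""
    exps = np.exp(logits - np.max(logits))
    probs = exps / exps.sum()
    return float(-np.sum(probs * np.log(probs + 1e-12)))

def pick_exit_layer(per_layer_logits, threshold):
    """Return the 1-based index of the first layer whose off-ramp
    entropy is below the threshold, falling back to the last layer."""
    for layer, logits in enumerate(per_layer_logits, start=1):
        if prediction_entropy(logits) < threshold:
            return layer
    return len(per_layer_logits)

# An uncertain off-ramp defers; a confident one exits.
print(pick_exit_layer([np.array([0.1, 0.2]), np.array([5.0, -5.0])], 0.2))  # -> 2
```

This also explains why threshold 0.0 never exits early in the runs below: softmax entropy is strictly positive, so every sample runs all 12 layers.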

eval_entropy.sh 0.0 Result: 0.9072164948453608 Eval time: 14.078018426895142 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 408} Expected saving 1.0

0.001 Result: 0.9072164948453608 Eval time: 13.798500061035156 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 408} Expected saving 1.0

0.005 Result: 0.9072164948453608 Eval time: 14.127072811126709 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 408} Expected saving 1.0

0.01 Result: 0.9072164948453608 Eval time: 13.872061491012573 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 408} Expected saving 1.0

0.05 Result: 0.9072164948453608 Eval time: 13.464979648590088 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 49, 11: 62, 12: 297} Expected saving 0.9673202614379085

0.1 Result: 0.9072164948453608 Eval time: 12.848443746566772 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 4, 7: 0, 8: 1, 9: 10, 10: 110, 11: 58, 12: 225} Expected saving 0.9313725490196079

0.15 Result: 0.9090909090909091 Eval time: 12.455684423446655 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 21, 7: 13, 8: 5, 9: 19, 10: 112, 11: 54, 12: 184} Expected saving 0.8884803921568627

0.2 Result: 0.9094017094017094 Eval time: 11.937445402145386 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 58, 7: 19, 8: 3, 9: 23, 10: 107, 11: 44, 12: 154} Expected saving 0.8402777777777778

0.3 Result: 0.8989898989898989 Eval time: 10.921518564224243 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 0, 5: 9, 6: 120, 7: 21, 8: 1, 9: 22, 10: 88, 11: 46, 12: 101} Expected saving 0.7589869281045751

0.4 Result: 0.8870151770657673 Eval time: 9.648621082305908 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 1, 5: 69, 6: 117, 7: 18, 8: 3, 9: 31, 10: 72, 11: 37, 12: 60} Expected saving 0.6795343137254902

0.5 Result: 0.8795986622073578 Eval time: 8.683350801467896 Exit layer counter {1: 0, 2: 0, 3: 0, 4: 61, 5: 105, 6: 86, 7: 19, 8: 5, 9: 27, 10: 43, 11: 31, 12: 31} Expected saving 0.5808823529411765

0.6 Result: 0.8516129032258065 Eval time: 6.725062131881714 Exit layer counter {1: 1, 2: 56, 3: 20, 4: 158, 5: 35, 6: 66, 7: 12, 8: 4, 9: 13, 10: 16, 11: 13, 12: 14} Expected saving 0.42483660130718953

0.7 Result: 0.8122270742358079 Eval time: 2.7605879306793213 Exit layer counter {1: 408, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0} Expected saving 0.08333333333333333
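The "Expected saving" column is consistent with the fraction of a full 12-layer forward pass actually executed, computed from the exit layer counter (my reading of the numbers above, not necessarily the repo's exact definition):

```python
def expected_saving(exit_counter, num_layers=12):
    """Fraction of a full forward pass executed, given a mapping of
    exit layer -> number of samples exiting at that layer."""
    total = sum(exit_counter.values())
    layers_run = sum(layer * n for layer, n in exit_counter.items())
    return layers_run / (num_layers * total)

# Reproduces the 0.05-threshold row above:
print(expected_saving({10: 49, 11: 62, 12: 297}))  # 0.9673202614379085

# And the 0.7-threshold row, where all 408 samples exit at layer 1:
print(expected_saving({1: 408}))  # 0.08333333333333333
```

Under this reading, "Expected saving 1.0" means no saving at all (every sample ran all 12 layers), which matches the threshold-0.0 row.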

wongalvis14 commented 4 years ago

A quick plot of result against expected saving

[plot: result vs. expected saving]

wongalvis14 commented 4 years ago
import numpy as np
import matplotlib.pyplot as plt

# Indices into the arrays saved by eval_entropy.sh
TIME = 1
EXP_SAV = 2
RESULT = 3

entropies = ['0.0', '0.001', '0.005', '0.01', '0.05', '0.1', '0.15', '0.2', '0.3', '0.4', '0.5', '0.6', '0.7']
times = []
exp_savs = []
results = []

for ent in entropies:
    filename = 'entropy_' + ent + '.npy'
    data = np.load(filename, allow_pickle=True)
    times.append(data[TIME])
    # the saved value is the fraction of layers run; plot the fraction saved
    exp_savs.append(1 - data[EXP_SAV])
    results.append(data[RESULT])

plt.plot(exp_savs, results)
plt.xlabel('Expected saving')
plt.ylabel('Result (F1)')
plt.show()
wongalvis14 commented 4 years ago

I got the same results after nuking the saved models and plots and re-running the same scripts.

I'm wondering what causes the increase in accuracy between 0.1 and 0.2 expected saving: reduced overfitting in the deeper layers?

ji-xin commented 4 years ago

The plot looks good. And yes, the increase in accuracy between 0.1 and 0.2 is likely due to reduced overfitting.

wongalvis14 commented 4 years ago

Replicated twice on hydra (Python 3.7.7, CUDA 10.0)

train.sh f1 = 0.9112627986348123

train_highway.sh f1 = 0.9020979020979022

eval_entropy.sh [plot: result vs. expected saving]