duolingo / halflife-regression

MAE of fixed Pbar #1

Closed reynoldscem closed 7 years ago

reynoldscem commented 7 years ago

I have tried to reproduce your figure of 0.175 MAE for the fixed p-bar of 0.859, but I cannot get this value on either the test set or the training set. Could you describe how you calculated it?

For context, I am trying to reproduce some of your results and to try various different models on the same data. I would like to reproduce your results as a baseline first, but this one has eluded me.

Thanks.

burrsettles commented 7 years ago

This was calculated using research-grade pre-release code, which perhaps got cut out of the public release. But this is easy enough to do with a Python script:

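import fileinput

p = 0.859  # fixed prediction (p-bar)
s = 0.0    # running sum of absolute errors
n = 0      # number of values read
for line in fileinput.input():
    try:
        s += abs(p - float(line))
        n += 1
    except ValueError:
        # skip lines that are not numbers, such as the CSV header
        continue
print(s / n)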

You can run it using the following command:

gzcat learning_traces.13m.csv.gz | cut -d ',' -f 1 | python fixedmae.py

The result is slightly different (0.199606186716) because this is making predictions over the whole data set and the published result is just for the test set (last 10%). But you get the idea. 😄
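To restrict the calculation to the test portion, something like the following might work. It is only a minimal sketch: it assumes the test set is simply the trailing 10% of rows in file order and that the recall value is the first CSV column, as in the command above. The names `fixed_mae` and `read_recalls` are made up for illustration and are not part of the released code, and the exact split boundary used for the published number may differ.

```python
import csv
import gzip


def fixed_mae(recalls, p_bar=0.859, test_fraction=0.1):
    """MAE of the constant prediction p_bar over the trailing slice of recalls.

    Assumption: the test set is the last `test_fraction` of the rows.
    """
    if not recalls:
        raise ValueError("no recall values given")
    # Keep at least one row so the average is always defined.
    n_test = max(1, int(len(recalls) * test_fraction))
    test = recalls[-n_test:]
    return sum(abs(p_bar - r) for r in test) / len(test)


def read_recalls(path):
    """Read the first column of a gzipped CSV as floats, skipping bad rows."""
    recalls = []
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle):
            try:
                recalls.append(float(row[0]))
            except (ValueError, IndexError):
                # Header line or blank row: not a number, so skip it.
                continue
    return recalls
```

Calling `fixed_mae(read_recalls("learning_traces.13m.csv.gz"))` would then give the error over the last 10% only. This loads the whole column into memory, which keeps the sketch short but is not the most economical approach for a file of this size.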

Good luck!