Knewton / edm2016

Code for replicating results in our EDM2016 paper
Apache License 2.0

problem_id key error #9

Open ghost opened 6 years ago

ghost commented 6 years ago

Hi!

I'm trying to reproduce the results of the EDM2016 paper to see whether HIRT is viable (fast enough) for real-time computation, but I'm running into an issue with the code, which I haven't modified. When I try to run HIRT on the Bridge to Algebra dataset, I get the following error:

...
  File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/cli.py", line 163, in irt
    data, _, _, _, _ = load_data(data_file, source, data_opts)
  File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/wrapper.py", line 77, in load_data
    min_interactions_per_user=data_opts.min_interactions_per_user)
  File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/kddcup.py", line 99, in load_data
    data = data.sort(sort_keys)
...
KeyError: u'problem_id'

I don't know why this happens. Any help would be appreciated, thanks!

khwilson commented 6 years ago

It's been a while since I looked at this code, but IIRC a couple of the data sets had slightly strange column names. Do the other kddcup data sets work out of the box? If so, you can try just changing the relevant column heading.
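
One way to act on this suggestion is to compare the data file's header row with the column name reported in the KeyError. The sketch below is a minimal, standalone illustration and not code from this repository. It assumes the data file is tab-delimited with a header row, and `Problem Name` is a hypothetical heading used only for the example. `read_header` and `rename_header` are illustrative helpers, and whether a rename alone satisfies the loader in `kddcup.py` has not been verified.

```python
import csv
import io


def read_header(handle, delimiter="\t"):
    """Return the first row of a delimited file as a list of column names."""
    return next(csv.reader(handle, delimiter=delimiter))


def rename_header(header, mapping):
    """Return a copy of the header with names replaced according to mapping."""
    return [mapping.get(name, name) for name in header]


# Hypothetical sample standing in for the real data file; the actual
# headings in the Bridge to Algebra file may differ.
sample = io.StringIO(
    "Anon Student Id\tProblem Name\tCorrect First Attempt\n"
    "s1\tp2\t1\n"
)

header = read_header(sample)
print(header)

# 'problem_id' is the key named in the KeyError from the traceback.
required = ["problem_id"]
missing = [name for name in required if name not in header]
print("missing:", missing)

# Assumed mapping from the file's heading to the name the loader looks up.
fixed = rename_header(header, {"Problem Name": "problem_id"})
still_missing = [name for name in required if name not in fixed]
print("missing after rename:", still_missing)
```

For a real file, replace the `io.StringIO` sample with an opened file handle and adjust the mapping to whatever heading the header row actually contains.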