Open DongzeHE opened 3 years ago
Can you please give the command you're trying to run and the error that you get?
I am using luigi and here is my command:
luigi --local-scheduler --module qanta.pipeline.buzzer
The problem is in qanta/pipeline/buzzer.py: it imports test.py and constants.py from the qanta/buzzer/ folder, but these two files are not in that folder.
To train the buzzer, we need to run:
python qanta/buzzer/train.py
@ihsgnef: can you take a look?
@NPSDC : have you tried commenting out the import; perhaps it's not really needed?
@ezubaric I tried that, but the pipeline buzzer has many such dependencies. I have also noticed that the pipeline does not seem to have been updated: it calls functions that no longer exist in the more recent files under the qanta/buzzer folder.
Calling train.py lets me get the buzzer model, though I made changes to the code to make sure that the guesser is trained using only the buzztrain data.
On a different note, I see that the protobowl files are being loaded for eval. Is there a way to access them?
@NPSDC @DongzeHE Just wanted to let you know I'm happy to help with running & modifying any buzzer related code which I'm primarily responsible for. Feel free to @ me in future issues & pull requests!
The buzzer-related luigi pipelines are indeed obsolete, but hopefully the buzzer part can be experimented with independently of the guesser. On my end I usually just run python -m qanta.buzzer.train or python -m qanta.buzzer.eval, etc.
The protobowl files are available at
https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/karl/protobowl/protobowl-042818.log.h5
https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/karl/protobowl/protobowl-042818.log.questions.pkl
These files should be stored in [your qb directory]/data/external/datasets/protobowl/ (you can create this directory manually if it doesn't already exist). These are processed files and should save you some load time.
The buzzer code is a bit outdated since Chainer is no longer being actively developed, but it should be very straightforward to convert it to PyTorch since the two frameworks have virtually identical APIs. I'd recommend doing that and keeping the feature extraction code (vector_converter in qanta/buzzer/util.py).
@ihsgnef Could you guide me on how to download the protobowl files? Using wget/curl on the above URLs gives me an error. Thanks for the help.
@NPSDC For future issues, please include the complete output (including the command you ran and the error you got). I think in this case the error was caused by permission issues. Please try downloading from the public S3 folder:
wget https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/public/protobowl/protobowl-042818.log
wget https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/public/protobowl/protobowl-042818.log.h5
wget https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/public/protobowl/protobowl-042818.log.questions.pkl
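If wget is not convenient, the same download can be scripted. The following is only a minimal sketch using the Python standard library: it assumes the public URLs above remain reachable and that the two processed files belong in data/external/datasets/protobowl/ under your qb directory, as described earlier in the thread. The helper names (protobowl_dir, plan_downloads, download_protobowl) are made up for illustration and are not part of the qanta codebase.

```python
import urllib.request
from pathlib import Path

# Public S3 folder quoted in this thread.
BASE_URL = "https://pinafore-us-west-2.s3-us-west-2.amazonaws.com/public/protobowl/"

# The two processed files mentioned in the thread. The raw
# "protobowl-042818.log" could be appended here if it is needed too.
FILENAMES = [
    "protobowl-042818.log.h5",
    "protobowl-042818.log.questions.pkl",
]


def protobowl_dir(qb_root):
    """Return the directory where the thread says the files should live."""
    return Path(qb_root) / "data" / "external" / "datasets" / "protobowl"


def plan_downloads(qb_root):
    """Return (url, destination path) pairs without touching the network."""
    target = protobowl_dir(qb_root)
    return [(BASE_URL + name, target / name) for name in FILENAMES]


def download_protobowl(qb_root):
    """Create the target directory and fetch any file not already present."""
    protobowl_dir(qb_root).mkdir(parents=True, exist_ok=True)
    for url, dest in plan_downloads(qb_root):
        if dest.exists():
            continue  # skip files downloaded by an earlier run
        urllib.request.urlretrieve(url, str(dest))
```

Calling download_protobowl(".") from the root of the qb checkout should place both files where the buzzer code is said to expect them; plan_downloads(".") can be used first to inspect the URLs and destinations without downloading anything.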
I cannot run buzzer because of the missing file