allenai / deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)
Apache License 2.0
404 stars 132 forks source link

Span prediction -- Pull request #2 after rebase @matt-gardner #92

Closed JohannesMaxWel closed 8 years ago

matt-gardner commented 8 years ago

It looks like the reason semaphore failed was because of pylint errors. If you want to run this yourself, so you can see errors before pushing a commit to the PR, you can do that with this command: pylint -d locally-disabled -f colorized src/main/python/deep_qa. If you need help with any of it, or if the pylint errors are too cryptic, let me know, and I can help.

JohannesMaxWel commented 8 years ago

OK, I'll look into what failed and see if I can fix it myself.

JohannesMaxWel commented 8 years ago

OK, interestingly pylint complains about several import errors. Might this be due to re-naming issues?

************* Module deep_qa.data.text_instance
E:  3, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.data.indexed_instance
E:  2, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.data.tokenizer
E:  3, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.lstm_solver
E:  2, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.differentiable_search
E:  8, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.layers.encoders
E:  6, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.multiple_choice_memory_network
E:  2, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.memory_network
E:  3, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.tree_lstm_solver
E:  2, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.question_answer_memory_network
E:  2, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.solvers.pretraining.snli_pretrainer
E:  1, 0: Unable to import 'overrides' (import-error)
************* Module deep_qa.span_prediction.loaders

But pylint also complains that it cannot use numpy members, e.g.

E: 16, 0: Module 'numpy.random' has no 'seed' member (no-member)
E: 98,33: Module 'numpy' has no 'log' member (no-member)
E:199,32: Module 'numpy.random' has no 'randint' member (no-member)

Digging a little, I found that this is a standard problem with pylint see here. There's basically two options proposed: i) updating pylint ii) telling pylint to ignore numpy.

matt-gardner commented 8 years ago

For the overrides import errors, you should run pip install -r requirements.txt, to make sure you have all of the required libraries installed (you might need to do this with pip3 instead of pip, depending on how you installed python3).

For the numpy problems, the way I've been handling that is like this: https://github.com/allenai/deep_qa/blob/7cd198cd3097fe543b5a1f6b29692fec3fd88923/src/main/python/run_solver.py#L10. If you have a whole lot of these, we might need to do something different, like # pylint: ignored-modules=numpy, though I'm not certain that will work. You could also try adding the extension-pkg-whitelist=numpy setting to our pylintrc file, as noted here: http://stackoverflow.com/a/31465070/1467533.

JohannesMaxWel commented 8 years ago

I'm not sure how to proceed regarding the use of relative imports.

Running span_model.py by itself works fine when executed from its containing directory. Pylint however runs span_model.py from some directory higher up the hierarchy and complains E: 15, 0: Unable to import 'loaders' (import-error).

Switching to relative imports fixes this (from .loaders import load_questions) and pylint is happy. But how do I run my script now? Running it with relative imports from its containing directory yields SystemError: Parent module '' not loaded, cannot perform relative import

Now I want to avoid manually setting path variables. I'm not sure how to proceed.

I saw you mentioned something related here, did you solve a similar problem?

matt-gardner commented 8 years ago

Going forward, it's best if you have most of the code in modules and classes that take parameters for specific filenames and whatever else they need. Those can use relative imports. Then you have a simple script that calls a method from somewhere in that library, giving it specific filenames and other parameters. This is how most of the deep_qa code is set up, with a few scripts outside of the main directory that have direct imports instead of relative imports, and it's the recommended way of structuring python code (http://stackoverflow.com/a/14132912/1467533).

If you don't want to restructure the code to make that work now, you could take the route I took in the language modeling code and just disable the error on that line: https://github.com/allenai/deep_qa/blob/7cd198cd3097fe543b5a1f6b29692fec3fd88923/src/main/python/deep_qa/sentence_corruption/language_model.py#L9-L11. If you do this, be sure you add a TODO that it should be fixed some time.

JohannesMaxWel commented 8 years ago

OK, did it and pylint doesn't produce errors any more (locally).