nd-ball / py-irt

Bayesian IRT models in Python
MIT License

Fitting an IRT model on the SQuAD dataset #10

Closed pk1130 closed 3 years ago

pk1130 commented 3 years ago

Thanks for your responses to issue #9 @EntilZha and @jplalor! We saw the description underneath the from_jsonlines() function, but are still unsure of how to convert the SQuAD dataset into the format required for the function (since the function description asks for the dataset to be in a specific format and I'm not sure if extracting just QA pairs and their corresponding labels from the SQuAD dataset would be enough). Could one of you please do a small test run with maybe 5 rows from the SQuAD dataset as an example?

EntilZha commented 3 years ago

Thanks for reaching out! It would help to know what you are aiming to accomplish, but here is one way to think about the required input. Imagine a matrix where rows are subjects/systems/submissions, columns are items/examples/questions, and each entry records whether that particular subject got that item correct.

So using only the SQuAD questions and answers doesn't quite make sense; you also need the scored predictions of at least several systems. The data linked below is the JSON Lines version of the matrix I described, where each row is one SQuAD system's scored predictions on the dev set (we can't release test data).

For SQuAD development data, you can find the py-irt compatible file here https://obj.umiacs.umd.edu/acl2021-leaderboard/data/squad-pyirt.jsonlines. You should be able to train the model with something like: py-irt train 4pl ~/Downloads/squad-pyirt.jsonlines /tmp/test-4pl/
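To make the matrix-to-JSON-Lines idea concrete, here is a minimal sketch in Python. The toy systems and questions are hypothetical, and the field names `subject_id` and `responses` are an assumption about what from_jsonlines() expects, so check them against the function's docstring and the linked file before relying on them.

```python
import json

# Hypothetical toy data: rows are systems (subjects), columns are questions (items),
# and each entry is 1 if the system answered the question correctly, else 0.
item_ids = ["q1", "q2", "q3"]
matrix = {
    "system_a": [1, 0, 1],
    "system_b": [1, 1, 0],
}


def matrix_to_jsonlines(matrix, item_ids):
    """Serialize a subject-by-item correctness matrix, one JSON object per subject."""
    lines = []
    for subject_id, row in matrix.items():
        # Field names below are assumed, not confirmed by this thread.
        responses = dict(zip(item_ids, row))
        lines.append(json.dumps({"subject_id": subject_id, "responses": responses}))
    return "\n".join(lines) + "\n"


jsonl = matrix_to_jsonlines(matrix, item_ids)
print(jsonl)
```

Writing `jsonl` to a file would give one line per system, which is the shape of input the train command above consumes.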

pk1130 commented 3 years ago

Hi @EntilZha! Thanks for your response! That makes a lot of sense. The overall objective that we are trying to achieve is improving QA systems. Specifically, we are trying to use IRT to identify errors in training questions, correct those errors and retrain the system to ideally see better results :)

After reading your response, this is what I gather: From the scored predictions of several systems on the SQuAD data, we will be able to do some exploratory data analysis to see on which questions systems consistently fail to answer/answer incorrectly and hopefully try to find errors in questions from there?
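A rough sketch of that kind of exploratory pass, assuming each line of the scored-predictions file is a JSON object with `subject_id` and `responses` fields (an assumption about the file layout, and the sample records are made up). Note that this is plain per-question accuracy, not an IRT fit; it only flags questions that every system misses.

```python
import json
from collections import defaultdict


def item_accuracy(jsonl_text):
    """Return {item_id: fraction of subjects that answered it correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for item_id, score in record["responses"].items():
            correct[item_id] += int(score)
            total[item_id] += 1
    return {item_id: correct[item_id] / total[item_id] for item_id in total}


# Hypothetical scored predictions for three systems on three questions.
sample = "\n".join([
    json.dumps({"subject_id": "s1", "responses": {"q1": 1, "q2": 0, "q3": 1}}),
    json.dumps({"subject_id": "s2", "responses": {"q1": 1, "q2": 0, "q3": 0}}),
    json.dumps({"subject_id": "s3", "responses": {"q1": 1, "q2": 0, "q3": 1}}),
])

accuracy = item_accuracy(sample)
# Questions that no system answers correctly are candidates for manual inspection.
suspects = sorted(item_id for item_id, acc in accuracy.items() if acc == 0.0)
print(suspects)
```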

Edit on 06/25: I tried to run the command multiple times, after installing the package both globally and inside a virtual environment, but I don't seem to be able to run py-irt as a command. I keep getting an error saying bash: py-irt: command not found. I was wondering whether there are any executables/binaries that I have to export to the environment and how I could do that, since I couldn't find any after digging around the package. @EntilZha @jplalor could you please elaborate on running the command after installing the package and its dependencies at your earliest convenience? Thanks a ton!

Xiaoqianhou commented 3 years ago

Actually, we all hit the same problem when running the py-irt train 4pl ~/Downloads/squad-pyirt.jsonlines /tmp/test-4pl/ command. The error is: 'py-irt: command not found'. Do you have any idea why this happens, and could you please tell us how to fix it? Thank you very much.

EntilZha commented 3 years ago

Are you installing with pip or from the GitHub master branch? The CLI isn’t on PyPI yet, so for now you have to install from master.

pk1130 commented 3 years ago

We tried installing from PyPI and running the command, but it didn't work (for the reason above). So we cloned the repo and, after much debugging, ran cli.py directly. The good news is that the model fits well on the jsonlines data that you linked to above, and we are able to see results. The bad news is that the file structure is slightly haphazard: some files need to be moved from sub-directories to the parent directory for cli.py to run and for the model to train on the SQuAD dev data. I'll open another issue about the file structure and link it here, then update the docs and submit a pull request by tonight. Thanks for the help @EntilZha!

EntilZha commented 3 years ago

FYI, you can also pip install from github branches directly like here https://adamj.eu/tech/2019/03/11/pip-install-from-a-git-repository/
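As a small sketch of what that looks like, the snippet below builds the pip command in Python. It assumes the repository lives at https://github.com/nd-ball/py-irt.git and that the branch is master; both are inferred from this thread, so adjust them if they differ. The helper name is made up for illustration, and nothing is installed until the command is actually run.

```python
import sys


def pip_install_from_git(repo_url, branch):
    """Build a `pip install` command that targets one branch of a git repository."""
    # Using sys.executable ensures pip installs into the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", f"git+{repo_url}@{branch}"]


# Assumed repository URL and branch name.
command = pip_install_from_git("https://github.com/nd-ball/py-irt.git", "master")
print(" ".join(command))
# To actually install, pass the list to subprocess.run(command, check=True).
```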

It's hard to give suggestions about the issues you ran into without knowing more about the errors. Generally, it's really important to list these things:

  1. What did you try to run (e.g., run py-irt)
  2. What did you expect to happen
  3. What actually happened
  4. Context information, what version of python, from what directory, etc.

It might not be necessary to change the file structure, but I can't really advise without more info.
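For the context item in the list above, a short script along these lines can gather the basics to paste into a report. The function name and the choice of fields are only a suggestion, not something the project requires.

```python
import os
import platform
import sys


def collect_context():
    """Gather basic environment details worth including in a bug report."""
    return {
        "python_version": platform.python_version(),
        "executable": sys.executable,          # shows whether a virtualenv is active
        "working_directory": os.getcwd(),      # the directory the command was run from
        "platform": platform.platform(),
    }


context = collect_context()
for key, value in context.items():
    print(f"{key}: {value}")
```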

pk1130 commented 3 years ago

Thanks for the response, @EntilZha! Apologies for not detailing the issue as outlined by you. I'll keep that in mind for future issues. To summarize this issue as per given outline:

  1. What I tried to run: Tried installing py-irt directly from PyPi using the command pip install py-irt. Installation happened successfully, but when trying to run the command you provided: py-irt train 4pl ~/Downloads/squad-pyirt.jsonlines /tmp/test-4pl/, it threw an error saying bash: py-irt: command not found.
  2. What I expected to happen: Expected the above command to run successfully, model training to start correctly, and epochs and training loss to show up without errors or other issues.
  3. What actually happened: bash: py-irt: command not found showed up with no other output. (Possibly due to what you said earlier with cli.py not being on PyPI yet)
  4. Context: Trying to use the py-irt package to better understand training data from the NQ dataset to try and fix training errors with the end goal of improving QA systems as part of a research internship project with Dr. Jordan Boyd-Graber. Python version: 3.6.9. Created a directory and virtual env using venv for this project, did a pip install inside this directory.

When this did not work, we moved cli.py out of the /py_irt/ directory and into the main directory inside the repo, and ran this command in the main directory: python cli.py train 4pl ~/path/to/squad-pyirt.jsonlines /where/you/want/to/store/output_4pl. This made it work and the model works well with the following output: image

Since it worked this way, for the benefit of future users, I'm updating the docs and the file structure in this manner on my fork of the repo as a temporary fix until cli.py is deployed to PyPI. Should I open a new issue about the bash: py-irt: command not found error so that it can be addressed in detail there?

Please let me know your thoughts/suggestions/queries regarding the same! Thanks!

Edit: I'm closing this issue now since we were able to successfully fit an IRT model on the scored predictions of SQuAD. Do let me know if I should open a new issue about the error that I talked about above.

EntilZha commented 3 years ago

Re docs: I think the problem you ran into is installing from PyPI versus the master branch. I think we should:

  1. Deploy a new version to PyPI. @jplalor has the credentials, so I can't do that myself. It should be relatively easy to configure poetry for this, then bump the version number and run poetry build followed by poetry publish.
  2. Update the docs to indicate which versions include the CLI; ideally a changelog would record this.
  3. Add a FAQ to the docs and add this question there

The reason moving cli.py works is that it is the source file containing the CLI. Installing with poetry install is what makes it available as the py-irt command, through the command specs in pyproject.toml.
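One way to check whether an installed package has registered the command is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`) and that the CLI is exposed as a `console_scripts` entry point, which is how poetry script specs are normally installed; the function name is hypothetical.

```python
import shutil
from importlib import metadata


def diagnose_cli(command):
    """Report whether `command` is on PATH and which entry points register it."""
    path = shutil.which(command)  # None if the shell would say "command not found"
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        scripts = eps.select(group="console_scripts")
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        scripts = eps.get("console_scripts", [])
    targets = [ep.value for ep in scripts if ep.name == command]
    return {"on_path": path, "entry_points": targets}


print(diagnose_cli("py-irt"))
```

If `entry_points` comes back empty, the installed version does not register the command at all, which would match the PyPI install described above; if it is non-empty but `on_path` is `None`, the scripts directory of the environment is probably not on PATH.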

If you could open an issue to serve as a feature request for doc additions, that would make it easier to track the docs we have yet to add.

jplalor commented 3 years ago

#11 takes care of 1 and 2 above; 3 is still to do.