ParikhKadam / bidaf-keras

Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2
GNU General Public License v3.0

How to download and preprocess SQUAD data? #1

Closed: RQMRC closed this issue 5 years ago

RQMRC commented 5 years ago

I ran your code, but I encountered the following error.

3

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/drive/My Drive/bidaf-keras-master/bidaf-keras-master/bidaf/__main__.py", line 17, in <module>
    main()
  File "/content/drive/My Drive/bidaf-keras-master/bidaf-keras-master/bidaf/__main__.py", line 11, in main
    train_generator, validation_generator = load_data_generators(batch_size=16, emdim=emdim, shuffle=True)
  File "/content/drive/My Drive/bidaf-keras-master/bidaf-keras-master/bidaf/scripts/data_generator.py", line 5, in load_data_generators
    train_generator = BatchGenerator('train', batch_size, emdim, shuffle)
  File "/content/drive/My Drive/bidaf-keras-master/bidaf-keras-master/bidaf/scripts/batch_generator.py", line 24, in __init__
    with open(self.span_file, 'r', encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/bidaf-keras-master/bidaf-keras-master/bidaf/scripts/../data/squad/train.span'

The error is caused by missing data. I downloaded the dataset and added it to the project, but I still get the same error. It seems you have split the data into three parts. How did you do that?

ParikhKadam commented 5 years ago

@RQMRC Sorry for the late reply; I was busy.

I see your problem. The script does not read the SQuAD v1.1 JSON file directly; it reads the preprocessed files that are generated from the SQuAD v1.1 JSON files. I recommend running data_download_and_preprocess() from bidaf-keras/bidaf/scripts/preprocess.py.

I will update __main__.py very soon so that users don't need to call this function themselves. I had this change in mind, but I forgot about it while I was busy solving other issues in the code.

Very sorry for the trouble. I will fix this soon. Until then, just call that function manually and it will generate all the files you need. Then you can simply go on and run this module.
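The manual call suggested above can be sketched as follows. This is a minimal sketch, not the repository's own code: the helper name run_preprocessing is hypothetical, the import path bidaf.scripts.preprocess is inferred from the file location mentioned in this thread, and it assumes the script is run from the repository root so that the bidaf package is importable and that the function accepts no required arguments.

```python
import importlib


def run_preprocessing(module_name="bidaf.scripts.preprocess",
                      func_name="data_download_and_preprocess"):
    """Import the preprocessing module and call its entry point.

    Both default names are assumptions taken from this thread; adjust them
    if the package layout differs in your checkout.
    """
    # Import lazily so that a wrong path fails here with a clear ImportError.
    module = importlib.import_module(module_name)
    preprocess = getattr(module, func_name)
    # Called without arguments, as described in the comment above.
    return preprocess()
```

Calling run_preprocessing() once before starting training should then generate the preprocessed files, provided the assumptions above hold for your copy of the code.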

ParikhKadam commented 5 years ago

@RQMRC I have added code to __main__.py that should solve your problem. Please check the latest code and report back if the problem persists.
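For readers on an older checkout, one way such a guard could look is sketched below. This is an illustration only and not the code that was actually added to __main__.py: the helper ensure_preprocessed and its parameters are hypothetical, and the only file name taken from this thread is train.span from the traceback.

```python
import os


def ensure_preprocessed(data_dir, required_files, preprocess_fn):
    """Run preprocess_fn only if a required file is missing from data_dir.

    Returns the list of file names that were missing before preprocessing.
    """
    missing = [name for name in required_files
               if not os.path.isfile(os.path.join(data_dir, name))]
    if missing:
        # Hypothetical hook: pass in the preprocessing function to call.
        preprocess_fn()
    return missing
```

A call such as ensure_preprocessed(data_dir, ["train.span"], preprocess_fn), placed before the data generators are created, would trigger preprocessing only when the file is absent; data_dir and preprocess_fn are placeholders you would supply yourself.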

RQMRC commented 5 years ago

Hi. Thank you for answering.


-- Wishing you pride and success.

ParikhKadam commented 5 years ago

Hey @RQMRC, I have verified that this issue is now solved. If you still face any problems with this, please download the latest version of the code from the master branch and it will work. Thank you.