mila-iqia / blocks-examples

Examples and scripts using Blocks
MIT License

Error when I re-implement example for machine translation #36

Closed xishazgh closed 9 years ago

xishazgh commented 9 years ago

Hi everyone,

I get the following error:

```
Training status:
     batch_interrupt_received: False
     epoch_interrupt_received: False
     epoch_started: True
     epochs_done: 0
     iterations_done: 48
     received_first_batch: True
     resumed_from: f899e453ab7e4d66acbe7558d9dcedaf
     training_started: True
Log records from the iteration 48:
     decoder_cost_cost: 406.997089365
```

ERROR:blocks.main_loop:Error occured during training.

Blocks will attempt to run on_error extensions, potentially saving data, before exiting and reraising the error. Note that the usual after_training extensions will not be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.

```
Traceback (most recent call last):
  File "main.py", line 45, in <module>
    get_dev_stream(**configuration), args.bokeh)
  File "../machine_translation/__init__.py", line 175, in main
    main_loop.run()
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 196, in run
    reraise_as(e)
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/utils/__init__.py", line 225, in reraise_as
    six.reraise(type(new_exc), new_exc, orig_exc_traceback)
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 182, in run
    while self.run_epoch():
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 231, in run_epoch
    while self._run_iteration():
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 244, in _run_iteration
    batch = next(self.epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 138, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 30, in __next__
    data = self.data_stream.get_data(next(self.request_iterator))
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 591, in get_data
    data, next(self.child_epoch_iterator)):
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 639, in get_data
    return self.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 633, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 214, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 30, in __next__
    data = self.data_stream.get_data(next(self.request_iterator))
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 591, in get_data
    data, next(self.child_epoch_iterator)):
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 215, in get_data
    image = self.mapping(data)
  File "../machine_translation/stream.py", line 83, in __call__
    for x in sentence_pair[0]],
IndexError: tuple index out of range
```

Original exception: IndexError: tuple index out of range

The training data was downloaded from WMT15 cs-en following the prepare_data.py description, and the version of Blocks is 0.0.1.

Could anyone help me with it?

rizar commented 9 years ago

Are you using the stable branch?

xishazgh commented 9 years ago

No, I think I am using the master branch.

sathishreddy commented 9 years ago

I also got this error when using the master branch ... it works with the stable branch. The problem is that when the end of the file is reached, the stable branch raises a StopIteration exception, and based on this exception the main loop resets its data iterators. This behavior does not happen with the latest Blocks code: it sends an empty sentence to stream.py, and the __oov function raises this exception. Thanks, Sathish.
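For reference, here is a minimal sketch of the OOV mapping in machine_translation/stream.py that the traceback points at (the function and parameter names below are approximations, not copied from the repo). It assumes every item coming out of the stream is a (source, target) pair of word-id sequences, so an empty batch at the end of the epoch makes the sentence_pair[0] lookup fail:

```python
# Hedged sketch of the OOV-to-UNK mapping applied by stream.py (names approximate).
def oov_to_unk(sentence_pair, src_vocab_size=30000, trg_vocab_size=30000, unk_id=1):
    # Replace any word id outside the vocabulary with the <UNK> id.
    return ([x if x < src_vocab_size else unk_id for x in sentence_pair[0]],
            [x if x < trg_vocab_size else unk_id for x in sentence_pair[1]])

oov_to_unk(([5, 42, 31337], [7, 99]))  # fine: 31337 is mapped to unk_id
oov_to_unk(())  # empty batch at end of epoch -> IndexError: tuple index out of range
```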


xishazgh commented 9 years ago

Thanks, I will now try the stable branch.

rizar commented 9 years ago

@orhanf, can you please help us here?

xishazgh commented 9 years ago

Hello everyone, following the suggestion above, it still does not work when I use the stable branch.

orhanf commented 9 years ago

It looks like a few more people are also getting the same error, but at different iterations. This is probably caused by the preprocessing script; I have a shuffled data file with which I also get this error now, so I'll come up with a fix shortly.

orhanf commented 9 years ago

@xishazgh is it possible for you to share your shuffled data files?

xishazgh commented 9 years ago

Sorry, the data was overwritten when I used the stable branch to re-train.


orhanf commented 9 years ago

@xishazgh can you elaborate on that please?

xishazgh commented 9 years ago

OK, I will reproduce that shuffled data tomorrow and send it to you.


sathishreddy commented 9 years ago

Hi all, this error can easily be reproduced using a small dataset. For example, take 1000 sentences from cs-en and set the batch size to 50; at the 20th iteration you will get this error. I tried this with two sequence-to-sequence datasets, and both produced the error at the completion of the first epoch. Use a small data size to reproduce the error quickly. Thanks, Sathish
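If it helps, here is a rough sketch of the kind of configuration change that reproduces this quickly (the key names follow the example's get_config_cs2en(); the small data files are hypothetical):

```python
# Hedged sketch: shrink the cs-en example so the end of the first epoch is reached quickly.
from configurations import get_config_cs2en  # the configurations module used by the example's main.py

config = get_config_cs2en()
config['batch_size'] = 50                      # 1000 sentences / 50 per batch = 20 iterations per epoch
config['src_data'] = 'data/small.cs.tok.shuf'  # hypothetical 1000-sentence subsets of the WMT15 cs-en data
config['trg_data'] = 'data/small.en.tok.shuf'
# With this setup the IndexError shows up around iteration 20, i.e. at the end of the first epoch.
```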


orhanf commented 9 years ago

@sathishreddy that's odd, since I am running a model with a small corpus and have already reached the 40th epoch without any error. That means we might have a version problem, or the chunk I am using for training is not erroneous (which it supposedly is). And I am on the master branch, by the way.

sathishreddy commented 9 years ago

@orhanf I tried the above experiment two weeks ago with the latest code from GitHub. Due to the above problem, I moved back to the stable version and re-ran the code. Maybe some version has this problem; I will check and report.

One more request: currently the word embeddings are initialized randomly. Is there any way to initialize the word embeddings with already-learned word representations (e.g. word2vec)? I am new to Blocks and not very confident in doing this. Thanks, Sathish.


xishazgh commented 9 years ago

@orhanf I re-ran the code on the newly shuffled data, and the error happened when the iterations reached the 25th. I am on the master branch. I am now sending the shuffled data to you.


fhirschmann commented 9 years ago

I'm on ec50bac/blocks and have not yet encountered this problem.

orhanf commented 9 years ago

Sorry for my late response.

@xishazgh I did not receive any files. Is it possible for you to upload the files somewhere and share them, maybe in your repo?

@sathishreddy it is quite easy to load a pretrained model (or only some of its parameters, as in your case). LoadNMT loads everything from the directory specified by config['saveto'] into the main_loop. When you set config['reload'] = True (it is True by default), the manager looks into that directory and tries to load params.npz one parameter at a time. What you need to do is create a params.npz file from your word2vec embeddings, with the field name -bidirectionalencoder-embeddings.W and the shape (src_vocab_size, enc_embed), which is (30000, 620) by default. Put this file into your config['saveto'] folder and just run the example. LoadNMT will automatically load your word2vec embeddings and inform you via the logger. Hope this helps.
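A minimal sketch of how one might build such a params.npz from a pretrained embedding matrix (the embedding file name and the 'search_model_cs2en' saveto directory are assumptions for illustration):

```python
import numpy as np

# Hypothetical pretrained word2vec matrix with one row per source word,
# shaped (src_vocab_size, enc_embed) = (30000, 620) by default.
embeddings = np.load('word2vec_src_embeddings.npy').astype('float32')
assert embeddings.shape == (30000, 620)

# Save it under the parameter name LoadNMT looks for and place the file
# in the directory given by config['saveto'].
params = {'-bidirectionalencoder-embeddings.W': embeddings}
np.savez('search_model_cs2en/params.npz', **params)
```

With config['reload'] = True, running the example should then pick the embeddings up from that file and log which parameters were loaded.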

xishazgh commented 9 years ago

@orhanf The following links are my shuffled data:

https://drive.google.com/a/mydit.ie/file/d/0B7VPdJPqwce2ZDE1RS11Q3ZkSGs/view?usp=sharing
https://drive.google.com/a/mydit.ie/file/d/0B7VPdJPqwce2UnFCNTkxSkZ3QXM/view?usp=sharing

sathishreddy commented 9 years ago

@orhanf Thank you :)


orhanf commented 9 years ago

Hi everyone, here is a (partial) solution for those who are getting this error. It seems like the pip installation of Blocks is causing the problem because of Fuel.

This solved the error above, but I have no idea about its actual source. @rizar, what could the problem be, and what should the permanent fix be?

Also, thanks to @merc85garcia for letting me remotely debug the error. @ertugruly, please check this issue.

bartvm commented 9 years ago

pip install --upgrade git+git://github.com/mila-udem/fuel.git didn't work? That should get you exactly the same result as cloning and then installing with python setup.py develop...

orhanf commented 9 years ago

@bartvm, I haven't tried that, but it seems like a more elegant way to fix it :)

abergeron commented 9 years ago

pip install --upgrade is not the same as python setup.py develop, since it tries to upgrade dependencies, which can lead to problems for some compiled modules (like numpy not finding the BLAS, ...).

To properly upgrade a Fuel install, the command should be pip install --upgrade --no-deps git+git://github.com/mila-udem/fuel.git, where the --no-deps part tells pip not to upgrade dependencies.

merc85garcia commented 9 years ago

Hello, I had this issue. I had to uninstall the version of Fuel I had and install this one: https://github.com/mila-udem/fuel. I also uninstalled and reinstalled Blocks, and I could make it work. I then trained a new model from scratch.

rizar commented 9 years ago

For everybody who has ever had this issue, please use the latest Fuel: it is now fixed there.