Closed xishazgh closed 9 years ago
Are you using the stable branch?
No, I think I'm using the master branch.
I also got this error when using the master branch; it works with the stable branch. The problem is that when it reaches the end of the file, the stable branch raises a StopIteration exception, and based on this exception the main loop resets its data iterators. This does not happen with the latest blocks code: it sends an empty sentence to stream.py, and the __oov function raises this exception. Thanks, Sathish.
Thanks, I will now try the stable branch.
@orhanf , can you please help us here?
Hello everyone, following the suggestion above, it still does not work when I use the stable branch.
It looks like a few more people are getting the same error, but at different iterations. This is probably caused by the preprocessing script; I have a shuffled data file with which I also get this error now, so I'll come up with a fix shortly.
@xishazgh is it possible for you to share your shuffled data files?
Sorry, the data was overwritten when I used the stable branch to re-train.
@xishazgh can you elaborate on that please?
OK, I will reproduce that shuffled data tomorrow and send it to you.
Hi all, this error is easily reproducible with a small data set. For example, take 1000 sentences from cs-en and set the batch size to 50; at the 20th iteration you will get this error. I tried this with 2 sequence-to-sequence datasets, and both produced the error at the completion of the first epoch. Use a small data size to reproduce the error quickly. Thanks, Sathish
@sathishreddy that's weird, since I am running a model with a small corpus and have already reached the 40th epoch without any error. So we might have a version problem, or the chunk I am using for training is not erroneous (as it supposedly is). And I am on the master branch, btw.
@Orhan I tried the above experiment 2 weeks ago with the latest code from GitHub. Due to this problem, I moved back to the stable version and re-ran the code. Some version may have this problem; I will check and report.
One more request: currently the word embeddings are initialized randomly. Is there a way to initialize them with already-learned word representations (e.g. word2vec)? I am new to blocks and not very confident doing this. Thanks, Sathish.
@orhanf I re-ran the code on newly shuffled data, and the error happened when the iterations reached the 25th. I am on the master branch. I will now send the shuffled data to you.
I'm on ec50bac/blocks and have not yet encountered this problem.
Sorry for my late response,
@xishazgh I did not receive any files; is it possible for you to upload the files somewhere and share them, maybe in your repo?
@sathishreddy it is quite easy to load a pretrained model (or some of the parameters, as in your case). `LoadNMT` loads everything from the directory specified by `config['saveto']` into the `main_loop`. When you set `config['reload'] = True` (it is `True` by default), the manager looks into that directory and tries to load `params.npz` one parameter at a time. What you need to do is create a `params.npz` file from your word2vec embeddings, with the field name `-bidirectionalencoder-embeddings.W` and shape `(src_vocab_size, enc_embed)`, which is `(30000, 620)` by default. Put this file in your `config['saveto']` folder and just run the example. `LoadNMT` will automatically load your word2vec embeddings and inform you via the logger. Hope this helps,
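If it helps, here is a sketch of building such a `params.npz` file. The random matrix below is only a stand-in for your actual trained word2vec vectors; the field name and default shape are the ones given above.

```python
import numpy as np

src_vocab_size, enc_embed = 30000, 620  # defaults mentioned above

# Stand-in for your trained word2vec embeddings; replace with the real matrix.
embeddings = np.random.randn(src_vocab_size, enc_embed).astype('float32')

# The parameter name contains '-' and '.', so it cannot be a plain keyword
# argument; pass it to np.savez via dict unpacking instead.
np.savez('params.npz', **{'-bidirectionalencoder-embeddings.W': embeddings})

# Sanity check: the parameter is stored under exactly this name and shape.
loaded = np.load('params.npz')
print(loaded['-bidirectionalencoder-embeddings.W'].shape)  # (30000, 620)
```

Drop the resulting `params.npz` into your `config['saveto']` directory before starting training.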
@orhanf The following links are my shuffled data:
https://drive.google.com/a/mydit.ie/file/d/0B7VPdJPqwce2ZDE1RS11Q3ZkSGs/view?usp=sharing https://drive.google.com/a/mydit.ie/file/d/0B7VPdJPqwce2UnFCNTkxSkZ3QXM/view?usp=sharing
@orhanf Thank you :)
Hi everyone, here is the (partial) solution for those who are getting this error.
It seems like a `pip` installation of `blocks` is causing the problem because of `fuel`. The fix: run `python setup.py develop` in the `fuel` directory. This solved the error above, but I have no idea about its actual source. @rizar what can the problem be? And what should the permanent fix be?
also, thanks to @merc85garcia for letting me remotely debug the error. @ertugruly, please check this issue
`pip install --upgrade git+git://github.com/mila-udem/fuel.git` didn't work? That should get you exactly the same result as cloning and then installing with `python setup.py develop`...
@bartvm, i haven't tried that but seems like a more elegant way to fix it :)
`pip install --upgrade` is not the same as `python setup.py develop`, since it tries to upgrade dependencies, which can lead to problems for some compiled modules (like numpy not finding the BLAS, ...). To properly upgrade a fuel install, the command should be `pip install --upgrade --no-deps git+git://github.com/mila-udem/fuel.git`, where the `--no-deps` part tells pip not to upgrade dependencies.
Hello, I had this issue. I had to uninstall my version of fuel and install this one: https://github.com/mila-udem/fuel. I also uninstalled and reinstalled blocks, and then I could make it work and trained a new model from scratch.
For everybody who has ever had this issue, please use the latest Fuel: it is now fixed there.
Hi everyone,
I get the following error:
Training status:
     batch_interrupt_received: False
     epoch_interrupt_received: False
     epoch_started: True
     epochs_done: 0
     iterations_done: 48
     received_first_batch: True
     resumed_from: f899e453ab7e4d66acbe7558d9dcedaf
     training_started: True
Log records from the iteration 48:
     decoder_cost_cost: 406.997089365
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run on_error extensions, potentially saving data, before exiting and reraising the error. Note that the usual after_training extensions will not be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
  File "main.py", line 45, in <module>
    get_dev_stream(**configuration), args.bokeh)
  File "../machine_translation/__init__.py", line 175, in main
    main_loop.run()
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 196, in run
    reraise_as(e)
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/utils/__init__.py", line 225, in reraise_as
    six.reraise(type(new_exc), new_exc, orig_exc_traceback)
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 182, in run
    while self.run_epoch():
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 231, in run_epoch
    while self._run_iteration():
  File "/usr/local/lib/python2.7/dist-packages/blocks-0.0.1-py2.7.egg/blocks/main_loop.py", line 244, in _run_iteration
    batch = next(self.epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 138, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 30, in __next__
    data = self.data_stream.get_data(next(self.request_iterator))
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 591, in get_data
    data, next(self.child_epoch_iterator)):
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 639, in get_data
    return self.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 633, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 214, in get_data
    data = next(self.child_epoch_iterator)
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 30, in __next__
    data = self.data_stream.get_data(next(self.request_iterator))
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 591, in get_data
    data, next(self.child_epoch_iterator)):
  File "/usr/local/lib/python2.7/dist-packages/six.py", line 535, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/iterator.py", line 32, in __next__
    data = self.data_stream.get_data()
  File "/usr/local/lib/python2.7/dist-packages/fuel-0.0.1-py2.7-linux-x86_64.egg/fuel/transformers/__init__.py", line 215, in get_data
    image = self.mapping(data)
  File "../machine_translation/stream.py", line 83, in __call__
    for x in sentence_pair[0]],
IndexError: tuple index out of range
Original exception: IndexError: tuple index out of range
The training data was downloaded from WMT15 cs-en following the prepare_data.py description, and the version of blocks is 0.0.1.
Could anyone help me with it?