lisa-groundhog / GroundHog

Library for implementing RNNs with Theano
BSD 3-Clause "New" or "Revised" License

IOError when saving model on disk #24

Closed DmitryKey closed 9 years ago

DmitryKey commented 9 years ago

Hi,

I'm trying to train a translation model on mac 64 bit using experiments/nmt/train.py

The options:

python train.py --proto=prototype_encdec_state --state ru-data.py

where ru-data.py is:

dict(
  source=["ru-en/binarized.ru.shuf.h5"],
  target="ru-en/binarized_text.en.shuf.h5",
  word_indx="ru-en/vocab.ru.pkl",
  word_indx_trgt="ru-en/vocab.en.pkl",
  indx_word="ru-en/ivocab.ru.pkl",
  indx_word_target="ru-en/ivocab.en.pkl",
  reload=False
)

export THEANO_FLAGS=floatX=float32

The IO error happens upon model saving:

2014-12-09 21:37:31,407: __main__: DEBUG: Load data
2014-12-09 21:37:31,407: __main__: DEBUG: Compile trainer
2014-12-09 21:37:32,011: groundhog.trainer.SGD_adadelta: DEBUG: Constructing grad function
2014-12-09 21:37:32,253: groundhog.trainer.SGD_adadelta: DEBUG: Compiling grad function
2014-12-09 21:38:31,497: groundhog.trainer.SGD_adadelta: DEBUG: took 59.2441170216
2014-12-09 21:38:35,031: __main__: DEBUG: Run training
Validation computed every 500
Saving the model...
Model saved, took 4.50665020943
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "~/projects/machine_translation/rnn/GroundHog/groundhog/datasets/TM_dataset.py", line 175, in run
    target_table = tables.open_file(diter.target_file, 'r', driver=driver)
  File "/Library/Python/2.7/site-packages/tables/file.py", line 318, in open_file
    return File(filename, mode, title, root_uep, filters, **kwargs)
  File "/Library/Python/2.7/site-packages/tables/file.py", line 791, in __init__
    self._g_new(filename, mode, **params)
  File "tables/hdf5extension.pyx", line 359, in tables.hdf5extension.File._g_new (tables/hdf5extension.c:3875)
  File "/Library/Python/2.7/site-packages/tables/utils.py", line 157, in check_file_access
    raise IOError("``%s`` does not exist" % (filename,))
IOError: ``r`` does not exist

Am I missing something?

nouiz commented 9 years ago

I do not use GroundHog, so this is just a wild guess: does the directory exist? If not, create it manually. Maybe it does not get created automatically.

DmitryKey commented 9 years ago

@nouiz yes, those exist: they are input files to the training. I have tried absolute paths in all of these, to no avail.

The strange part is how the values in the dict are parsed: only the first character of the file path is ever taken, and of course no file with that name exists. I'm puzzled as to whether this is a Mac quirk or a library issue.
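One possible explanation, not confirmed anywhere in this thread: in the state above, `source` is a list while `target` is a plain string. If the loader takes the first element of each entry (an assumption about GroundHog's internals), indexing a string returns its first character, which would match the ``IOError: ``r`` does not exist`` message. A minimal sketch of the effect, where `first_path` is a hypothetical helper and not part of GroundHog:

```python
def first_path(entry):
    """Hypothetical helper: return the first path from a state entry.

    Accepts either a single path string or a list of paths, so that a
    bare string is never indexed character by character.
    """
    if isinstance(entry, str):
        return entry
    return entry[0]


state = {
    "source": ["ru-en/binarized.ru.shuf.h5"],      # a list, as in the report
    "target": "ru-en/binarized_text.en.shuf.h5",   # a plain string, as in the report
}

# Naive indexing works for the list but yields one character for the string.
naive_source = state["source"][0]
naive_target = state["target"][0]

# The normalised lookup returns the whole path in both cases.
safe_source = first_path(state["source"])
safe_target = first_path(state["target"])

print(naive_source)
print(naive_target)
print(safe_target)
```

If this is indeed the cause, writing `target` as a one-element list in the state file, in the same way as `source`, would be worth trying.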

DmitryKey commented 9 years ago

@nouiz on a general note, at this point I'm merely interested in training any RNN-based translation model. What other choices apart from GroundHog could be used to train one?

rizar commented 9 years ago

There is an ongoing effort called Blocks. It is a new framework that should replace GroundHog, and a machine translation implementation using it can be expected pretty soon.

DmitryKey commented 9 years ago

@rizar thanks! That is good to know.

DmitryKey commented 9 years ago

Some update:

It was clear that the input file names were not being parsed correctly (possibly because of Mac OS), so I decided to rename them to single letters. That seemed to help: the model is now saved correctly, but I'm getting an error during cost calculation. I would appreciate it if you could point me in the right direction.


Saving the model...
Model saved, took 5.984333992
2014-12-14 22:29:54,731: groundhog.datasets.TM_dataset: DEBUG: 117724 entries
2014-12-14 22:29:54,731: groundhog.datasets.TM_dataset: DEBUG: Starting from the entry 0
.. iter    0 cost 2226.725 grad_norm 1.50e+02 log2_p_word 1.49e+01 log2_p_expl 4.02e+01 step time  2.182 sec whole time 26.385 sec lr 1.00e+00
Input: UNK UNK <eol>
Target: UNK UNK <eol>
Input:   UNK UNK <eol> 
Output: 
Traceback (most recent call last):
  File "train.py", line 102, in <module>
    main()
  File "train.py", line 99, in main
    main.main()
  File "~/projects/machine_translation/rnn/GroundHog/groundhog/mainLoop.py", line 338, in main
    [fn() for fn in self.hooks]
  File "train.py", line 50, in __call__
    self.model.get_samples(self.state['seqlen'] + 1, self.state['n_samples'], x[:len(x_words)])
  File "~/projects/machine_translation/rnn/GroundHog/groundhog/models/LM_model.py", line 242, in get_samples
    self._get_samples(self, length, temp, *inps)
  File "~/projects/machine_translation/rnn/GroundHog/groundhog/layers/cost_layers.py", line 955, in _get_samples
    print model.word_indxs[values[k]],
KeyError: 21615
Closing remaining open files:c...doneb...done

rizar commented 9 years ago

This is a vocabulary issue: the index of a sampled word is not found in the dictionary. You should check that your settings for the number of words are right (see state) and that you are not confusing the two dictionaries, the forward one and the inverse one (also see state).
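A rough way to check the second point is to compare the forward and inverse dictionaries directly. The sketch below uses small in-memory dictionaries; loading the `.pkl` files (for example with `pickle.load`) is left out, and `check_vocab_pair` is a hypothetical helper, not a GroundHog function:

```python
def check_vocab_pair(word_to_index, index_to_word):
    """Return a list of problems found between a forward and an inverse vocabulary."""
    problems = []
    # The forward dictionary is expected to map words (strings) to indices.
    if not all(isinstance(key, str) for key in word_to_index):
        problems.append("forward dictionary has non-string keys (swapped files?)")
    # The inverse dictionary is expected to map indices (integers) to words.
    if not all(isinstance(key, int) for key in index_to_word):
        problems.append("inverse dictionary has non-integer keys (swapped files?)")
    # Every index used by the forward dictionary should be resolvable.
    missing = sorted({i for i in word_to_index.values() if i not in index_to_word})
    if missing:
        problems.append("indices missing from inverse dictionary: %s" % missing)
    return problems


# Toy data: '<s>' and '</s>' share index 0, as in the vocabularies in this thread.
forward = {"<s>": 0, "</s>": 0, "UNK": 1, "to": 2, "the": 3}
inverse_ok = {0: "<s>", 1: "UNK", 2: "to", 3: "the"}
inverse_short = {0: "<s>", 1: "UNK", 2: "to"}

print(check_vocab_pair(forward, inverse_ok))
print(check_vocab_pair(forward, inverse_short))
print(check_vocab_pair(inverse_ok, forward))  # arguments swapped on purpose
```

The check only requires that every forward index resolves to some word; it does not require an exact round trip, because several words may share one index.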

DmitryKey commented 9 years ago

@rizar thanks for getting back on this.

Here is my state:

dict(
    source="c",
    target="b",
    word_indx="vocab.ru.pkl",
    word_indx_trgt="vocab.en.pkl",
    indx_word="ivocab.ru.pkl",
    indx_word_target="ivocab.en.pkl",

    #validFreq=0,
    reload=False
)

I'm running the training with:

python train.py --proto=prototype_encdec_state "prefix='encdec-50_',seqlen=50,sort_k_batches=20" --state state.py

head of vocab.ru.pkl:

{'!': 14,
 ',': 12,
 '-': 42,
 '.': 6,
 '...': 46,
 '</s>': 0,
 '<s>': 0,
 '?': 19,
 'UNK': 1,

head of ivocab.ru.pkl:

{0: '<s>',
 1: 'UNK',
 2: '\xd0\xb2',
 3: '\xd0\xbd\xd0\xb0',
 4: '\xd0\xbd\xd0\xb5',
 5: '\xd0\xbe\xd0\xbd',
 6: '.',
 7: '\xd1\x81',
 8: '\xd1\x8f',
 9: '\xd0\xb8',

head of vocab.en.pkl:

{'!': 29,
 '&apos;': 6,
 ',': 22,
 '.': 11,
 '...': 70,
 '</s>': 0,
 '<s>': 0,
 '?': 37,
 'I': 14,
 'UNK': 1,

head of ivocab.en.pkl:

{0: '<s>',
 1: 'UNK',
 2: 'to',
 3: 'the',
 4: 'a',
 5: 'of',
 6: '&apos;',
 7: 'in',
 8: 's',
 9: 'he',

rizar commented 9 years ago

I guess you need to pay attention to the following settings from state.py:

state['null_sym_source'] = 30000
state['null_sym_target'] = 30000
state['n_sym_source'] = state['null_sym_source'] + 1
state['n_sym_target'] = state['null_sym_target'] + 1

Those should agree with the number of words in your source and target vocabularies.
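As a rough illustration of why a mismatch leads to a `KeyError`: if the model can emit any index below `n_sym_target` (an assumption about how sampling works), every such index needs an entry in the inverse vocabulary. The helper `unresolvable_indices` and its `handled_elsewhere` parameter are hypothetical; whether the null symbol is resolved separately is also an assumption.

```python
def unresolvable_indices(index_to_word, n_sym, handled_elsewhere=()):
    """List indices in 0 .. n_sym - 1 that the inverse vocabulary cannot resolve.

    `handled_elsewhere` holds indices assumed to be treated separately,
    such as the null (end-of-sequence) symbol.
    """
    skip = set(handled_elsewhere)
    return [i for i in range(n_sym) if i not in index_to_word and i not in skip]


# Toy inverse vocabulary with four entries (indices 0 to 3).
index_to_word = {0: "<s>", 1: "UNK", 2: "to", 3: "the"}

# Default-style settings from the prototype state: far larger than the toy vocabulary.
state = {"null_sym_target": 30000}
state["n_sym_target"] = state["null_sym_target"] + 1
missing_default = unresolvable_indices(
    index_to_word, state["n_sym_target"], handled_elsewhere=(state["null_sym_target"],)
)

# Settings adjusted to the toy vocabulary size.
state["null_sym_target"] = len(index_to_word)
state["n_sym_target"] = state["null_sym_target"] + 1
missing_fixed = unresolvable_indices(
    index_to_word, state["n_sym_target"], handled_elsewhere=(state["null_sym_target"],)
)

print(len(missing_default), "unresolvable indices with the default settings")
print(len(missing_fixed), "unresolvable indices with the adjusted settings")
```

With the oversized settings, any sampled index from 4 upwards has no word to map to, which is the kind of lookup failure reported above.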

P.S. I would be very eager to play with a translator to Russian!

DmitryKey commented 9 years ago

@rizar thanks a lot for helping, it now runs smoothly on a 100 entry test dictionary. Time to try it on larger dictionaries. Exciting!

Lotemp commented 7 years ago

Hello @DmitryKey , I'm struggling with this issue. I keep getting this error:

if i2w[seq[k]] == '': KeyError: 2708

with various keys.

Is the size of the vocabulary determined by the vocab.pkl files?

Thanks! Lotem

DmitryKey commented 7 years ago

hi @Lotemp I would suggest opening a new issue, as this one has been closed. That way your question might be better noticed.

On the size, yes, check the sizes of the pickle files. I will double check soon.

DmitryKey commented 7 years ago

@Lotemp when using https://github.com/lisa-groundhog/GroundHog/blob/master/experiments/nmt/preprocess/preprocess.py you can pass the -v option to control the dictionary size.

Lotemp commented 7 years ago

Thanks Dmitry! I know that GroundHog is no longer maintained, and I'm looking into switching to Blocks instead, but I don't want to throw away all the progress I made with GroundHog. So one more question: is it possible to perform validation with GroundHog? It seems that the iterator for the validation data is not used at all, and that regularization is not used or implemented. Thanks in advance for your help, Lotem

DmitryKey commented 7 years ago

@Lotemp Blocks sounds like a good plan. For the validation -- I didn't dig into this. The only perf metric I pay attention to is the one outputted by the framework during training:

iter 7771 cost 5371.263 grad_norm 1.82e+03 log2_p_word 8.66e+00 log2_p_expl 1.21e+02 step time  3.562 min whole time 15.667 min lr 1.00e+00
iter 7772 cost 6108.257 grad_norm 1.70e+03 log2_p_word 8.75e+00 log2_p_expl 1.38e+02 step time  4.077 min whole time 19.744 min lr 1.00e+00
iter 7773 cost 6365.488 grad_norm 2.10e+03 log2_p_word 8.59e+00 log2_p_expl 1.43e+02 step time  4.266 min whole time 24.011 min lr 1.00e+00

DmitryKey commented 7 years ago

@Lotemp you might want to remove some of the messages above (produced by LinkedIn).
