tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

t2t_usr_dir doesn't work #492

Closed MorinoseiMorizo closed 6 years ago

MorinoseiMorizo commented 6 years ago

Dear all,

I want to add my own hyper-parameter settings by using the --t2t_usr_dir option, as described in https://github.com/tensorflow/tensor2tensor/blob/master/docs/walkthrough.md#adding-your-own-components

To do this, I created a directory at ~/usr/t2t_usr and added the following code:

# In ~/usr/t2t_usr/__init__.py
from . import my_registrations

# In ~/usr/t2t_usr/my_registrations.py
from tensor2tensor.models import transformer
from tensor2tensor.utils import registry

@registry.register_hparams
def transformer_my_very_own_hparams_set():
  hparams = transformer.transformer_base()
  hparams.hidden_size = 1024

Then, I ran the following command to register it: t2t-trainer --t2t_usr_dir=~/usr/t2t_usr --registry-help, but I got the following error:

INFO:tensorflow:Importing user module t2t_usr from path /path/to/usr
Traceback (most recent call last):
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 191, in <module>
    tf.app.run()
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, i
n run
    _sys.exit(main(argv))
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 182, in main
    hparams = create_hparams()
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 81, in create_hparams
    return tpu_trainer_lib.create_hparams(FLAGS.hparams_set, FLAGS.hparams)
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor/tpu/tpu_trainer_lib.py", line 75,
 in create_hparams
    hparams = registry.hparams(hparams_set)()
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor/utils/registry.py", line 171, in
hparams
    display_list_by_prefix(list_hparams(), starting_spaces=4)))
LookupError: HParams set  never registered. Sets registered:
    aligned:
      * aligned_8k
      * aligned_8k_grouped
      * aligned_base
      * aligned_grouped
      * aligned_local
      * aligned_local_1k
      * aligned_local_expert
      * aligned_lsh
      * aligned_memory_efficient
...

Does anyone have any suggestions on how to fix this?

Thank you in advance.

nimaous commented 6 years ago

I also have the same problem.

fstahlberg commented 6 years ago

Not sure if this is the problem here, but afaik you need to return hparams in transformer_my_very_own_hparams_set()
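
For example (a sketch based on the snippet in the original post, with only the missing return added):

# In ~/usr/t2t_usr/my_registrations.py
from tensor2tensor.models import transformer
from tensor2tensor.utils import registry

@registry.register_hparams
def transformer_my_very_own_hparams_set():
  # Start from the base transformer hparams and override what you need.
  hparams = transformer.transformer_base()
  hparams.hidden_size = 1024
  # Without this return, the registered function produces None and
  # create_hparams will fail later.
  return hparams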

rsepassi commented 6 years ago

Could you verify you're using v1.4.1? Please provide the exact command-line and output.

MorinoseiMorizo commented 6 years ago

Thank you for replying and sorry for the late reply.

Following @fstahlberg's suggestion, I checked my script and found that I had forgotten to add return hparams. After adding it, I ran t2t-trainer again, but it did not solve the problem.

I also just found that the same error happens if I run it without the --t2t_usr_dir option. I re-installed tensor2tensor, but it still doesn't work.

@nimaous I'm using 1.4.1, and I used the following commands to re-install tensor2tensor.

$ pip uninstall tensor2tensor
$ pip install tensor2tensor --no-cache-dir

Then, when I ran $ t2t-trainer, I got the following error.

Traceback (most recent call last):
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 191, in <module>
    tf.app.run()
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 182, in main
    hparams = create_hparams()
  File "/path/to/.pyenv/versions/tensorflow/bin/t2t-trainer", line 81, in create_hparams
    return tpu_trainer_lib.create_hparams(FLAGS.hparams_set, FLAGS.hparams)
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor/tpu/tpu_trainer_lib.py", line 75, in create_hparams
    hparams = registry.hparams(hparams_set)()
  File "/path/to/.pyenv/versions/anaconda3-5.0.1/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor/utils/registry.py", line 171, in hparams
    display_list_by_prefix(list_hparams(), starting_spaces=4)))
LookupError: HParams set  never registered. Sets registered:
    aligned:
      * aligned_8k
      * aligned_8k_grouped
      * aligned_base
      * aligned_grouped
      * aligned_local
      * aligned_local_1k
      * aligned_local_expert
      * aligned_lsh
      * aligned_memory_efficient
      * aligned_moe
      * aligned_no_att
      * aligned_no_timing
      * aligned_pos_emb
      * aligned_pseudolocal
      * aligned_pseudolocal_256
    attention:
      * attention_lm_11k
      * attention_lm_12k
      * attention_lm_16k
      * attention_lm_ae_extended
      * attention_lm_attention_moe_tiny
      * attention_lm_base
      * attention_lm_hybrid_v2
      * attention_lm_moe_24b_diet
      * attention_lm_moe_32b_diet
      * attention_lm_moe_base
      * attention_lm_moe_base_ae
      * attention_lm_moe_base_hybrid
      * attention_lm_moe_base_local
      * attention_lm_moe_base_long_seq
      * attention_lm_moe_base_memeff
      * attention_lm_moe_large
      * attention_lm_moe_large_diet
      * attention_lm_moe_memory_efficient
      * attention_lm_moe_small
      * attention_lm_moe_tiny
      * attention_lm_moe_translation
      * attention_lm_moe_unscramble_base
      * attention_lm_no_moe_small
      * attention_lm_small
      * attention_lm_translation
      * attention_lm_translation_full_attention
      * attention_lm_translation_l12
    basic:
      * basic_1
    bluenet:
      * bluenet_base
      * bluenet_tiny
    bytenet:
      * bytenet_base
    cycle:
      * cycle_gan_small
    gene:
      * gene_expression_conv_base
    lstm:
      * lstm_attention
      * lstm_bahdanau_attention
      * lstm_bahdanau_attention_multi
      * lstm_luong_attention
      * lstm_luong_attention_multi
      * lstm_seq2seq
    multimodel:
      * multimodel_base
      * multimodel_tiny
    neural:
      * neural_gpu
    resnet:
      * resnet_base
    revnet:
      * revnet_base
    shakeshake:
      * shakeshake_cifar10
    slicenet:
      * slicenet_1
      * slicenet_1noam
      * slicenet_1tiny
    super:
      * super_lm_b8k
      * super_lm_base
      * super_lm_big
      * super_lm_conv
      * super_lm_high_mix
      * super_lm_low_mix
    transformer:
      * transformer_ae_base
      * transformer_ae_cifar
      * transformer_ae_small
      * transformer_base
      * transformer_base_single_gpu
      * transformer_base_sketch
      * transformer_base_v1
      * transformer_base_v2
      * transformer_big
      * transformer_big_dr1
      * transformer_big_dr2
      * transformer_big_enfr
      * transformer_big_single_gpu
      * transformer_clean
      * transformer_clean_big
      * transformer_dr0
      * transformer_dr2
      * transformer_ff1024
      * transformer_ff4096
      * transformer_h1
      * transformer_h16
      * transformer_h32
      * transformer_h4
      * transformer_hs1024
      * transformer_hs256
      * transformer_k128
      * transformer_k256
      * transformer_l10
      * transformer_l2
      * transformer_l4
      * transformer_l8
      * transformer_ls0
      * transformer_ls2
      * transformer_moe_12k
      * transformer_moe_8k
      * transformer_moe_base
      * transformer_moe_prepend_8k
      * transformer_n_da
      * transformer_n_da_l10
      * transformer_opt
      * transformer_parameter_attention_a
      * transformer_parameter_attention_b
      * transformer_parsing_base
      * transformer_parsing_big
      * transformer_parsing_ice
      * transformer_prepend
      * transformer_prepend_v1
      * transformer_prepend_v2
      * transformer_relative
      * transformer_relative_big
      * transformer_relative_tiny
      * transformer_revnet_base
      * transformer_revnet_big
      * transformer_sketch
      * transformer_sketch_2layer
      * transformer_sketch_4layer
      * transformer_sketch_6layer
      * transformer_small
      * transformer_small_sketch
      * transformer_small_tpu
      * transformer_tiny
      * transformer_tiny_tpu
      * transformer_tpu
      * transformer_tpu_base_language_model
      * transformer_tpu_with_conv
    vanilla:
      * vanilla_gan
    xception:
      * xception_base
      * xception_tiny
      * xception_tiny_tpu

If you need any additional information, I'm happy to share it.

rsepassi commented 6 years ago

--t2t_usr_dir is now under test with Travis and so it's known to work. Please try to match your setup to the provided example user directory.
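
For reference, the layout used in this issue (matching the walkthrough docs linked in the original post) would be:

~/usr/t2t_usr/
    __init__.py           # from . import my_registrations
    my_registrations.py   # the @registry.register_hparams definitions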

MorinoseiMorizo commented 6 years ago

Thank you for the reply. I checked my scripts and commands carefully, and I finally found that this was entirely my fault.

I mistyped the option "--registry_help" as "--registry-help". The difference is just "_" versus "-".

I changed the command to $ t2t-trainer --t2t_usr_dir=~/usr/t2t_usr --registry_help and it works perfectly.
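
For anyone who hits the same LookupError, you can verify the registration and then select the set by name (a sketch; --t2t_usr_dir, --registry_help, and --hparams_set are the flags shown elsewhere in this thread, other t2t-trainer flags may differ by version):

$ # The custom set should now appear under "transformer:" in the listing.
$ t2t-trainer --t2t_usr_dir=~/usr/t2t_usr --registry_help
$ # Then pass it by name when training.
$ t2t-trainer --t2t_usr_dir=~/usr/t2t_usr --hparams_set=transformer_my_very_own_hparams_set ...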

Thank you for all the responses and I'm very sorry for taking your time.