Hi,
I would like to train a new model and get word representations from the ELMo model.
I followed the instructions in the "Training a biLM on a new corpus" section to get the weights, and then the example in usage_token.py, but I got an error (I'm new to TensorFlow, so I'm probably doing something wrong).
The code:

```python
import os
import tensorflow as tf

from bilm import TokenBatcher, BidirectionalLanguageModel, weight_layers, \
    dump_token_embeddings

# read_gzip_object is my own helper; it returns a list of tokenized sentences
sentences = read_gzip_object('sentences_l={}_r={}.gz'.format(30, 16))

options_file = os.path.join('options.json')
weight_file = os.path.join('weights', 'weights.hdf5')
vocab_file = 'vocab.txt'
token_embedding_file = 'elmo_token_embeddings.hdf5'

# Map tokens to ids using the vocabulary file
batcher = TokenBatcher(vocab_file)

# Placeholder for batches of token ids
sentences_token_ids = tf.placeholder('int32', shape=(None, None))

# Build the biLM graph from the trained weights, feeding precomputed
# token embeddings instead of character inputs
bilm = BidirectionalLanguageModel(
    options_file,
    weight_file,
    use_character_inputs=False,
    embedding_weight_file=token_embedding_file
)

# Ops that compute the LM embeddings and the weighted ELMo layers
sentences_embeddings_op = bilm(sentences_token_ids)
elmo_sentences_input = weight_layers('input', sentences_embeddings_op, l2_coef=0.0)
elmo_sentences_output = weight_layers('output', sentences_embeddings_op, l2_coef=0.0)

sentences_ids = batcher.batch_sentences(sentences)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
```

My questions are:
1. If I want the word representations, do I need the 'input' or the 'output' (or both) from weight_layers?
2. If I have a large number of sentences to get representations for, do I need to chunk them before feeding them to the TokenBatcher (in a loop)?
3. Is it possible to get a single representation of each word in the vocab?

Thanks!
elmo_sentences_input will be a weighted combination of all three pretrained layers, so it's sufficient to just use it.
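For example (a minimal sketch, not from your code; it assumes the graph and `sentences` from the question above), you only need to evaluate the 'input' op:

```python
# The 'output' call above just creates a second, independent set of scalar
# mixing weights; for plain feature extraction one set is enough.
sentences_ids = batcher.batch_sentences(sentences)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    elmo_vectors = sess.run(
        elmo_sentences_input['weighted_op'],
        feed_dict={sentences_token_ids: sentences_ids}
    )

# elmo_vectors has shape (n_sentences, max_sentence_length, lm_dim)
```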
Yes, you will need to separate your sentences into batches (in a loop) or the GPU will run out of memory.
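A minimal sketch of that loop, again assuming the graph from the question; `batch_size` is a made-up value to tune to your GPU memory:

```python
batch_size = 64  # hypothetical; adjust to fit your GPU memory

all_vectors = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        chunk_ids = batcher.batch_sentences(chunk)
        all_vectors.append(sess.run(
            elmo_sentences_input['weighted_op'],
            feed_dict={sentences_token_ids: chunk_ids}
        ))

# Each element has shape (len(chunk), max_len_in_chunk, lm_dim); the padded
# length differs per chunk, so keep the results in a list rather than stacking.
```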
It is possible to get a single representation for each word in the vocab by just using the context-insensitive CNN encoder, but that misses all of the important contextual information in ELMo.
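If you do want those static vectors, `dump_token_embeddings` (already imported in your code) writes one vector per word in the vocab file. A minimal sketch, assuming the same file names as in the question; as far as I recall the HDF5 dataset is stored under the key 'embedding':

```python
import os
import h5py
from bilm import dump_token_embeddings

# Run each vocab word through the pretrained CNN encoder and save the result
dump_token_embeddings(
    'vocab.txt',                              # one token per line, incl. <S>, </S>, <UNK>
    'options.json',
    os.path.join('weights', 'weights.hdf5'),
    'elmo_token_embeddings.hdf5'
)

with h5py.File('elmo_token_embeddings.hdf5', 'r') as fin:
    token_embeddings = fin['embedding'][...]  # shape: (vocab_size, embedding_dim)
```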