The problem is that ELMo embeddings depend on context, so if the context around a word changes (i.e. the sentence changes), its embedding changes as well. You are right, though, that you could cache the embeddings for a full sentence - we provide that functionality too, via hdf5 files: https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md#writing-contextual-representations-to-disk
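For example, after running `allennlp elmo sentences.txt elmo_layers.hdf5 --all` as described in that tutorial, you can read the cached activations back with plain `h5py`. This is just a minimal sketch; the file name and the `"0"`, `"1"`, ... keys are assumptions for illustration, so inspect your own file with `list(f.keys())` to see how it is laid out:

```python
import h5py

# Read ELMo layer activations that were cached to disk (see the tutorial link above).
# NOTE: the file name and the "0", "1", ... keys are assumptions for illustration.
with h5py.File("elmo_layers.hdf5", "r") as f:
    sentence_0 = f["0"][...]   # with --all: shape (3, num_tokens, 1024), one row per ELMo layer
    print(sentence_0.shape)
```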
Hi, @DeNeutoy
Previously I did not need to train with a new loss function, and the approach described in https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md#writing-contextual-representations-to-disk worked well for me.
However, I now have a new loss function to tune, and what I want to do is update only the scalar weights.
The same problem comes up: the default `Elmo` class can update the scalar weights, but it still has to run the LSTMs, which is very slow.
Is there any way to cache all the sentences (the training sentences are fixed, so they should be cacheable) and update only the scalar weights without going through the LSTMs?
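To make it concrete, here is a rough sketch of what I have in mind: compute the three layer activations once, cache them (e.g. in the HDF5 file from the tutorial), and then train only a `ScalarMix` module on top of the cached tensors. The file name and key layout below are just placeholders for illustration:

```python
import h5py
import torch
from allennlp.modules.scalar_mix import ScalarMix

# Train only the mixture parameters (s_j and gamma) on top of cached ELMo layers.
# "elmo_layers.hdf5" and the "0" key are placeholders; adapt to however the layers were cached.
scalar_mix = ScalarMix(mixture_size=3, trainable=True)
optimizer = torch.optim.Adam(scalar_mix.parameters(), lr=1e-3)

with h5py.File("elmo_layers.hdf5", "r") as f:
    layers = torch.from_numpy(f["0"][...])   # shape: (3, num_tokens, 1024)

# ScalarMix takes a list of per-layer tensors and returns the weighted combination.
mixed = scalar_mix(list(layers))             # shape: (num_tokens, 1024)
loss = mixed.sum()                           # stand-in for the real task loss
loss.backward()
optimizer.step()                             # updates only s_j and gamma; no LSTM pass
```

With something like this, each training step would touch only a handful of parameters, which is what I mean by not going through the LSTMs again.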
Hi there, I am following the tutorial to use the ELMo embeddings. The problem is that it seems we have to recompute the embeddings (run them through the LSTMs) for each sentence, which is very slow. If I understand correctly, all we need to update are the weights (gamma and s in equation 1 of the paper) applied to the embeddings of each layer. We don't need to go through the LSTMs every time, because we do not update them.
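For reference, equation (1) from the paper is:

```latex
% ELMo mixes the L+1 biLM layer activations h_{k,j}^{LM} for token k with
% softmax-normalized weights s_j and a task-specific scale gamma:
\mathrm{ELMo}_k^{task} = \gamma^{task} \sum_{j=0}^{L} s_j^{task}\, \mathbf{h}_{k,j}^{LM}
```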
In elmo.py I find that every time I call `embeddings = elmo(character_ids)`, it runs through the whole network again. I think this is unnecessary. Is there any caching that can be done?