john-hewitt / structural-probes

Codebase for testing whether hidden states of neural networks encode discrete structures.

Enable easy swapping of PyTorch models #2

Open john-hewitt opened 5 years ago

john-hewitt commented 5 years ago

Right now, to test a new representation learner, one must:

  1. Use the representation learner to write hidden-state vectors for each token (or subword) to disk. (A better idea for subword models: decide how to combine subword representations, then write the resultant token embeddings to disk; see the sketch after this list.)
  2. Run the structural probe code on the hidden states as saved to disk.
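
For concreteness, here's a minimal sketch of step 1, assuming the `transformers` and `h5py` libraries and averaging as the subword-combination strategy; the model name, layer index, and HDF5 layout are illustrative, not the repo's exact format:

```python
# Sketch: precompute per-token hidden states and cache them to disk as HDF5.
# All names here are illustrative; averaging subwords is one choice among many.
import h5py
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-large-cased')
model = BertModel.from_pretrained('bert-large-cased', output_hidden_states=True)
model.eval()

LAYER = 16  # which layer to cache; a hyperparameter, not a fixed choice
corpus = [['The', 'probe', 'reads', 'these', 'vectors', 'later', '.']]  # toy data

def subwords_to_tokens(hidden, alignment):
    """Average subword vectors that belong to the same token.
    `alignment[i]` is the token index of subword position i."""
    n_tokens = max(alignment) + 1
    out = torch.zeros(n_tokens, hidden.size(-1))
    counts = torch.zeros(n_tokens, 1)
    for subword_idx, token_idx in enumerate(alignment):
        out[token_idx] += hidden[subword_idx]
        counts[token_idx] += 1
    return out / counts

with h5py.File('bert_large_layer16.hdf5', 'w') as fout:
    for sent_idx, tokens in enumerate(corpus):
        subwords, alignment = [], []
        for tok_idx, token in enumerate(tokens):
            pieces = tokenizer.tokenize(token)  # WordPiece subwords
            subwords.extend(pieces)
            alignment.extend([tok_idx] * len(pieces))
        ids = torch.tensor([tokenizer.convert_tokens_to_ids(subwords)])
        with torch.no_grad():  # [CLS]/[SEP] omitted for brevity
            hidden = model(ids).hidden_states[LAYER][0]  # (n_subwords, dim)
        fout.create_dataset(str(sent_idx),
                            data=subwords_to_tokens(hidden, alignment).numpy())
```

The probe code in step 2 can then read one dataset per sentence index without ever touching the encoder.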

This is "nice" in that the hidden states don't need to be computed at each pass (BERT is big/slow; I actually run most experiments on CPUs because the probe training is so fast and CPUs are so plentiful)

However, it's "not nice" that one can't swap representation model parameters on the fly, and especially that big huge vectors take up a lot of disk space (115GB for BERT-large on PTB WSJ train -- 40k sents.)
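
That figure is roughly what you'd expect if every layer is cached in float32; a back-of-envelope check (the subword and layer counts below are my assumptions, not the repo's exact cache layout):

```python
# Rough check on the 115GB figure. Assumptions: ~1M subwords after WordPiece
# expansion of PTB WSJ train (~950k tokens), all 25 layers cached
# (24 transformer layers + embeddings), float32.
subwords, layers, dims = 1_000_000, 25, 1024
print(f"{subwords * layers * dims * 4 / 1e9:.0f} GB")  # -> 102 GB, the right order
```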

We'd like to enable easy swapping of new models by defining a new class in model.py. We'll need to read in the model's tokenizer (and perhaps its subword tokenizer) so we can pass the model words as identified by its own vocabulary, and map from subword representations back to token representations. There's also the inefficiency of aligning subword representations to token representations at every batch. A sketch of such a class is below.
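
Here's what such a class might look like, assuming a Hugging Face fast tokenizer (for its `word_ids()` alignment) and mean-pooling of subwords; the class name and forward signature are hypothetical, not the repo's existing interface:

```python
# Hypothetical on-the-fly representation model for model.py. The interface
# (raw token lists in, per-token representations out) is an assumption.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class HuggingfaceEncoder(nn.Module):
    """Wraps a pretrained encoder and maps subword reprs back to token reprs."""

    def __init__(self, model_name, layer_index):
        super().__init__()
        # A fast tokenizer is required for word_ids() subword/token alignment.
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name,
                                                 output_hidden_states=True)
        self.layer_index = layer_index

    def forward(self, batch_of_token_lists):
        reprs = []
        for tokens in batch_of_token_lists:
            enc = self.tokenizer(tokens, is_split_into_words=True,
                                 return_tensors='pt')
            with torch.no_grad():  # encoder stays frozen; only the probe trains
                hidden = self.encoder(**enc).hidden_states[self.layer_index][0]
            # word_ids(): subword position -> token index (None for [CLS]/[SEP])
            word_ids = enc.word_ids()
            token_reprs = [
                hidden[torch.tensor([w == i for w in word_ids])].mean(dim=0)
                for i in range(len(tokens))
            ]
            reprs.append(torch.stack(token_reprs))
        return reprs  # one (n_tokens, dim) tensor per sentence
```

This version redoes the subword-to-token alignment on every forward pass, which is exactly the inefficiency noted above; caching each sentence's `word_ids` across epochs would amortize most of that cost.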