nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License

Expose intermediate repr, attention weight, and other meaningful tensors #910

Open jeswan opened 4 years ago

jeswan commented 4 years ago

Issue by HaokunLiu Sunday Sep 08, 2019 at 18:19 GMT Originally opened as https://github.com/nyu-mll/jiant/issues/910


We may want to save certain tensors from the model to a file for further study. This is needed for error analysis and many other analysis methods, but there are a few things we need to think through.

Saving the entire computation graph, the full training process, or, in extreme cases, the full dataset would consume an enormous amount of storage. We therefore need a flexible way to select what we are interested in along all three dimensions: which tensors, which training steps, and which examples.

Ideally, a user would be able to select these in the config file, without needing to modify the code.

I don't have a clear idea of how we should design this, especially the computation-graph part.
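One possible direction for the "which tensors" part is PyTorch forward hooks, selected by module name from the config. Below is a minimal sketch, not jiant's actual API: the helper name, the module-name list, and the file layout are all illustrative, and the training-step / example selection would still need separate filtering on the trainer side.

```python
import os
import torch


def register_activation_saving(model, module_names, output_dir):
    """Attach forward hooks that dump the outputs of selected submodules to disk.

    module_names: dotted names as reported by model.named_modules().
    Only those modules are captured, which keeps storage manageable
    compared to saving everything in the computation graph.
    """
    os.makedirs(output_dir, exist_ok=True)
    handles = []
    call_counts = {name: 0 for name in module_names}

    def make_hook(name):
        def hook(module, inputs, output):
            # Some modules return tuples; grab the first element in that case.
            tensor = output[0] if isinstance(output, (tuple, list)) else output
            path = os.path.join(
                output_dir, f"{name.replace('.', '_')}_{call_counts[name]:06d}.pt"
            )
            # Detach and move to CPU so we store values, not the autograd graph.
            torch.save(tensor.detach().cpu(), path)
            call_counts[name] += 1
        return hook

    for name, module in model.named_modules():
        if name in module_names:
            handles.append(module.register_forward_hook(make_hook(name)))
    # Caller can invoke h.remove() on each handle to stop capturing.
    return handles
```

If something like this were wired into jiant, the module-name list could come from a new config field and the helper could be called right after the model is built, so no model code would need to change.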

jeswan commented 4 years ago

Comment by sleepinyourhat Sunday Sep 08, 2019 at 21:02 GMT


I'm open to good ideas here, but I think the lazy default is simply to take advantage of what makes PyTorch so distinctively useful: it's easy to load a trained model from disk, insert print/save code into the model, and then run it.
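For reference, a rough sketch of that workflow using a forward hook instead of editing `forward()` directly. The `build_model()` call, the checkpoint path, and the `sent_encoder` module name are placeholders for whatever the experiment actually uses:

```python
import torch

# Placeholders: build_model() stands in for however the experiment script
# reconstructs the architecture, and the checkpoint path is illustrative.
model = build_model()
model.load_state_dict(
    torch.load("runs/my_experiment/model_state.th", map_location="cpu")
)
model.eval()

captured = {}


def grab(module, inputs, output):
    # Stash the output for offline inspection; handle tuple-returning modules.
    tensor = output[0] if isinstance(output, (tuple, list)) else output
    captured["encoder_output"] = tensor.detach().cpu()


# Pick whichever submodule is of interest via model.named_modules().
handle = dict(model.named_modules())["sent_encoder"].register_forward_hook(grab)
# ... run the model on a few batches as usual ...
handle.remove()
torch.save(captured, "encoder_output_dump.pt")
```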