AlignmentResearch / tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer
https://tuned-lens.readthedocs.io/en/latest/
MIT License
438 stars, 47 forks

Make training the tuned lens on multiple GPUs more streamlined #8

Closed: levmckinney closed this issue 1 year ago

levmckinney commented 1 year ago

Currently, multi-GPU training requires launching with torchrun. It would be nice if it didn't.

A good place to start would be PyTorch's distributed data parallel tutorial series.
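For illustration, one way to avoid torchrun is to spawn the worker processes from Python with `torch.multiprocessing.spawn` and set the rendezvous variables by hand. The following is only a minimal sketch under that assumption: `find_free_port`, `worker`, and `launch` are hypothetical names, the `Linear` layer is a stand-in for the real model, and none of this reflects how tuned-lens itself is implemented.

```python
import os
import socket

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP


def find_free_port() -> int:
    """Ask the OS for an unused TCP port to use for the rendezvous."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def worker(rank: int, world_size: int, port: int) -> float:
    """Hypothetical per-process entry point (not part of tuned-lens)."""
    # torchrun normally exports these variables; here we set them ourselves.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    use_cuda = torch.cuda.is_available()
    dist.init_process_group(
        backend="nccl" if use_cuda else "gloo",
        rank=rank,
        world_size=world_size,
    )
    try:
        device = torch.device(f"cuda:{rank}" if use_cuda else "cpu")
        if use_cuda:
            torch.cuda.set_device(device)
        # Stand-in model; a real run would build the lens here instead.
        model = torch.nn.Linear(4, 4).to(device)
        ddp_model = DDP(model, device_ids=[rank] if use_cuda else None)
        opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
        loss = ddp_model(torch.randn(8, 4, device=device)).pow(2).mean()
        loss.backward()  # DDP averages gradients across processes here
        opt.step()
        return loss.item()
    finally:
        dist.destroy_process_group()


def launch(world_size: int) -> None:
    """Start one process per device without needing torchrun."""
    port = find_free_port()
    # spawn passes the process index (the rank) as the first argument.
    mp.spawn(worker, args=(world_size, port), nprocs=world_size, join=True)
```

A script would then call something like `launch(torch.cuda.device_count())` under an `if __name__ == "__main__":` guard, so that it can be started with plain `python` rather than `torchrun`.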

levmckinney commented 1 year ago

This feels pretty low priority; I think we will focus on #66 for now.