AlignmentResearch / tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer
https://tuned-lens.readthedocs.io/en/latest/
MIT License
432 stars, 47 forks

Docker bug fix #15

levmckinney closed 1 year ago

levmckinney commented 1 year ago

What this pull request addresses

GPU training with the current Dockerfile does not work because the PyTorch and CUDA versions are incompatible.

What I did

To use torch=1.13.1 you need CUDA 11.6. The only Docker image NVIDIA provides for this is based on Ubuntu 20.04, which ships with python3.8. To satisfy all the requirements, I had to use a separate PPA to install python3.9 and pip. Note that within the Dockerfile you specifically need to run the code with the python3.9 command, not python3 or python. In addition, pip should always be invoked as a module: python3.9 -m pip install <something>.
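The approach described above could look roughly like the following Dockerfile. This is only a sketch under stated assumptions, not the file in this PR: the base image tag, the deadsnakes PPA, the get-pip.py bootstrap, and the cu116 wheel index are all assumptions, since the description names none of them.

```dockerfile
# Assumed base image tag (CUDA 11.6 on Ubuntu 20.04); verify it is still published.
FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04

# Avoid interactive prompts (e.g. tzdata) during apt installs.
ENV DEBIAN_FRONTEND=noninteractive

# Ubuntu 20.04 ships python3.8, so python3.9 has to come from a PPA.
# deadsnakes is an assumption; the PR text only says "a separate ppa".
RUN apt-get update && \
    apt-get install -y software-properties-common curl ca-certificates && \
    add-apt-repository -y ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y python3.9 python3.9-distutils && \
    rm -rf /var/lib/apt/lists/*

# Bootstrap pip for python3.9 specifically (one common approach, assumed here;
# the get-pip.py URL may need a version-specific variant for older interpreters).
RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.9

# Always invoke pip as a module of python3.9, never as a bare `pip`.
# The cu116 wheel index is an assumption about how torch gets installed.
RUN python3.9 -m pip install \
    --extra-index-url https://download.pytorch.org/whl/cu116 \
    torch==1.13.1+cu116
```

Inside a container built this way, `python3` would still resolve to the system python3.8, which is why the description insists on calling `python3.9` explicitly.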

Another solution to this problem is to upgrade to support pytorch 2.0 #14.

AdamGleave commented 1 year ago

There's probably some way to use update-alternatives to make python3.9 the default interpreter, although using python3.9 explicitly is OK too.
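A minimal sketch of that suggestion, assuming python3.9 is installed at /usr/bin/python3.9. This line is hypothetical and not part of the reviewed Dockerfile.

```dockerfile
# Hypothetical addition: register python3.9 as the provider of the
# unversioned `python` command with priority 1.
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.9 1
```

This sketch only touches `python` and leaves `python3` alone. Repointing `python3` away from the distribution's python3.8 can break system tools that depend on it, so it is probably safer to avoid that.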

The Dockerfile LGTM but I have limited context on this project so will defer to @norabelrose on final review.