Closed: Sheshuk closed this pull request 2 years ago
Hm, caching doesn't seem to work. Also, it looks like most of the time is spent downloading our models.
Thanks for looking into this!
Unfortunately, comparing the run time of the unit tests (tests.yml) for the first commit in this PR (before; only “Installing dependencies”) with the two most recent ones (after (1) and after (2); “Caching requirements” + “Installing dependencies”), I don’t see an improvement:
| Python version | before [s] | after (1) [s] | after (2) [s] |
|---|---|---|---|
| 3.7 | 22 | 2 + 20 | 2 + 19 |
| 3.8 | 20 | 2 + 20 | 1 + 22 |
| 3.9 | 26 | 2 + 24 | 2 + 30 |
While the logs show that many (but not all?) dependencies are cached in the “after” cases and no longer need to be downloaded (as they did “before”), the overall run time has actually increased. Even if this is just random variation, it suggests that most of the time goes into *installing* the dependencies, not downloading them. Is it possible to cache the installed wheels? 🤔
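For what it's worth, one pattern I've seen for this (a sketch only — the `path`, `key`, and file names are assumptions, not taken from our `tests.yml`) is to cache the whole Python environment rather than pip's download cache, so the *installed* packages are restored. `actions/setup-python` exports `env.pythonLocation`, which can serve as both the cache path and part of the key:

```yaml
# Hypothetical step: cache the installed site-packages, not just downloaded
# archives. `env.pythonLocation` is set by actions/setup-python and changes
# when the Python version changes, so it belongs in the key.
- name: Cache installed dependencies
  id: cache-deps
  uses: actions/cache@v3
  with:
    path: ${{ env.pythonLocation }}
    key: ${{ runner.os }}-py-${{ env.pythonLocation }}-${{ hashFiles('setup.py', 'requirements.txt') }}
```

With this, a cache hit should make `pip install` a near no-op, since the wheels are already unpacked into the environment.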
As far as I understood, it should be possible to cache any directory.
But it didn't work for me: I tried to get GLoBES & SNOwGLoBES to cache and restore from the cached directory, but it keeps creating a new cache every time. Skipping the “Install dependencies” step in case of a cache hit also didn't work, for some reason.
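One thing worth double-checking (just a guess, since I haven't seen the exact workflow): `cache-hit` is only `'true'` on an *exact* key match, and it's a string, so the condition has to compare against the string `'true'`. If the key contains something that changes every run, you'd also see a new cache created each time. A minimal sketch of the intended shape (step ids and commands are illustrative):

```yaml
# Hypothetical snippet: skip the install step when the cache was restored.
# The cache step needs an `id` so the install step can reference its output.
- name: Cache requirements
  id: cache-reqs
  uses: actions/cache@v3
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}

- name: Install dependencies
  # String comparison: cache-hit is 'true' only on an exact key match.
  if: steps.cache-reqs.outputs.cache-hit != 'true'
  run: pip install -r requirements.txt
```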
Overall, I think we should try the Docker container variant. That way we can set up all the needed dependencies in the image, and optionally have all the models preinstalled there as well. I'm not sure which would be faster, though: downloading the models from GitHub, or downloading an entire container with the models baked in.
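If we go that route, GitHub Actions can run a job directly inside a prebuilt image via the `container:` key — roughly like this (the image name and registry are placeholders, not something we've published):

```yaml
# Hypothetical job: run tests inside a prebuilt image that already has
# GLoBES, SNOwGLoBES, and (optionally) the models installed.
jobs:
  test:
    runs-on: ubuntu-latest
    container: ghcr.io/example-org/example-ci-image:latest  # placeholder image
    steps:
      - uses: actions/checkout@v3
      - run: pip install .
      - run: pytest
```

The trade-off would then be image pull time vs. the current download + install time, which we'd have to measure.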
Closes #137