technillogue closed 1 year ago
most recent versions:
https://replicate.com/replicate-internal/staging-llama-2-7b/versions/24563f3501393c944fc85bb52414cdc15be0035eb36d962badfae9ef2d14d874
https://replicate.com/replicate-internal/staging-llama-2-70b/versions/92b5437c84d63fb029e071034726859de691c26c099ad4d1a0b4c95353dd904f
needs more testing with recently-completed trainings that might not be synced to coreweave yet, and maybe against whatever was going on with the storagebouncer timeouts, but otherwise should be good to go in
also needs to be timed for 70b downloads
pget performs poorly on small files. this instead tries to reuse connections across different downloads and avoid touching disk: exllama is changed to accept io.BytesIO instead of paths, and archives are unzipped in memory.
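a rough sketch of the idea, not the code in this PR: it uses only the standard library, and the names `fetch_many` and `unzip_in_memory` are hypothetical. it assumes plain HTTPS GETs with no auth or redirects, keeps one connection per host so several small downloads share it, and hands back `io.BytesIO` objects so nothing is written to disk.

```python
import io
import zipfile
from http.client import HTTPSConnection
from urllib.parse import urlsplit


def fetch_many(urls: list[str], timeout: float = 30.0) -> dict[str, io.BytesIO]:
    """Download several files into memory, reusing one connection per host.

    Hypothetical helper: no retries, no redirects, no auth.
    """
    conns: dict[str, HTTPSConnection] = {}
    results: dict[str, io.BytesIO] = {}
    try:
        for url in urls:
            parts = urlsplit(url)
            conn = conns.get(parts.netloc)
            if conn is None:
                # First request to this host: open a connection and keep it.
                conn = conns[parts.netloc] = HTTPSConnection(parts.netloc, timeout=timeout)
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read fully before the connection can be reused.
            body = resp.read()
            if resp.status != 200:
                raise RuntimeError(f"GET {url} failed with status {resp.status}")
            results[url] = io.BytesIO(body)
    finally:
        for conn in conns.values():
            conn.close()
    return results


def unzip_in_memory(archive: io.BytesIO) -> dict[str, io.BytesIO]:
    """Unpack a zip archive held in memory into per-file BytesIO buffers."""
    with zipfile.ZipFile(archive) as zf:
        return {
            name: io.BytesIO(zf.read(name))
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        }
```

a loader that accepts file-like objects can then read straight from the returned buffers instead of from paths. the tradeoff is that the whole archive and its unpacked contents sit in memory at once, which is fine for small files but worth measuring for large weights.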
the retry logic and error reporting are not quite perfect yet