dvmazur / mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops
MIT License

Session crashed on Colab #7

Closed · bitsnaps closed this issue 6 months ago

bitsnaps commented 6 months ago

Hi,

Have you guys managed to make it work on a T4 Colab?

P.S. It crashes multiple times, even with offload_per_layer = 5 as suggested in the comments.

dvmazur commented 6 months ago

Hey! Just tried running the notebook (in the offload_per_layer = 5 setting) and everything works for me. Have you tinkered with the original notebook in any way? If not, try restarting the session and re-running it, starting from the model initialization cell.

lavawolfiee commented 6 months ago

Hey!

Have you managed to solve this issue? If not, could you please provide some more information:

  1. On which cell exactly did the notebook crash with offload_per_layer = 4?
  2. How much RAM and GPU VRAM did you have in Google Colab?

Note that our demo notebook should run normally in Google Colab with offload_per_layer = 4, but will sometimes crash with offload_per_layer = 5. The latter option is intended for local runs on machines with low VRAM.
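
If you're not sure how much memory your runtime actually has, a quick check like the sketch below will tell you (psutil and torch both come preinstalled in Colab):

```python
# Quick sanity check of the resources available to a Colab runtime.
import psutil
import torch

# Total system RAM visible to the runtime.
print(f"System RAM: {psutil.virtual_memory().total / 1e9:.1f} GB")

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1e9:.1f} GB")
else:
    print("No GPU attached to this runtime.")
```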

RDvibe commented 6 months ago

hqq_aten package not installed. HQQBackend.ATEN backend will not work unless you install the hqq_aten lib in hqq/kernels.

dvmazur commented 6 months ago

> hqq_aten package not installed. HQQBackend.ATEN backend will not work unless you install the hqq_aten lib in hqq/kernels.

hqq_aten is not required, as we have custom Triton kernels for GEMV, so the warning is harmless.
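
For illustration, the general shape of a Triton GEMV kernel looks roughly like this. This is a simplified dense sketch, not the actual kernels in this repo (those operate on HQQ-quantized weights):

```python
# Minimal dense GEMV (y = A @ x) in Triton, for illustration only.
import torch
import triton
import triton.language as tl


@triton.jit
def gemv_kernel(a_ptr, x_ptr, y_ptr,
                M, N,
                stride_am, stride_an,
                BLOCK_N: tl.constexpr):
    # One program instance computes one element of y.
    row = tl.program_id(axis=0)
    offs_n = tl.arange(0, BLOCK_N)
    acc = tl.zeros((BLOCK_N,), dtype=tl.float32)
    # Walk across one row of A in chunks of BLOCK_N columns.
    for n_start in range(0, N, BLOCK_N):
        cols = n_start + offs_n
        mask = cols < N
        a = tl.load(a_ptr + row * stride_am + cols * stride_an,
                    mask=mask, other=0.0)
        x = tl.load(x_ptr + cols, mask=mask, other=0.0)
        acc += a * x
    # Reduce the partial products to a single scalar and write it out.
    tl.store(y_ptr + row, tl.sum(acc, axis=0))


def gemv(a: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """y = a @ x for a 2-D matrix `a` and a 1-D vector `x`."""
    M, N = a.shape
    y = torch.empty(M, device=a.device, dtype=torch.float32)
    gemv_kernel[(M,)](a, x, y, M, N,
                      a.stride(0), a.stride(1), BLOCK_N=128)
    return y


# Quick check against PyTorch on a CUDA device.
a = torch.randn(64, 300, device="cuda")
x = torch.randn(300, device="cuda")
print(torch.allclose(gemv(a, x), a @ x, atol=1e-4))
```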