(Colab) Clear GPU RAM usage after running the generation code without restarting instance

I found that whenever I want to try changes to the source code, I have to rerun the "Initialize model" cell for the changes to take effect. However, once I have run the "Initialize model" cell followed by the "Run the model" cell, the GPU memory stays occupied even after I stop the execution of "Run the model". That makes it impossible to rebuild the model, since doing so needs roughly the same amount of memory again. Restarting the instance costs considerable time because the environment has to be set up again. Is there a way to release the GPU memory without restarting? Thanks!
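
For reference, a minimal sketch of the usual PyTorch cleanup sequence that is often tried in this situation: drop every Python reference to the model, run the garbage collector, then ask PyTorch to return its cached CUDA blocks. The helper name `free_gpu_memory` and the variable name `model` are assumptions for illustration, not names taken from the notebook, and this may not be sufficient if something else (for example an offloading buffer) still holds the tensors.

```python
import gc
import sys


def free_gpu_memory(namespace, names=("model",)):
    """Drop references to the named objects, then release cached CUDA memory.

    `namespace` is a dict of variables, e.g. globals() in a notebook cell.
    `names` are hypothetical variable names; adjust them to your notebook.
    Returns the bytes still allocated by tensors, or None without CUDA.
    """
    # Remove the notebook-level references to the model and related objects.
    for name in names:
        namespace.pop(name, None)

    # An interrupted cell can leave a stored traceback whose frames still
    # reference tensors; clear it so those frames can be collected.
    for attr in ("last_traceback", "last_value", "last_type", "last_exc"):
        if hasattr(sys, attr):
            setattr(sys, attr, None)

    # Collect reference cycles so the tensors are actually freed.
    gc.collect()

    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # Return cached, currently unused blocks to the driver.
    torch.cuda.empty_cache()
    return torch.cuda.memory_allocated()
```

In a notebook cell this would be called as `free_gpu_memory(globals(), names=("model",))` before rerunning "Initialize model". If the returned value stays high, some live reference to the tensors most likely still exists, since `torch.cuda.empty_cache()` only releases memory that no tensor is using.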

dvmazur / mixtral-offloading

(Colab) Clear GPU RAM usage after running the generation code without restarting instance #28