googlecolab / colabtools

Python libraries for Google Colaboratory
Apache License 2.0

Files Disappear When Still Running #1045

Open chuktuk opened 4 years ago

chuktuk commented 4 years ago

Bug report for Colab: http://colab.research.google.com/.

For questions about colab usage, please use stackoverflow.

When training completes, a file containing the best model is written out. I saw this file get created during training, but after a few hours the file browser shows only 'Connecting to a runtime to enable file browsing.' It will be very annoying, and a waste of money, if I can't retrieve the best model file after all this time.

chuktuk commented 4 years ago

When this happens, the button that usually has the RAM and CPU usage bars just says 'Busy'.

tnovikoff commented 4 years ago

It sounds like you may be running into the resource limits of Colab Pro (maximum VM lifetimes or idle timeouts).

Information about resource limits in Colab Pro can be found in the Colab Pro FAQ at http://colab.research.google.com/signup

Information about resource limits in Colab in general can be found in the main Colab FAQ at https://research.google.com/colaboratory/faq.html

Information for getting the most out of Colab Pro can be found at https://colab.research.google.com/notebooks/pro.ipynb

One specific approach that might work in your situation is saving your model file out to Google Drive while the connection is active.
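A minimal sketch of that approach, assuming the notebook runs on Colab and the model file already exists on the runtime's local disk. `google.colab.drive.mount` is the Colab call for mounting Drive; the helper names, the `/content/drive/MyDrive` location, and the example paths are assumptions for illustration, not something stated in this thread.

```python
import shutil
from pathlib import Path


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive. This only works inside a Colab runtime."""
    from google.colab import drive  # importable only on Colab

    drive.mount(mount_point)
    # Assumption: personal Drive files appear under "MyDrive".
    return Path(mount_point) / "MyDrive"


def backup_file(src, dest_dir):
    """Copy src into dest_dir (created if missing) and return the new path."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Copy under a temporary name first, then rename, so an interrupted
    # copy never leaves a half-written file under the final name.
    tmp = dest_dir / (src.name + ".tmp")
    shutil.copy2(src, tmp)
    final = dest_dir / src.name
    tmp.replace(final)
    return final


# Hypothetical usage in a notebook cell (paths are placeholders):
# drive_root = mount_drive()
# backup_file("output/best-model.pt", drive_root / "colab_backups")
```

Files copied this way live in Drive rather than on the VM, so they should remain available after the runtime is recycled.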

More generally, this might be a good question for StackOverflow, where other users may be able to provide other ideas for what to do in this situation.

Best of luck!

chuktuk commented 4 years ago

I couldn't find anything on Stack Overflow about this type of issue. I found this Medium post about saving to Google Drive, https://medium.com/@ml_kid/how-to-save-our-model-to-google-drive-and-reuse-it-2c1028058cb2, but it doesn't explain how to save after every epoch. I'm using Flair on top of PyTorch, and it creates the file automatically. I also don't see how I could resume partway through if the run crashed after an epoch. Unless I find a solution, it looks like the paid version doesn't have enough resources for my needs.
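One possible workaround when the training library writes its files on its own schedule is to copy the output directory to a mounted Drive folder periodically from a background thread, instead of hooking into each epoch. The sketch below uses only the Python standard library. The function names, the directory paths, and the five-minute interval are assumptions for illustration. It does not cover resuming training, which depends on whatever checkpoint support the training library itself provides.

```python
import shutil
import threading
from pathlib import Path


def sync_once(src_dir, dest_dir):
    """Copy files that are new or changed since the last sync; return their names."""
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    copied = []
    if not src_dir.is_dir():
        return copied  # training has not created the directory yet
    for src in src_dir.rglob("*"):
        if not src.is_file():
            continue
        dest = dest_dir / src.relative_to(src_dir)
        # copy2 preserves modification times, so an unchanged file compares equal.
        if dest.exists() and dest.stat().st_mtime >= src.stat().st_mtime:
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(str(src.relative_to(src_dir)))
    return copied


def start_background_sync(src_dir, dest_dir, interval_seconds=300):
    """Run sync_once every interval_seconds in a daemon thread; return a stop Event."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            try:
                sync_once(src_dir, dest_dir)
            except OSError as exc:
                # A file may be mid-write; try again on the next pass.
                print(f"sync skipped: {exc}")

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Hypothetical usage, started in a cell before training begins:
# stop = start_background_sync("output", "/content/drive/MyDrive/colab_backups")
# ... run training ...
# stop.set()
```

A file that is being written at the moment it is copied may be incomplete in the backup; the next pass replaces it once the source has a newer modification time.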

tnovikoff commented 4 years ago

You could try asking your question on StackOverflow. :-) Best of luck either way.

dexter2406 commented 3 years ago

When this happens, the button that usually has the RAM and CPU usage bars just says 'Busy'.

This happens to me too. Have you found a solution?

chuktuk commented 3 years ago

I ended up having to start training very early and click into various cells from time to time until it finished. I then immediately saved the model files so I could generate my predictions. I tried a JavaScript function to click cells, but that didn't seem to work. The only solution I found was to keep the window active.

Maybe AWS is a better option, since you pay for the time you use rather than for a subscription, as with the paid Colab version I used.

Best of luck.

stmer1 commented 3 years ago

I had this problem too, but I found that when I hit Runtime > Interrupt execution, everything magically reappeared. I did not try restarting, since I decided the model had probably trained enough and I wanted to make sure I grabbed what it had produced so far. Not the ideal situation, but at least I did not lose everything.

Good luck.