lfwa / carbontracker

Track and predict the energy consumption and carbon footprint of training deep learning models.
MIT License

Overly restrictive exits for devices that are not supported #44

Closed LukasHedegaard closed 10 months ago

LukasHedegaard commented 2 years ago

Currently, the lack of device support (e.g. when attempting to use carbontracker on a non-GPU laptop) results in calls to sys.exit from a separate thread.

While carbontracker itself is a nice addition to a generic training loop, the above behaviour breaks the training loop on unsupported devices. Try/except blocks do not help here, because the exit is raised in a different thread from the one running the loop.
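To illustrate why try/except does not help, here is a minimal sketch using only the standard library. The monitor function is a hypothetical stand-in for a background thread that finds no supported device; it is not carbontracker's actual code. It shows only that an exception raised in a worker thread never reaches an except block in the main thread.

```python
import threading


def monitor():
    # Hypothetical stand-in for a background thread that finds no
    # supported device and bails out (not carbontracker's real code).
    raise SystemExit("no supported devices")


caught_in_main = False
try:
    worker = threading.Thread(target=monitor)
    worker.start()
    worker.join()
except SystemExit:
    # Never reached: the exception is raised and handled inside the
    # worker thread, not in the thread that started it.
    caught_in_main = True

print(caught_in_main)  # prints False
```

The same holds for any exception type, so wrapping tracker calls in try/except in the training loop cannot intercept a failure that happens in a monitoring thread.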

It would greatly improve the tool's usability (and impact) if a different error-handling strategy could be found that doesn't break training loops.

lfwa commented 2 years ago

Hi Lukas,

Thanks for contributing to carbontracker.

Have you tried setting the argument ignore_errors=True when instantiating the CarbonTracker class? It is intended to ignore these kinds of errors and let your training loop continue even when something fails.
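A minimal sketch of that suggestion might look like the following. The guarded import, the max_epochs value, and the placeholder training step are assumptions added for illustration, so that the snippet also runs on machines where carbontracker is not installed.

```python
try:
    from carbontracker.tracker import CarbonTracker
except ImportError:
    # Assumption for this sketch: fall back to no tracking when the
    # package is not installed, so the loop below still runs.
    CarbonTracker = None

max_epochs = 3  # illustrative value

# ignore_errors=True asks carbontracker to log failures instead of exiting.
tracker = (
    CarbonTracker(epochs=max_epochs, ignore_errors=True)
    if CarbonTracker is not None
    else None
)

completed_epochs = 0
for epoch in range(max_epochs):
    if tracker is not None:
        tracker.epoch_start()

    completed_epochs += 1  # placeholder for the real training step

    if tracker is not None:
        tracker.epoch_end()

if tracker is not None:
    # Optional: stop tracking explicitly once training is done.
    tracker.stop()
```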

PedramBakh commented 10 months ago

To address the concern you've raised: CarbonTracker accepts an ignore_errors argument, and setting ignore_errors=True during instantiation should bypass such errors and allow the training loop to continue uninterrupted. I understand that the current behavior may not be ideal, and we are continually working to make the tool more robust and adaptable.

Given the solution provided, I'll be closing this issue for now. However, if you still encounter challenges or have further insights to share, please don't hesitate to reopen this issue or start a new one. We appreciate your engagement and contribution to the project.