catalyst-team / catalyst

Accelerated deep learning R&D
https://catalyst-team.com
Apache License 2.0
3.3k stars, 388 forks

Crashes on 2xT4 GPUs #1433

Closed Philmod closed 1 year ago

Philmod commented 2 years ago

🐛 Bug Report

Catalyst fails on 2xT4 GPUs.

We install Catalyst in the Kaggle base image. This week we wanted to release a new image with upgraded packages. It doesn't look like Catalyst was upgraded, but Accelerate was (from 0.12 to 0.13.1).

How To Reproduce

Steps to reproduce the behavior: run the unit test linked under "Code sample" below on a machine with 2xT4 GPUs.

Code sample

https://github.com/Kaggle/docker-python/blob/main/tests/test_catalyst.py

Screenshots

[Screenshot: "Screen Shot 2022-10-18 at 9 55 46 AM" (image not preserved)]

Expected behavior

The test passes on a P100 GPU.

Environment

https://gist.github.com/Philmod/0349a2cf16d76e8d20e960d750962241

github-actions[bot] commented 2 years ago

Hi! Thank you for your contribution! Please re-check all issue template checklists - unfilled issues would be closed automatically. And do not forget to join our slack for collaboration.

Philmod commented 2 years ago

Update: pinning accelerate package to 0.12.0 fixes the problem.
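
For anyone hitting the same crash, the workaround above amounts to installing `accelerate==0.12.0` instead of 0.13.1. Below is a minimal sketch of a guard that could run before the test to flag an unpinned environment. The helper names (`major_minor`, `is_known_good`, `installed_accelerate_version`) are hypothetical and not part of Catalyst or Accelerate. Treating every 0.12.x release as good is an assumption, since only 0.12.0 (good) and 0.13.1 (bad) were reported here.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption: the whole 0.12.x series behaves like 0.12.0, the version
# reported as working in this thread. 0.13.1 was reported as crashing.
KNOWN_GOOD_SERIES = (0, 12)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract the leading (major, minor) numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_known_good(version: str) -> bool:
    """True if the version belongs to the series reported as working."""
    return major_minor(version) == KNOWN_GOOD_SERIES


def installed_accelerate_version() -> Optional[str]:
    """Return the installed accelerate version, or None if it is absent."""
    try:
        return metadata.version("accelerate")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_accelerate_version()
    if found is None:
        print("accelerate is not installed")
    elif is_known_good(found):
        print(f"accelerate {found} is in the series reported as working")
    else:
        print(f"accelerate {found} is outside 0.12.x; consider pinning 0.12.0")
```

This only reports the mismatch. It does not explain why 0.13.1 breaks the 2xT4 run, which this thread never established.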

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.