🐛 Bug
version 1.3.1
1) Similar issue to #531 when running the following code:
I got:
2) When running with the DeepSpeed strategy, I get:
Invalidate trace cache @ step 327: expected module 365, but got module 365
This also seems to slow down DeepSpeed evaluation. I tried both a single GPU and multiple GPUs with the config below, and the same warning appears in both cases.
The DeepSpeed config is:
Trainer created via:
trainer.model is the model containing the metrics above.
Expected behavior
1) The metric should be on cuda:0.
2) No warning like the following should appear:
Invalidate trace cache @ step 327: expected module 365, but got module 365
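To help triage how often the warning fires, and to confirm that the expected and reported module ids really are identical, a captured log could be scanned with a small helper. This is only a minimal sketch: the names parse_trace_cache_warnings and TraceCacheWarning are hypothetical (not part of DeepSpeed or Lightning), and it assumes each warning appears on one line in exactly the format quoted above.

```python
import re
from typing import List, NamedTuple


class TraceCacheWarning(NamedTuple):
    """One parsed warning (hypothetical helper type, not a DeepSpeed API)."""
    step: int
    expected: int
    got: int


# Assumes the warning text matches the message quoted in this report.
_PATTERN = re.compile(
    r"Invalidate trace cache @ step (\d+): "
    r"expected module (\d+), but got module (\d+)"
)


def parse_trace_cache_warnings(log_text: str) -> List[TraceCacheWarning]:
    """Return every trace-cache warning found in log_text, in order."""
    return [
        TraceCacheWarning(int(step), int(expected), int(got))
        for step, expected, got in _PATTERN.findall(log_text)
    ]


sample = "Invalidate trace cache @ step 327: expected module 365, but got module 365"
for w in parse_trace_cache_warnings(sample):
    # The last column shows whether the two module ids are the same.
    print(w.step, w.expected, w.got, w.expected == w.got)
```

For the message in this report, the two ids compare equal, which is what makes the warning confusing.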
Environment
(conda, pip, build from source): 1.3.1
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"