Open anferico opened 5 months ago
Any help with this? @sanchit-gandhi @muellerzr
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Up @sanchit-gandhi @muellerzr
Pinging @ylacombe and @eustlb as Sanchit is away for a few months
Hey @eustlb, I've never used DeepSpeed myself, would you like to take a stab at it? If not, I'll try to reproduce the issue on my side
Gentle ping @eustlb @ylacombe!
System Info
transformers version: 4.42.3
Who can help?
@sanchit-gandhi @muellerzr
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
hubert_mre.py:
hubert_mre.sh:
zero3.json:
Run hubert_mre.sh and watch the script hang indefinitely.
The curious thing is that this seems to happen only with HuBERT models. If, for example, you replace HubertModel.from_pretrained("facebook/hubert-large-ls960-ft") with Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0"), the script runs just fine.
Also, this works fine if you pass --num_gpus 1.
Expected behavior
The script runs to completion without hanging indefinitely.
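To narrow down where a multi-GPU run like this gets stuck, it may help to have every process periodically dump its Python stack traces. The following is a minimal sketch using only the standard-library faulthandler module; the helper names enable_hang_dump and disable_hang_dump and the 300-second default are hypothetical choices for illustration, not part of the original scripts, and this only shows where each rank is blocked, not why.

```python
import faulthandler
import sys


def enable_hang_dump(timeout_s: float = 300.0, file=sys.stderr) -> None:
    """Dump the stack traces of all threads every `timeout_s` seconds.

    Hypothetical helper: call it once near the top of the training script
    so that each launched process reports where it is blocked.
    `file` must stay open and must expose a real file descriptor.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)


def disable_hang_dump() -> None:
    """Cancel the periodic dump, e.g. once the suspect section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling enable_hang_dump() before the from_pretrained call and comparing the traces printed by each rank should show whether the processes are waiting at the same point or at different ones.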