Closed molamooo closed 1 year ago
Hi @Jeffery-Song , thank you for your feedback! Our team is in holiday vacation and will reply to you right after it (8th Oct).
Hi @Jeffery-Song, thanks for reporting this bug. It can be reproduced with merlin-tensorflow:22.09
and I am investigating whether it is a container issue or related to recent changes to HPS TF plugin. We will try to fix it in the next release and get back to you then.
Hi @Jeffery-Song , this bug is related to the replica context when using MirroredStrategy
and it is fixed in nvcr.io/nvidia/merlin/merlin-tensorflow:22.11
.
Describe the bug HPS example
hps_pretrained_model_training_demo.ipynb
crashes.To Reproduce Steps to reproduce the behavior:
Expected behavior The pre-trained model can be loaded with HPS & trained.
Screenshots
Environment (please complete the following information):
Additional context This is the only multi-gpu example of hps tensorflow plugin. Is there any detailed guidence of deploying hps tensorflow plugin under multi-gpu environment?