kubeflow / pytorch-operator

PyTorch on Kubernetes
Apache License 2.0

Does PyTorch Operator treat a Kubernetes cluster as a single machine? #267

Closed aheeruru closed 4 years ago

aheeruru commented 4 years ago

Hi.

I tested DataParallel (https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html), which is meant for a single node with multiple GPUs, on PyTorch Operator. I have 3 nodes in my k8s cluster. I expected my master and worker pods to be deployed on one node of the cluster, but they were deployed across every node.

So I wonder whether PyTorch Operator treats a Kubernetes cluster as a single machine.

Thanks :)
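
For context on the placement question above: each replica in a PyTorchJob runs as its own pod, and without placement constraints the Kubernetes scheduler may put those pods on any node, which matches the spread described here. The following is a minimal sketch, not a confirmed recommendation from this thread, of how a job manifest could pin both master and worker pods to one node with a `nodeSelector`. The node name, image, and job name are hypothetical, and the `kubeflow.org/v1` API version is an assumption about the installed operator. It also assumes the node's `kubernetes.io/hostname` label matches the chosen name.

```python
import json

# Hypothetical values for illustration; none of these come from the thread.
NODE_NAME = "node-1"
IMAGE = "example.com/my-pytorch-image:latest"


def replica_spec(replicas):
    """Build one replica spec whose pods may only run on NODE_NAME."""
    return {
        "replicas": replicas,
        "restartPolicy": "OnFailure",
        "template": {
            "spec": {
                # Well-known Kubernetes node label; limits scheduling to one node.
                "nodeSelector": {"kubernetes.io/hostname": NODE_NAME},
                "containers": [{"name": "pytorch", "image": IMAGE}],
            }
        },
    }


# Assumed PyTorchJob layout (apiVersion kubeflow.org/v1).
job = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "PyTorchJob",
    "metadata": {"name": "single-node-example"},
    "spec": {
        "pytorchReplicaSpecs": {
            "Master": replica_spec(1),
            "Worker": replica_spec(2),
        }
    },
}

# kubectl accepts JSON manifests as well as YAML.
manifest = json.dumps(job, indent=2)
print(manifest)
```

Even with every pod on one node, each pod is still a separate process with its own containers, so `DataParallel` inside a pod only sees the GPUs assigned to that pod.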

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
bug 0.53


aheeruru commented 4 years ago

Sorry, I misunderstood. Closing.