kubeflow / pytorch-operator

PyTorch on Kubernetes
Apache License 2.0

How to run a single-machine job? #278

Closed jiaqianjing closed 4 years ago

jiaqianjing commented 4 years ago

I found that the number of Master replicas can only be set to 1. Does that mean pytorch-operator can only run distributed jobs?

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/question 0.77

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
area/front-end 0.55

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

gaocegege commented 4 years ago

No, you can have 1 Master to run local training jobs.

/cc @johnugeorge
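
For illustration, here is a minimal sketch of what a single-Master job could look like. It builds a PyTorchJob manifest in Python with only a `Master` replica and no `Worker` section, then prints it as JSON. The job name, image, and command are hypothetical placeholders, and the `kubeflow.org/v1` API version is an assumption that may need adjusting to match the operator version installed in your cluster.

```python
import json


def single_master_pytorchjob(name, image, command):
    """Build a PyTorchJob manifest with one Master replica and no Workers."""
    return {
        "apiVersion": "kubeflow.org/v1",  # assumption: adjust to your installed CRD version
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            "pytorchReplicaSpecs": {
                # Only a Master is declared, so the job runs in a single pod.
                "Master": {
                    "replicas": 1,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                {
                                    "name": "pytorch",
                                    "image": image,
                                    "command": command,
                                }
                            ]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = single_master_pytorchjob(
        name="single-machine-example",       # hypothetical job name
        image="example.com/my-train:latest",  # hypothetical training image
        command=["python", "train.py"],      # hypothetical entrypoint
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.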

jiaqianjing commented 4 years ago

> No, you can have 1 Master to run local training jobs.
>
> /cc @johnugeorge

OK, I've got it.

gaocegege commented 4 years ago

Feel free to post questions here if there is any problem when you run local training jobs with pytorch-operator.

jiaqianjing commented 4 years ago

> Feel free to post questions here if there is any problem when you run local training jobs with pytorch-operator.

Thanks. By the way, could I have your WeChat?

gaocegege commented 4 years ago

Please tell me your email address and I will send it to you.

jiaqianjing commented 4 years ago

> Please tell me your email address and I will send it to you.

jiaqianjing@gmail.com or jiaqianjing1992@163.com

gaocegege commented 4 years ago

Sent to jiaqianjing@gmail.com.