Closed terrytangyuan closed 5 years ago
@terrytangyuan Like performing some init tasks before starting the training? This can be done in the master pod, right? For example, we can have the master pod do the conversion locally, or have it trigger a Spark job in a remote cluster to do the conversion. Worker pods will be launched after the initTask finishes.
Yes exactly
LGTM. Let's sync with others in the next standup.
Currently we need to convert ODPS data to RecordIO files before training starts. We need to generate the data and save it in shared storage that the worker pods can access. One solution would be to perform the data conversion in one of the pods and start the training tasks once it has finished.
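A rough sketch of the ordering I have in mind, in plain Python. Everything here is hypothetical and only for illustration: `run_job`, `convert_data`, and `launch_workers` are not existing functions in the codebase, and the real conversion and pod launching are stubbed out. The only point is that init tasks run to completion before any worker is started.

```python
from typing import Callable, List, Sequence


def run_job(
    init_tasks: Sequence[Callable[[], None]],
    launch_workers: Callable[[], List[str]],
) -> List[str]:
    """Run every init task to completion, then launch the workers.

    If an init task raises, the exception propagates and no worker
    is launched.
    """
    for task in init_tasks:
        task()
    return launch_workers()


# Stand-ins for the real steps; they only record the order of events.
events: List[str] = []


def convert_data() -> None:
    # Hypothetical placeholder for the ODPS -> RecordIO conversion
    # that writes to shared storage.
    events.append("convert")


def launch_workers() -> List[str]:
    # Hypothetical placeholder for creating the worker pods.
    events.append("launch")
    return ["worker-0", "worker-1"]


workers = run_job([convert_data], launch_workers)
```

Whether the init task runs locally or kicks off a remote job, it would just need to block until the converted files are in place.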
cc: @ywskycn