kubeflow / pytorch-operator

PyTorch on Kubernetes
Apache License 2.0

Unstructured converted to Pytorch Job Anonymous field error when json uses inline mode #234

Closed leileiwan closed 4 years ago

leileiwan commented 4 years ago

1. Introduction

1.1 Environment

2. Problem analysis

2.1 The core process of this test case

2.2 Analyzing the code

2.2.1 Important Information

map[kind:PytorchJob metadata:map[creationTimestamp:<nil> name:test-pytorchjob namespace:default] spec:map[block:<nil> cleanPodPolicy:All maxGPU:4 minGPU:4 pytorchReplicaSpecs:map[Master:map[replicas:1 template:map[metadata:map[creationTimestamp:<nil>] spec:map[containers:[map[args:[Fake Fake] image:test-image-for-kubeflow-pytorch-operator:latest name:pytorch ports:[map[containerPort:23456 name:pytorchjob-port]] resources:map[]]]]]] Worker:map[replicas:4 template:map[metadata:map[creationTimestamp:<nil>] spec:map[containers:[map[args:[Fake Fake] image:test-image-for-kubeflow-pytorch-operator:latest name:pytorch ports:[map[containerPort:23456 name:pytorchjob-port]] resources:map[]]]]]]] topologyGPU:<nil>] status:map[conditions:[map[lastTransitionTime:2019-12-14T09:02:53Z lastUpdateTime:2019-12-14T09:02:53Z reason:PytorchJobSucceeded status:True type:Succeeded]] replicaStatuses:<nil>]]

{{PytorchJob } {test-pytorchjob  default    0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] nil [] } {<nil> <nil> 0xc0002bb0d0 <nil> 0xc000608cf8 0xc000608d08 <nil> <nil> map[Master:0xc000139600 Worker:0xc000139b80]} {{[] map[] <nil> <nil> <nil>}  0 0}}

2.2.2 Source Confirmation

3. Summary

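The root cause is a mismatch in how an anonymous (embedded) struct field is represented. When a struct is serialized with JSON, an anonymous field is inlined by default, so the output has no key for the embedded common.JobStatus struct and its fields appear directly at the parent level. Go reflection, however, still reports the anonymous field as a distinct field of the struct, carrying its own struct information. Because the two representations disagree, the conversion from Unstructured to PytorchJob cannot fill in the status correctly.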
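The mismatch can be reproduced with the standard library alone. The sketch below is a minimal illustration, not the operator's code: `JobStatus` and `PyTorchJobStatus` are simplified, hypothetical stand-ins for the real types, and only `encoding/json` and `reflect` are used.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// JobStatus is a hypothetical, simplified stand-in for common.JobStatus.
type JobStatus struct {
	Conditions []string `json:"conditions"`
}

// PyTorchJobStatus embeds JobStatus anonymously, with no json tag
// (an assumption made for this illustration).
type PyTorchJobStatus struct {
	JobStatus
}

// marshalStatus returns the JSON encoding of a status with one condition.
func marshalStatus() (string, error) {
	s := PyTorchJobStatus{JobStatus{Conditions: []string{"Succeeded"}}}
	b, err := json.Marshal(s)
	return string(b), err
}

// reflectedFields lists the fields that reflection reports on PyTorchJobStatus.
func reflectedFields() []string {
	t := reflect.TypeOf(PyTorchJobStatus{})
	var out []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		out = append(out, fmt.Sprintf("%s anonymous=%t", f.Name, f.Anonymous))
	}
	return out
}

func main() {
	js, err := marshalStatus()
	if err != nil {
		panic(err)
	}
	// JSON inlines the embedded struct: there is no "jobStatus" key.
	fmt.Println(js) // {"conditions":["Succeeded"]}
	// Reflection still sees one named, anonymous field.
	fmt.Println(reflectedFields()) // [JobStatus anonymous=true]
}
```

The JSON side has `conditions` at the top level of the status, while the reflection side exposes a single field named `JobStatus`. A converter that walks the struct by reflection and looks each field up by name would therefore find no matching key for the embedded struct unless it treats the field as inline.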

4. Proposed modification