Closed Shreyas-NR closed 2 years ago
Hi, I have a compiled model that has 1 DPU subgraph taking 1 input tensor and 3 output tensors.
inputTensor.name = A2J_model__A2J_model_ResNetBackBone_Backbone__input_1_swim_transpose_0_fix
inputTensor.dims = [1, 288, 288, 3]
inputTensor.dtype = xint8

outputTensor.name = A2J_model__A2J_model_ClassificationModel_classificationModel__Conv2d_output__11586_fix
outputTensor.dims = [1, 18, 18, 240]
outputTensor.dtype = xint8

outputTensor.name = A2J_model__A2J_model_DepthRegressionModel_DepthRegressionModel__Conv2d_output__11920_fix
outputTensor.dims = [1, 18, 18, 240]
outputTensor.dtype = xint8

outputTensor.name = A2J_model__A2J_model_RegressionModel_regressionModel__Conv2d_output__11749_fix
outputTensor.dims = [1, 18, 18, 480]
outputTensor.dtype = xint8
I referred to the examples below to write my application code:
https://github.com/Xilinx/Vitis-AI-Tutorials/blob/1.4/Design_Tutorials/11-tf2_var_autoenc/files/application/app_mt.py
https://support.xilinx.com/s/article/Multiple-output-model-example-design-from-training-to-application-on-ZCU102?language=en_US
When I feed a different input, the values in the 3 output tensors do not update.
This is my DPU runner code snippet:
global out_q1, out_q2, out_q3
out_q1 = [] * n_of_images
out_q2 = [] * n_of_images
out_q3 = [] * n_of_images
--------------------------------------------------------------------------------
def runThread(id, start, dpu_runner, img):
    '''
    Thread worker function
    '''
    # Set up encoder DPU runner buffers & I/O mapping dictionary
    global a2j_dict, inbuffer, outbuffer
    a2j_dict, inbuffer, outbuffer = init_dpu_runner(dpu_runner)

    # batchsize
    batchSize = a2j_dict['A2J_model__A2J_model_ResNetBackBone_Backbone__input_1_swim_transpose_0_fix'].shape[0]

    # set runSize
    n_of_images = len(img)
    count = 0
    write_index = start

    # loop over image list
    while count < n_of_images:
        if (count+batchSize<=n_of_images):
            runSize = batchSize
        else:
            runSize=n_of_images-count

        '''
        initialise input and execute DPU runner
        '''
        # init input image to input buffer
        for j in range(runSize):
            imageRun = a2j_dict['A2J_model__A2J_model_ResNetBackBone_Backbone__input_1_swim_transpose_0_fix']
            imageRun[j, ...] = img[(count + j) % n_of_images].reshape(tuple(a2j_dict['A2J_model__A2J_model_ResNetBackBone_Backbone__input_1_swim_transpose_0_fix'].shape[1:]))

        execute_async(dpu_runner, a2j_dict)

        # write results to global predictions buffer
        out_q1.append(a2j_dict['A2J_model__A2J_model_ClassificationModel_classificationModel__Conv2d_output__11586_fix'])
        out_q2.append(a2j_dict['A2J_model__A2J_model_DepthRegressionModel_DepthRegressionModel__Conv2d_output__11920_fix'])
        out_q3.append(a2j_dict['A2J_model__A2J_model_RegressionModel_regressionModel__Conv2d_output__11749_fix'])

        count = count + runSize
    print("Done with the DPU runner")
--------------------------------------------------------------------------------
def init_dpu_runner(dpu_runner):
    '''
    Setup DPU runner in/out buffers and dictionary
    '''
    io_dict = {}
    inbuffer = []
    outbuffer = []

    # create input buffer, one member for each DPU runner input
    # add inputs to dictionary
    dpu_inputs = dpu_runner.get_input_tensors()
    i=0
    for dpu_input in dpu_inputs:
        #print('DPU runner input:',dpu_input.name,' Shape:',dpu_input.dims)
        inbuffer.append(np.empty(dpu_input.dims, dtype=np.float32, order="C"))
        io_dict[dpu_input.name] = inbuffer[i]
        i += 1

    # create output buffer, one member for each DPU runner output
    # add outputs to dictionary
    dpu_outputs = dpu_runner.get_output_tensors()
    i=0
    for dpu_output in dpu_outputs:
        #print('DPU runner output:',dpu_output.name,' Shape:',dpu_output.dims)
        outbuffer.append(np.empty(dpu_output.dims, dtype=np.float32, order="C"))
        io_dict[dpu_output.name] = outbuffer[i]
        i += 1

    return io_dict, inbuffer, outbuffer
--------------------------------------------------------------------------------
def execute_async(dpu, tensor_buffers_dict):
    input_tensor_buffers = [tensor_buffers_dict[t.name] for t in dpu.get_input_tensors()]
    output_tensor_buffers = [tensor_buffers_dict[t.name] for t in dpu.get_output_tensors()]
    jid = dpu.execute_async(input_tensor_buffers, output_tensor_buffers)
    return dpu.wait(jid)
--------------------------------------------------------------------------------
Can anyone help me get past this stage? I've been stuck here for many days.
Thank you,
Issue resolved: my input list was getting overwritten whenever append was called.
Thank you.
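For readers who hit the same symptom, here is a minimal sketch of how a list can appear to be overwritten on every append. It assumes the cause is that the same buffer object is appended on each iteration, which is one plausible reading of the resolution above and is not confirmed by the thread. `fake_run` and the plain Python list buffer are hypothetical stand-ins for the DPU runner and its NumPy buffers; NumPy arrays also provide a `.copy()` method that can be used the same way.

```python
# Hypothetical stand-in for a DPU runner call: it writes its result into a
# caller-supplied buffer, and that buffer is reused on every call.
def fake_run(value, out_buffer):
    for i in range(len(out_buffer)):
        out_buffer[i] = value


# Allocated once and reused (assumption: similar to a buffer created up front).
out_buffer = [0, 0, 0]

aliased = []  # stores the buffer object itself
copied = []   # stores a snapshot of the buffer after each run

for value in (1, 2, 3):
    fake_run(value, out_buffer)
    aliased.append(out_buffer)         # same object appended every time
    copied.append(out_buffer.copy())   # independent copy per run

print(aliased)  # every entry reflects the most recent run
print(copied)   # each entry keeps the values from its own run
```

Appending a copy costs one extra allocation per run, but it keeps each stored result independent of later writes into the shared buffer.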