nvdla / hw

RTL, Cmodel, and testbench for NVDLA

question for NV_NVDLA_CMAC_CORE_MAC_mul module #231

Closed zhouweiscut closed 5 years ago

zhouweiscut commented 5 years ago

Hi, I am puzzled by the multiply module in the MAC. For example, with op_a_data=0x28a2 and op_b_dat=0xbfb4, the simulation result is res_a=0xadf6fe8 and res_b=0x40410000. But 0x28a2*0xbfb4=0x1E6D6FE8, so how does the module arrive at res_a and res_b?

[image: mul]

j0210003 commented 5 years ago

Hi, how did you get the result with exp_sft[3:0] unknown? pp_fp16_0_sft and pp_fp16_1_sft depend on it; they are the shifted results of the CSA tree outputs.

zhouweiscut commented 5 years ago

Hi @j0210003, this waveform was dumped while running the case traces/traceplayer/conv_8x8_fc_int16, where exp_sft[3:0] is unknown. When I traced the code, pp_fp16_0_sft and pp_fp16_1_sft are always 0 if FP16 is not enabled, and this case only exercises INT16 multiplication.

j0210003 commented 5 years ago

OK @zhouweiscut, sorry, I was working on FP16, so I was confused.

When I looked into this multiplier in FP16 format, the output was biased by 0x5555_0000, so res_a (shifted) + res_b - 0x55550000 was equal to the real product value. I don't know why, but you'd better check it out.
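
The relation above can be checked against the INT16 numbers from the original post. This is a minimal sketch, and it rests on assumptions the thread does not confirm: that the operands are signed 16-bit values, that res_a and res_b are 32-bit words that wrap modulo 2^32, and that no extra shift applies in INT16 mode. The helper name `to_signed16` is hypothetical.

```python
MASK32 = 0xFFFFFFFF   # assumed 32-bit result width
BIAS = 0x55550000     # bias value reported in the thread

def to_signed16(x):
    """Interpret a 16-bit pattern as two's complement (assumed INT16 operands)."""
    return x - 0x10000 if x & 0x8000 else x

# Values quoted in the original post
op_a, op_b = 0x28A2, 0xBFB4
res_a, res_b = 0x0ADF6FE8, 0x40410000

unsigned_product = op_a * op_b
signed_product = (to_signed16(op_a) * to_signed16(op_b)) & MASK32
recombined = (res_a + res_b - BIAS) & MASK32

print(hex(unsigned_product))  # 0x1e6d6fe8
print(hex(signed_product))    # 0xf5cb6fe8
print(hex(recombined))        # 0xf5cb6fe8
```

Under these assumptions, res_a + res_b - 0x55550000 matches the signed product, while 0x1E6D6FE8 is what the same bit patterns give when multiplied as unsigned numbers.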

zhouweiscut commented 5 years ago

I checked the output of the MAC cell, and it is correct. Because the multiplier in the MAC cells uses Booth encoding and CSA, it is hard to work out why the mul output is biased by 0x5555_0000. Maybe it is due to a synthesis timing issue.
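
As background on why a CSA-based multiplier can expose two output words instead of one: a carry-save adder compresses three addends into a sum word and a carry word without propagating carries, so the actual total only appears once those two words are added together. The sketch below shows one generic 3:2 compression step. It illustrates the technique in general and is not taken from the NVDLA RTL; the function name `csa_3to2` and the 32-bit width are assumptions.

```python
def csa_3to2(a, b, c, width=32):
    """One carry-save step: reduce three addends to a (sum, carry) pair."""
    mask = (1 << width) - 1
    s = (a ^ b ^ c) & mask                               # per-bit sum, no carry propagation
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask  # per-bit majority, shifted left by one
    return s, carry

x, y, z = 0x1234, 0x0F0F, 0x00FF
s, carry = csa_3to2(x, y, z)

# Neither word alone is the total; their sum is.
assert (s + carry) & 0xFFFFFFFF == (x + y + z) & 0xFFFFFFFF
```

Seen this way, a pair such as res_a and res_b is only meaningful after a final addition, which is consistent with the MAC cell output being correct even though neither word looks like the product.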