jpzg closed this issue 4 years ago
The transfer from the host to the DPU is done with `dpu_copy_to`, which cannot use the full memory bandwidth. Instead, this should use `prepare_xfer` and `push_xfer`, which interleave copies to different DPUs at the same time.
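To make the suggested pattern concrete, here is a minimal, self-contained sketch of the two-phase idea: first register one host buffer per DPU, then issue a single push that moves all of them. It is a host-only model, not the SDK. Everything in it (`struct mock_dpu_set`, `mock_prepare_xfer`, `mock_push_xfer`, `scatter_to_dpus`, `NR_DPUS`, `MRAM_BYTES`) is a hypothetical stand-in invented for illustration, and the real `prepare_xfer` / `push_xfer` signatures may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_DPUS 4      /* hypothetical DPU count for this model */
#define MRAM_BYTES 64  /* hypothetical per-DPU memory size */

/* Model of a DPU set: per-DPU memory plus the host buffer staged for each DPU. */
struct mock_dpu_set {
    uint8_t mram[NR_DPUS][MRAM_BYTES];
    const uint8_t *staged[NR_DPUS];
};

/* Phase 1: record which host buffer belongs to which DPU. No data moves yet. */
static int mock_prepare_xfer(struct mock_dpu_set *set, size_t dpu, const uint8_t *buf)
{
    if (dpu >= NR_DPUS || buf == NULL)
        return -1;
    set->staged[dpu] = buf;
    return 0;
}

/* Phase 2: move `length` bytes from every staged buffer in one batched call.
 * This model copies sequentially; the point of batching is that a real
 * implementation sees all destinations at once and is free to interleave them. */
static int mock_push_xfer(struct mock_dpu_set *set, size_t offset, size_t length)
{
    if (offset > MRAM_BYTES || length > MRAM_BYTES - offset)
        return -1;
    for (size_t d = 0; d < NR_DPUS; d++)
        if (set->staged[d] == NULL)
            return -1; /* every DPU must have a prepared buffer */
    for (size_t d = 0; d < NR_DPUS; d++) {
        memcpy(set->mram[d] + offset, set->staged[d], length);
        set->staged[d] = NULL; /* staging is consumed by the push */
    }
    return 0;
}

/* Scatter a contiguous host array so that slice d lands on DPU d. */
static int scatter_to_dpus(struct mock_dpu_set *set, const uint8_t *host, size_t slice_len)
{
    assert(set != NULL && host != NULL);
    for (size_t d = 0; d < NR_DPUS; d++)
        if (mock_prepare_xfer(set, d, host + d * slice_len) != 0)
            return -1;
    return mock_push_xfer(set, 0, slice_len);
}
```

Calling `scatter_to_dpus(&set, host, slice_len)` replaces a loop of per-DPU copy calls with `NR_DPUS` cheap prepare calls and one push, which is the shape the change proposed above would take.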