VoVAllen / tf-dlpack

DLPack for Tensorflow
Apache License 2.0

[WIP] attempt zerocopy #2

Closed. jermainewang closed this 5 years ago

jermainewang commented 5 years ago

@VoVAllen I tried this approach to a zero-copy implementation, and it looks good on my side. Basically, it replaces set_output_buf with set_output and passes a Tensor object. I checked the implementation here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_kernel.cc#L870. set_output decides whether the provided Tensor can be forwarded, which avoids a copy. I printed the log with TF_CPP_MIN_VLOG_LEVEL set and did not find it performing a physical copy (here). Could you help check this by looking at the memory consumption as well?
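
To illustrate the difference between copying an output and forwarding it, here is a minimal, self-contained sketch. It is not TensorFlow code: ToyTensor, ToyContext, set_output_by_copy, and set_output_by_forward are hypothetical names, and the reference-counted buffer is modeled with std::shared_ptr on the assumption that a forwarded output simply shares the input's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a framework tensor: a handle that shares
// ownership of a reference-counted buffer. Not the TensorFlow API.
struct ToyTensor {
  std::shared_ptr<std::vector<float>> buf;
  const float* data() const { return buf->data(); }
};

// Hypothetical stand-in for an op context with a single output slot.
struct ToyContext {
  ToyTensor output;

  // Copy path: allocate a fresh buffer and copy every element into it.
  void set_output_by_copy(const ToyTensor& t) {
    assert(t.buf != nullptr);
    output.buf = std::make_shared<std::vector<float>>(*t.buf);
  }

  // Forwarding path: share the existing buffer; only the refcount changes.
  void set_output_by_forward(const ToyTensor& t) {
    assert(t.buf != nullptr);
    output.buf = t.buf;
  }
};

// Returns true when the forwarded output aliases the input buffer
// and the copied output does not.
bool forwarding_is_zero_copy() {
  ToyTensor in{std::make_shared<std::vector<float>>(4, 1.0f)};

  ToyContext copied;
  copied.set_output_by_copy(in);

  ToyContext forwarded;
  forwarded.set_output_by_forward(in);

  // Owners of the input buffer at this point: `in` and `forwarded.output`.
  return copied.output.data() != in.data() &&
         forwarded.output.data() == in.data() &&
         in.buf.use_count() == 2;
}
```

Comparing data pointers like this is one way to confirm that no physical copy happened, alongside checking memory consumption.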

VoVAllen commented 5 years ago

Found another similar case: https://github.com/tensorflow/tensorflow/blob/818993c7751601527d662d2417f220e4e856e4ef/tensorflow/core/kernels/immutable_constant_op.cc#L7

This seems to be zero-copy.