Closed MarkYangjiayi closed 6 years ago
@bermanmaxim
Hi @MarkYangjiayi, thanks for the interest.
This is understood to be because of TensorFlow's implementation of `cumsum`. See this issue for reference (I reopened it for visibility): [issues/6].
A solution would be to interface with NVIDIA CUB's implementation of `cumsum`, but I haven't had the time to do it, and we have been working with PyTorch, where `cumsum` is fast.
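For context, here is a minimal pure-Python sketch of where the cumulative sums enter the Lovász loss. It computes the gradient of the Lovász extension of the Jaccard loss from ground-truth labels that are already sorted by decreasing error. The function name `lovasz_grad` and the list-based form are assumptions for illustration; this is not the repository's TensorFlow code, and `itertools.accumulate` merely stands in for a framework `cumsum`.

```python
from itertools import accumulate


def lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss (illustrative sketch).

    gt_sorted: list of 0/1 ground-truth labels, assumed to be already sorted
    by decreasing prediction error.
    """
    p = len(gt_sorted)
    if p == 0:
        return []
    gts = sum(gt_sorted)  # total number of foreground pixels

    # The two cumulative sums below are the operations reported as slow
    # in the TensorFlow version.
    cum_fg = list(accumulate(gt_sorted))                 # foreground seen so far
    cum_bg = list(accumulate(1 - g for g in gt_sorted))  # background seen so far

    # Jaccard loss of each prefix of the sorted errors.
    jaccard = [1.0 - (gts - fg) / (gts + bg) for fg, bg in zip(cum_fg, cum_bg)]

    # Successive differences give the per-pixel weights.
    return [jaccard[0]] + [jaccard[i] - jaccard[i - 1] for i in range(1, p)]


weights = lovasz_grad([1, 0, 1, 0])
```

Each training step sorts the per-pixel errors and runs these cumulative sums over every pixel, so a slow `cumsum` kernel would plausibly account for most of the extra time per step.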
Closing as duplicate.
Hello, I loved your work on Lovasz softmax very much and implemented it on a modified version of DeepLabv3+ in TensorFlow. However, I experienced a significant speed drop: the time per step increased from 0.4 s (using cross entropy) to almost 3.8 s. Is this normal, or did I do something wrong? Thank you!