Closed: luchangli03 closed this issue 3 years ago
@JackCaoG I have also hit a problem where upsample can't be lowered when I try to train U-GAT-IT on the XLA:GPU backend. Do you know where the upsample custom call for TPU is implemented, and do you have any tips on how to implement the custom call for the GPU backend? Thank you very much.
I implemented the custom calls EmitPadToStatic and EmitSliceToDynamic for GPU; you can check my cl https://github.com/tensorflow/tensorflow/commit/64a5248407af3532c5eb282414fe95b62ee3bfec. The code currently lives in here
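To clarify what these two custom calls do conceptually: PadToStatic takes a dynamically sized buffer and produces a fixed-size (statically shaped) buffer plus the true size, and SliceToDynamic is the inverse. Here is a minimal Python sketch of those semantics for the 1-D case; the function names mirror the custom calls but this is only an illustration, not the actual C++ implementation in the linked commit.

```python
def pad_to_static(values, static_size, pad_value=0):
    """Pad a dynamically sized sequence up to a fixed static size.
    Returns the padded buffer and the true (dynamic) size, roughly
    mirroring the semantics of the PadToStatic custom call."""
    values = list(values)
    assert len(values) <= static_size, "dynamic size exceeds static bound"
    padded = values + [pad_value] * (static_size - len(values))
    return padded, len(values)

def slice_to_dynamic(padded, true_size):
    """Inverse operation: recover the dynamically sized prefix,
    roughly mirroring the SliceToDynamic custom call."""
    return padded[:true_size]

buf, size = pad_to_static([1, 2, 3], static_size=5)
print(buf, size)                      # [1, 2, 3, 0, 0] 3
print(slice_to_dynamic(buf, size))    # [1, 2, 3]
```

Round-tripping through the pair recovers the original dynamic data, which is why the two calls can bracket a region of the graph that requires static shapes.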
I am working on the adaptive_max_pool2d lowering, but we can only support the input_size % output_size == 0 case. The reason is that pt/xla uses xla::MaxPool to implement max_pool, which requires a fixed kernel and stride. For an input size of [10] and an output size of [4] (using 1D as an example), PyTorch will max pool over the windows [0, 2], [2, 4], [5, 7], [7, 9], where the stride is not constant.
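The window positions above follow PyTorch's adaptive pooling formula: output index i pools over start = floor(i * input_size / output_size) up to end = ceil((i + 1) * input_size / output_size). A short sketch makes the fixed-stride constraint concrete (half-open [start, end) intervals, so they correspond to the inclusive windows listed above):

```python
import math

def adaptive_pool_windows(input_size, output_size):
    """Compute the half-open pooling window [start, end) for each
    output index, per PyTorch's adaptive pooling formula."""
    return [
        (math.floor(i * input_size / output_size),
         math.ceil((i + 1) * input_size / output_size))
        for i in range(output_size)
    ]

# Non-divisible case: window starts are 0, 2, 5, 7, so the stride
# varies (2, 3, 2) and xla::MaxPool cannot express it directly.
print(adaptive_pool_windows(10, 4))  # [(0, 3), (2, 5), (5, 8), (7, 10)]

# Divisible case: fixed kernel 2 and stride 2, which maps cleanly
# onto a regular max pool.
print(adaptive_pool_windows(8, 4))   # [(0, 2), (2, 4), (4, 6), (6, 8)]
```

When input_size % output_size == 0, every window has size input_size // output_size and the stride equals the kernel size, which is exactly the case the lowering can support.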
🚀 Feature
Support adaptive_max_pool2d
Motivation
Currently adaptive_avg_pool2d is supported, but adaptive_max_pool2d is not. When I try to train U-GAT-IT (https://github.com/znxlwm/UGATIT-pytorch), the model is lowered to multiple graphs, which results in bad performance because of this problem.