Great work! Just one question. For the gradient part, I could not find an actual implementation of the gradient operation; it looks like the gradient is only registered using the official implementation. If I missed it, please correct me. If that is the case, is the approximation used only in the forward pass? Thanks.