-
Thanks for the error report; we appreciate it a lot.
# Bug
`RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor…
-
We want to remove the [ReLU operations](https://github.com/sony/model_optimization/blob/8461a41fa8b6b57ce422fdd5f7301e9a9c8a1b20/tutorials/mct_model_garden/models_pytorch/yolov8/yolov8.py#L269-L272) i…
-
The diagram in the paper shows the Fixup block: it applies the residual branch, adds the branch output to the original input, and finally applies a ReLU to the sum.
https://i.stack.imgur.com/T67F3.p…
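
The ordering described above (branch, then add, then ReLU) can be sketched in a few lines. This is a minimal, dependency-free illustration, not the paper's implementation: `residual_block` and `toy_branch` are hypothetical names, vectors are plain Python lists, and Fixup's initialization details are left out.

```python
from typing import Callable, List

Vector = List[float]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU: negative entries become 0.0."""
    return [max(0.0, v) for v in x]


def residual_block(x: Vector, branch: Callable[[Vector], Vector]) -> Vector:
    """Apply the residual branch, add its output to the input, then ReLU the sum."""
    branch_out = branch(x)
    if len(branch_out) != len(x):
        raise ValueError("branch output must have the same length as the input")
    summed = [a + b for a, b in zip(x, branch_out)]
    # The ReLU is applied to the sum, not to the branch output alone.
    return relu(summed)


def toy_branch(x: Vector) -> Vector:
    """Hypothetical stand-in for the residual branch (assumption, not from the paper)."""
    return [-2.0 * v for v in x]


out = residual_block([1.0, -3.0, 0.5], toy_branch)
print(out)
```

Because the ReLU comes after the addition, the block's output is always non-negative, even when the input itself has negative entries.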
-
Hi,
Thank you for sharing this great repository!
I think that the ReLU in the k/q/v projections is unnecessary, or at least inconsistent with the Transformer paper:
[https://github.com/hanxiao/tf-n…
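
For comparison, here is a minimal sketch of single-head scaled dot-product attention in which the query, key, and value projections are purely linear, as in the Transformer paper. It uses plain Python lists rather than the repository's TensorFlow code, and the helper names (`matmul`, `softmax`, `attention`) and the weight matrices are illustrative assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(x: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix) -> Matrix:
    """Scaled dot-product attention with linear q/k/v projections (no ReLU)."""
    q = matmul(x, w_q)  # linear projection, no activation
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


identity = [[1.0, 0.0], [0.0, 1.0]]
single_token = attention([[-1.0, 2.0]], identity, identity, identity)
print(single_token)
```

With a single token the attention weight is exactly 1, so the output equals the value projection and the negative component survives. A ReLU on the value projection would have zeroed it.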
-
### Describe the Bug
Converting the model to a static graph requires the paddle.jit.to_static() API.
I followed the code at https://www.paddlepaddle.org.cn/inference/v2.6/guides/export_model/paddle_model_export.html#cankaodaima :
```
impor…
-
-
For a regression task, I am using a [mid-size CNN](https://github.com/Welthungerhilfe/cgm-ml/blob/main/src/models/CNNDepthMap/CNNDepthMap-height/q3-depthmap-plaincnn-height/src/model.py) consisting of…
-
Given `test.mlir`:
```
#l1_block_sharded = #tt.operand_constraint
func.func @relu(%arg0: tensor) -> tensor {
// CHECK: %[[C:.*]] = "ttnn.empty"[[C:.*]]
%0 = tensor.empty() : tensor
// …
-
In Seg_Ukan, `__all__` is not defined in archs.py, so running train.py throws an error.
Solution: in archs.py, before the `KANLayer` class, add the line:
__all__ = ['UKAN', 'D_ConvLayer'…
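
The report does not show the traceback, so the following is only a sketch of one plausible cause, assuming train.py reads `archs.__all__` directly. The two modules are hypothetical stand-ins built with `types.ModuleType`, not the real archs.py.

```python
import types
from typing import List


def public_names(module: types.ModuleType) -> List[str]:
    """Return the names a module exports via __all__, or [] if it defines none."""
    return list(getattr(module, "__all__", []))


# Hypothetical stand-ins for archs.py without and with the proposed line.
archs_without_all = types.ModuleType("archs_without_all")
archs_with_all = types.ModuleType("archs_with_all")
archs_with_all.__all__ = ["UKAN"]

try:
    archs_without_all.__all__  # direct access fails when __all__ is missing
    raised = False
except AttributeError:
    raised = True

print(raised, public_names(archs_without_all), public_names(archs_with_all))
```

Defining `__all__` in the module fixes direct attribute access. Reading it through `getattr` with a default, as in `public_names`, is an alternative on the caller's side.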
-
It looks like `Flux.huber_loss` is type-unstable under Zygote autodiff?
```julia
using Flux, Zygote
import Statistics: mean
function internfunc_nobroad(m, x, y)
modelvals = m(…