-
## 🐛 Bug
When I try to build PyTorch with `USE_SYSTEM_XNNPACK`, I see the following error message:
```
[3036/4187] Linking CXX shared library lib/libtorch_cpu.so
FAILED: lib/libtorch_cpu.so
: &…
```
-
## Description
I've noticed that TensorRT inference is, in general, significantly slower when using `LayerNormalization` instead of `BatchNormalization`. In particular, any gains I've se…
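A plausible explanation for the gap, offered here as an assumption rather than a profiled fact: at inference time BatchNorm uses fixed running statistics, so it reduces to a per-channel affine transform that TensorRT can fold into the preceding convolution, whereas LayerNorm computes per-sample statistics at runtime and stays a separate reduction kernel in the engine. A minimal PyTorch sketch of that structural difference:
```python
import torch
import torch.nn as nn

# Eval-mode BatchNorm is a fixed affine transform, hence fusible.
bn = nn.BatchNorm2d(64).eval()
x = torch.randn(1, 64, 32, 32)
with torch.no_grad():
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    shift = bn.bias - bn.running_mean * scale
    folded = x * scale.view(1, -1, 1, 1) + shift.view(1, -1, 1, 1)
    assert torch.allclose(bn(x), folded, atol=1e-6)

# LayerNorm derives mean/variance from the activations themselves,
# so it cannot be folded into a preceding layer's weights.
ln = nn.LayerNorm([64, 32, 32]).eval()
with torch.no_grad():
    mu = x.mean(dim=(1, 2, 3), keepdim=True)
    var = x.var(dim=(1, 2, 3), unbiased=False, keepdim=True)
    normed = (x - mu) / torch.sqrt(var + ln.eps) * ln.weight + ln.bias
    assert torch.allclose(ln(x), normed, atol=1e-5)
```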
-
### Search before asking
- [X] I have searched the YOLOv5 [issues](https://github.com/ultralytics/yolov5/issues) and found no similar bug report.
### YOLOv5 Component
_No response_
### Bug
Hi! I…
-
Dear Experts,
I am running ResNet18 through CrypTen and trying to follow how the internals work. However, I cannot find how batchnorm is implemented, unlike the other layers.
My initial understa…
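For reference, the plaintext computation that CrypTen's batchnorm has to reproduce is the standard one sketched below (the `batchnorm2d_train` helper and the shapes are mine, not CrypTen code). Under MPC the division and square root presumably have to be replaced by iterative approximations, which may be why this layer looks unlike the others.
```python
import torch

def batchnorm2d_train(x, gamma, beta, eps=1e-5):
    # Training-mode batchnorm: statistics come from the current batch,
    # computed per channel over the (N, H, W) dimensions.
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)  # normalize
    return gamma.view(1, -1, 1, 1) * x_hat + beta.view(1, -1, 1, 1)

x = torch.randn(8, 3, 16, 16)
gamma, beta = torch.ones(3), torch.zeros(3)
out = batchnorm2d_train(x, gamma, beta)
```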
-
## Bug Description
This is a regression that appears with:
- torch 2.0.1
- torch-tensorrt 1.4.0
- torchvision 0.15.2
- tensorrt 8.6.1
Every network now fails to convert with torch_tensorrt.compile(…
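For context, a minimal sketch of the kind of call presumably failing here (the model choice, input shape, and precision below are my assumptions, not taken from the truncated report; the snippet uses the public `torch_tensorrt.compile` API):
```python
import torch
import torch_tensorrt
import torchvision.models as models

# Hypothetical repro model: any torchvision classifier should do.
model = models.resnet18(weights=None).eval().cuda()

# Compile to a TensorRT-backed module for a fixed input shape.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
    enabled_precisions={torch.float32},
)

x = torch.randn(1, 3, 224, 224, device="cuda")
print(trt_model(x).shape)
```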
-
* Before opening a new issue, we wanted to provide you with some useful suggestions (Click "Preview" above for a better view):
* Consider checking out SDK [examples](https://github.com/IntelRea…
-
I noticed that the sampling stage for a batched input uses a for-loop to process each item. Is that always the case? For large batches, the loop launches many small CUDA kernels, which is inefficient.
…
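To illustrate the concern, a minimal sketch contrasting the per-item loop with a single batched call, assuming categorical sampling via `torch.multinomial` (the batch and vocabulary sizes are made up):
```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
logits = torch.randn(256, 50000, device=device)  # (batch, vocab)
probs = torch.softmax(logits, dim=-1)

# Per-item loop: one small kernel launch per batch element.
samples_loop = torch.stack([torch.multinomial(p, 1) for p in probs])

# Batched: torch.multinomial accepts a 2-D tensor and samples one value
# per row, so the whole batch goes through a single launch.
samples_batched = torch.multinomial(probs, 1)

assert samples_loop.shape == samples_batched.shape == (256, 1)
```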
-
## Background
Currently, several different pretrained model architectures exist: autoencoding models that implement only an encoder (e.g. BERT), autoregressive models that implement only a decoder (e.g. GPT), and encoder-decoder models that implement both an encoder and a decoder (e.g. T5). The [GLM](https://arxiv.org/abs/2103.10360) model differs slightly from these. It adopts an autoregressive blank-infilling approach, and across the three main types of NLP tasks (natural language understanding, unconditional generation, …
-
![image](https://github.com/alihaydaroglu/s2p-lbm/assets/78885453/fd490492-47ec-4fef-9df8-6a414e0d171e)
-
### System information
Type | Version/Name
--- | ---
Distribution Name | Debian
Distribution Version | 10 buster
Linux Kernel | 4.19.0-14 (SMP Debian 4.19.171-2)
Architecture | amd64…