-
### Motivation.
`vLLM` has already been adapted to many hardware devices, such as `GPU`, `TPU`, and `XPU`. However, adapting these backends requires implementing separate `Worker/Executor/Model Run…
-
Hello developers, I have noticed that you already support the QNN backend. What excellent work!
Do you have any plans to support the MTK APU via its NeuroPilot SDK?
-
When I try to use eager debug mode, I receive the following error:
```
WARNING:torch_neuron:Eager debug mode is enabled. In this mode all operations would be executed eagerly. This will result in …
-
Please add the Qwen/Qwen2-7B model to the Neuron cache.
-
### What happened + What you expected to happen
### What Happened:
When deploying models using Ray Serve with autoscaling enabled on Amazon EKS, specifically across multiple `inf2` nodes, the syste…
-
[The model](https://github.com/ModelDBRepository/136095/) fails with:
```plaintext
-Segmentation violation
-Backtrace:
- /lib/x86_64-linux-gnu/libc.so.6 : ()+0x42520
- %model_dir%/x86_64/libnrn…
-
How the function is imported:
from paddleslim.nas.ofa.utils import nlp_utils
Original source of the function:
def compute_neuron_head_importance(task_name,
model,
data_lo…
-
### System Info
```shell
Using TGI v0.0.24 to deploy the model on SageMaker
```
### Who can help?
@dacorvo
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### T…
-
**Describe the bug**
While running some experiments together with @TeEsTeBe, we found that the implementation of the `aeif_psc_alpha` neuron model appears to be numerically unstable for certain RNG seeds,…
-
### System Info
```shell
AWS EC2 instance: trn1.32xlarge
OS: Ubuntu 22.04.4 LTS
Platform:
- Platform: Linux-6.5.0-1023-aws-x86_64-with-glibc2.35
- Python version: 3.10.12
Python packages:
…