-
## Description
I'm deploying meta-llama/Meta-Llama-3.1-70B-Instruct on a SageMaker endpoint:
- latest DJLServing container with Neuron support (0.29)
- ml.inf2.48xlarge instance.
Model downloa…
-
Hi,
it seems to me that this library currently implements a kind of integrate-and-fire neuron in the `SpikingActivation` module.
I was wondering if there are plans for modules that implement more co…
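
For context, the integrate-and-fire behaviour mentioned above can be sketched roughly as follows. This is a generic, non-leaky illustration and not the library's `SpikingActivation` code; the function name `integrate_and_fire` and the `threshold` and `dt` parameters are assumptions made up for this example.

```python
def integrate_and_fire(inputs, threshold=1.0, dt=1.0):
    """Simulate a non-leaky integrate-and-fire neuron (illustrative only).

    inputs: sequence of input currents, one per timestep.
    Returns a list with 1 where the neuron spikes and 0 elsewhere.
    """
    voltage = 0.0
    spikes = []
    for current in inputs:
        # Integrate: accumulate the input into the membrane voltage.
        voltage += current * dt
        if voltage >= threshold:
            # Fire: emit a spike and reset by subtracting the threshold,
            # so any excess voltage carries over to the next timestep.
            spikes.append(1)
            voltage -= threshold
        else:
            spikes.append(0)
    return spikes


spike_train = integrate_and_fire([0.5, 0.5, 0.25, 0.25, 0.5, 1.0])
print(spike_train)
```

More complex neuron models (for example, leaky variants) would typically add a decay term to the voltage update before the threshold check.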
-
`NeuronXXXModel` classes (e.g. `NeuronDecoderModel` in `optimum/neuron/modeling_decoder.py`) invoke transformers-neuronx to compile the target model. However, these classes don't pass all the supported input …
-
With AWS Neuron SDK 2.19, when exporting a model and saving the compiled artifacts, it is impossible to reload them afterwards if the Python path is different.
This basically makes **shared** seria…
-
**INF1 Test Summary:**
* Total Tests Passed: 227
* Errors: 9
= 227 passed, 200 skipped, 248 deselected, 47195 warnings, 9 errors in 7509.34s (2:05:09
Test Steps:
- Set up the neuron enviro…
-
### System Info
```shell
The same script works with `Neuron SDK 2.18.0` and `optimum-neuronx v0.0.22`. But with the latest software stack
(aws_neuron_venv_pytorch) [ec2-user@ip-172-31-29-22 text…
-
[The model](https://github.com/ModelDBRepository/136095/) fails with:
```plaintext
-Segmentation violation
-Backtrace:
- /lib/x86_64-linux-gnu/libc.so.6 : ()+0x42520
- %model_dir%/x86_64/libnrn…
-
Hello Author.
1. Very good paper! Why is there no code related to the actual operation of the robot?
2. How can I solve the following error? I have tried to change os.environ["CUDA_VISIBLE_DEVICES"…
-
This is not a bug, but rather a feature request: even when pre-compiled artifacts are available, loading a model on Neuron cores can take a very long time.
This seems especially true when loading a…
-
### System Info
```shell
TGI Image: ghcr.io/huggingface/neuronx-tgi:0.0.23
Platform:
- Platform: Linux-5.15.0-1031-aws-x86_64-with-glibc2.35
- Python version: 3.10.12
Python packages:
…