aws-neuron transformers-neuronx issues

aws-neuron / transformers-neuronx

Apache License 2.0

100 stars 29 forks

issues


Support for Qwen2 (Llama2 based)

#101 bevhanno opened 1 month ago
0
Support for Qwen2 (Llama based)

#100 bevhanno closed 1 month ago
2
Neuron model NEFFs are dependent on the python path

#99 dacorvo opened 1 month ago
2
Sync internal repo to external September 16 2024

#98 jluntamazon closed 2 months ago
0
Sync internal repo to external September 16 2024

#97 jluntamazon closed 2 months ago
0
[Question] BasicTransformerBlock

#96 JH-ninjatech closed 4 months ago
0
Sdk219 embedding

#95 cszhz opened 4 months ago
1
Gibberish output for princeton-nlp/Sheared-LLaMA-1.3B with continuous batching

#94 pinak-p opened 4 months ago
2
add starcoder2

#93 reymondzzzz opened 4 months ago
0
Not able to load llama 3 70b on inf2.24xlarge instance

#92 sangraamp opened 4 months ago
6
Neuron model NEFFs are dependent on the python path

#91 dacorvo closed 2 months ago
4
Sync internal repo to external June 28 2024

#90 hannanjgaws closed 4 months ago
0
Any plan to support Qwen-2 Model

#89 mynewstart opened 5 months ago
1
llava support

#88 sonic182 opened 5 months ago
4
Add Gemma

#87 yisi-wang-slalom opened 6 months ago
0
For Mistral 7B - Generate Text using Input Embeddings + Add no_repeat_ngram_size Support

#86 davidshtian opened 7 months ago
0
Sync internal repo to external Apr 15 2024

#85 hannanjgaws closed 7 months ago
0
Latest changes introduced for continuous batching break Mixtral model

#84 dacorvo opened 7 months ago
5
Add support for Baichuan-13B model

#83 cszhz opened 7 months ago
0
Add support for `gemma` models

#82 benglewis opened 7 months ago
1
Sync internal repo to external Mar 29 2024

#81 hannanjgaws closed 7 months ago
0
Improve Neuron model loading time

#80 dacorvo opened 8 months ago
4
NaN outputs when masking llama model inputs

#79 dacorvo closed 4 months ago
8
Backward compatibility with saved llama 2 compiled artifacts

#78 dacorvo opened 10 months ago
1
Issue while compiling Mistral 7B 0.2 Instruct

#77 josete89 closed 8 months ago
5
User feedback when compiling and reloading a large model

#76 dacorvo opened 10 months ago
1
`stopping_criteria_list(input_ids, probs)` does not check for the correct sequence.

#75 michaelfeil closed 9 months ago
4
Support for MPT model

#74 klutzDrawers opened 10 months ago
1
Inferring logits from `model.forward` for the entire batch instead of the last forward's output.

#73 michaelfeil opened 10 months ago
6
Generate Llama 2 from Embeddings

#72 liechtym opened 10 months ago
5
Mixtral config issue -- not handling null well

#71 jimburtoft closed 7 months ago
8
How to use generate() with inputs_embeds

#70 liechtym closed 11 months ago
2
Sync internal repo to external Dec 28 2023

#69 hannanjgaws closed 10 months ago
0
Skipping generation for useless tokens, and modifying cacheids

#68 enochlev closed 10 months ago
3
Inf2 Modified Llama 2 Loading Issue

#67 liechtym closed 10 months ago
11
Vicuna13B model support

#66 petrovicu opened 11 months ago
1
Mixtral Model support

#65 enochlev closed 11 months ago
2
llama-2/codellama benchmark for inf2.xlarge

#64 zliendo closed 11 months ago
4
Llama2 inference overhead time way too long

#63 enochlev closed 11 months ago
6
Added safetensors support in from_pretrained()

#62 dennj opened 1 year ago
0
LLaMA fails when the input token length is over 1790 tokens

#61 dennj closed 8 months ago
6
from_pretrained is broken after transformers made safetensor serialization default

#60 dennj closed 1 year ago
1
Compilation error on llama 7 B with batch size 8

#59 dacorvo closed 6 months ago
4
Can't save/serialize any models except GPT2

#58 awskila closed 4 months ago
4
Avoid splitting Hugging Face Hub checkpoint files on disk

#57 dacorvo closed 6 months ago
7
Turn off safe_serialization from save_split so that save_function is called

#56 jitto opened 1 year ago
0
save_split seems to be broken after transformers made safetensor serialization default

#55 jitto closed 9 months ago
3
Sync internal repo to external Oct 27 2023

#54 hannanjgaws closed 11 months ago
0
About loading and saving llama model of pretraining job

#53 etsurin closed 12 months ago
2
Serving Throughput Optimizations (e.g. PagedAttention)

#52 vigneshv59 closed 4 months ago
3