huggingface / optimum-nvidia (Apache License 2.0 · 844 stars · 83 forks)

Issues (sorted by newest)
#94 · Bump the dev version · mfuntowicz · closed 4 months ago · 0 comments
#93 · batched generation · Jack000 · closed 4 months ago · 1 comment
#92 · Make float8 quantization back in the game. · mfuntowicz · closed 4 months ago · 0 comments
#91 · Original model configuration (config.json) was not found error during running inference using "Llama-2-7b-chat-hf" · raorajendra · opened 4 months ago · 0 comments
#90 · Incorrect tensorrt_llm config class initialization · Wojx · opened 4 months ago · 0 comments
#89 · Attempt to make CI overall looking 🟢 · mfuntowicz · closed 4 months ago · 0 comments
#88 · update docker image · yongjer · closed 3 months ago · 1 comment
#87 · Ad ability to save local prebuilt engines · mfuntowicz · closed 4 months ago · 0 comments
#86 · fatal: Fetched in submodule path 'third-party/tensorrt-llm', but it did not contain 37aa4499520eea1d6c6dbc04a66d77bef00014f5. Direct fetching of that commit failed. · hellostronger · closed 3 months ago · 1 comment
#85 · Fix hardcoded embedding scale with value from config · mfuntowicz · closed 4 months ago · 0 comments
#84 · Qwen Support · Yuchen-Cao · opened 4 months ago · 0 comments
#83 · Make overall `optimum-nvidia` pip installable · mfuntowicz · closed 4 months ago · 0 comments
#82 · Bring back CI to a normal state · mfuntowicz · closed 4 months ago · 0 comments
#81 · Bring back tests to align with new workflow · mfuntowicz · closed 4 months ago · 0 comments
#80 · Fix repo code quality · mfuntowicz · closed 4 months ago · 0 comments
#79 · Make pipelines compatible with the new workflow · mfuntowicz · closed 4 months ago · 0 comments
#78 · Update license · mfuntowicz · closed 4 months ago · 0 comments
#77 · Fix gemma 7b · mfuntowicz · closed 4 months ago · 0 comments
#76 · How to install optimum-nvidia properly without building a docker image · Yuchen-Cao · closed 4 months ago · 0 comments
#75 · How to build this environment without docker? · lemon-little · opened 5 months ago · 1 comment
#74 · Refactoring of the overall structure to better align with the new TRTLLM workflow moving forward · mfuntowicz · closed 4 months ago · 0 comments
#73 · Build from source fails completely on WSL2 [Ubuntu-20.04] · OPPEYRADY · closed 5 months ago · 0 comments
#72 · How do you use the library in your scripts after pulling and running the Docker image? · jddunn · opened 5 months ago · 1 comment
#71 · llama.py with fp8 is broken (inference produces garbage results) · urimerhav · opened 5 months ago · 3 comments
#70 · Ability to build Whisper encoder/decoder TRT engine · fxmarty · closed 5 months ago · 1 comment
#69 · Triton Inference Server · TheCodeWrangler · opened 6 months ago · 2 comments
#68 · Error when Running LLAMA with tensor parallelism = 2 · TheCodeWrangler · opened 6 months ago · 1 comment
#67 · Mixtral support · nmiletic · opened 6 months ago · 4 comments
#66 · Fixed Repetition Penalty default value · leopra · closed 4 months ago · 3 comments
#65 · Bump TRTLLM to latest version #d879430 · mfuntowicz · closed 6 months ago · 0 comments
#64 · Build failed with cuda runtime error. · Anindyadeep · opened 6 months ago · 1 comment
#63 · Bug fixes in readme. · Anindyadeep · closed 6 months ago · 1 comment
#62 · Pip Installation · rmccorm4 · opened 6 months ago · 4 comments
#61 · Not able to run 'Generate' from QuickStart section · harikrishnaapc · opened 6 months ago · 0 comments
#60 · Feature request: streamer · RomanKoshkin · opened 6 months ago · 0 comments
#59 · Installation errors · RomanKoshkin · opened 6 months ago · 0 comments
#58 · RuntimeError: TRT Engine build failed... · yirunwang · opened 6 months ago · 1 comment
#57 · model.generate returns strange output shape · Quang-elec44 · closed 5 months ago · 2 comments
#56 · ImportError: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install accelerate` · taozhang9527 · opened 7 months ago · 0 comments
#55 · Disable RMSNorm plugin as deprecated for performance reasons · mfuntowicz · closed 7 months ago · 0 comments
#54 · Rename LLamaForCausalLM to LlamaForCausalLM to match transformers · mfuntowicz · closed 7 months ago · 0 comments
#53 · Bump version to 0.1.0b2 · mfuntowicz · closed 7 months ago · 0 comments
#52 · Add more unittest · mfuntowicz · closed 7 months ago · 0 comments
#51 · Enable HF Transfer in tests · mfuntowicz · closed 7 months ago · 0 comments
#50 · Use FP8 by default when on a supported device · laikhtewari · opened 7 months ago · 0 comments
#49 · Issue with llama.py Script in Docker - Process Stalling at Iteration 512 · H04K · closed 7 months ago · 1 comment
#48 · Let's make sure to use the repeated heads tensor when in a non-mha scenario · mfuntowicz · closed 7 months ago · 0 comments
#47 · FileNotFoundError: [Errno 2] No such file or directory: '/data/Dilip/models/llama-2-7b-chat-hf/build.json' · dilip467 · opened 7 months ago · 8 comments
#46 · Use the new runtime handled allocation · mfuntowicz · closed 7 months ago · 0 comments
#45 · Enable testing on GPUs · mfuntowicz · closed 7 months ago · 0 comments