janhq cortex.tensorrt-llm issues

janhq / cortex.tensorrt-llm

Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.

https://cortex.jan.ai/docs/cortex-tensorrt-llm

Apache License 2.0

40 stars, 2 forks

issues

Ci: package cuda dependencies

#73 hiento09 closed 1 month ago
0
Redirect log

#72 nguyenhoangthuan99 closed 2 months ago
0
rebase: v0.11.0

#71 vansangpfiev closed 3 months ago
0
fix: build linux

#70 vansangpfiev closed 3 months ago
0
Add Dockerfile for runner windows

#69 hiento09 closed 3 months ago
0
fix: deprecate gptsession

#68 vansangpfiev opened 3 months ago
0
[Snyk] Fix for 18 vulnerabilities

#67 jan-service-account closed 3 months ago
0
[Snyk] Fix for 8 vulnerabilities

#66 jan-service-account closed 3 months ago
0
fix: better error handling

#65 vansangpfiev closed 3 months ago
0
fix: pack dependencies

#64 vansangpfiev closed 3 months ago
0
[Snyk] Fix for 13 vulnerabilities

#63 jan-service-account closed 3 months ago
0
test: custom wheel

#62 vansangpfiev closed 3 months ago
0
feat: tiktoken integration

#60 nguyenhoangthuan99 closed 3 months ago
0
[Snyk] Security upgrade setuptools from 40.5.0 to 70.0.0

#59 jan-service-account opened 4 months ago
0
[Snyk] Security upgrade setuptools from 40.5.0 to 70.0.0

#58 jan-service-account opened 4 months ago
0
[Snyk] Security upgrade setuptools from 40.5.0 to 70.0.0

#57 hiro-v closed 3 months ago
0
fix: Snyk report fixup

#56 vansangpfiev closed 4 months ago
0
feat: tensorrt-llm-engine README.md file

#55 irfanpena opened 4 months ago
0
feat: TensorRT-LLM Support for logits_prob

#54 hiro-v opened 4 months ago
0
fix: pre package windows

#53 vansangpfiev closed 4 months ago
0
fix: support Mistral v0.3

#52 vansangpfiev closed 4 months ago
0
feat: use batch-manager instead of gpt-runtime

#51 vansangpfiev closed 2 months ago
1
bug: templating issue with mistral v0.3

#50 vansangpfiev closed 4 months ago
4
feat: support llama3

#49 vansangpfiev closed 3 months ago
0
feat: add load model start_time_

#48 vansangpfiev closed 4 months ago
0
Fix ccache linux

#47 hiento09 closed 4 months ago
0
Fix makefile linux

#46 hiento09 closed 4 months ago
0
Sync 0.10.0 from remote

#45 hiento09 closed 3 months ago
0
Rel 0.10.0

#44 hiento09 closed 4 months ago
0
Chore windows build use cuda 12 3

#43 hiento09 closed 4 months ago
0
correct trigger condition

#42 hiento09 closed 5 months ago
0
Chore windows build use python

#41 hiento09 closed 5 months ago
0
feat: Build cortex.tensorrt-llm on Windows

#40 CameronNg closed 5 months ago
0
Makefile and CICD for cpp tensorrt-llm

#39 hiento09 closed 5 months ago
0
Add cortex.tensorrt-llm

#38 hiento09 closed 5 months ago
0
refactor: Refactor nitro with `cortext.tensorrtllm` engine

#37 CameronNg closed 5 months ago
0
Revert "feat: Init code for cortex.tensorrtllm"

#36 CameronNg closed 5 months ago
0
feat: Server example for `cortext.tensorrtllm`

#35 CameronNg closed 5 months ago
0
feat: Init code for cortex.tensorrtllm

#34 CameronNg closed 6 months ago
0
feat: TensorRT-LLM load multiple models

#33 tikikun opened 8 months ago
1
feat: TensorRT-LLM Unload Model

#32 tikikun opened 8 months ago
1
feat: TensorRT-LLM Request Interruption

#31 tikikun opened 8 months ago
1
feat: TensorRT-LLM InferenceRequest and stop_words_list

#30 tikikun opened 8 months ago
1
feat: TensorRT-LLM Inflight batching

#29 tikikun opened 8 months ago
1
Github CI windows for tensorrt_llm engine

#28 hiro-v closed 3 months ago
2
bug: tensorRT - Switching between model is causing error satisfyProfile Runtime dimension does not satisfy any optimization profile

#27 Van-QA closed 2 months ago
2
Friction Report: Using TensorRT-LLM on Windows

#26 dan-homebrew closed 8 months ago
1
feat: Utilize `free_gpu_memory_fraction` to control max VRAM consumption

#25 hiro-v closed 3 months ago
2
feat: Add exit method

#24 tikikun closed 8 months ago
1
Chore: Update README

#23 hahuyhoang411 closed 8 months ago
0