michaelfeil/infinity
Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP.
https://michaelfeil.github.io/infinity/
MIT License
1.32k stars, 97 forks
issues
add revision to onnx
#109
michaelfeil
closed
7 months ago
0
How does this compare to Huggingface's Text Embedding Inference?
#108
alpayariyak
opened
7 months ago
10
cannot use rerank (BAAI/bge-base-en-v1.5)
#106
BlazJurisic
closed
7 months ago
1
fix: cli start
#105
michaelfeil
closed
7 months ago
0
Deps free
#104
michaelfeil
closed
7 months ago
1
ct2 bump
#103
michaelfeil
closed
7 months ago
1
add engine args similar to vllm
#102
michaelfeil
closed
7 months ago
1
pydantic-v1-backwards-fixes
#101
michaelfeil
closed
7 months ago
0
pydantic upgrade
#100
michaelfeil
closed
7 months ago
2
update poetry version + cache in ci
#99
michaelfeil
closed
7 months ago
1
422 error if /embeddings input is a string
#98
OlegIvaniv
closed
7 months ago
3
Torch dynamic shapes
#97
michaelfeil
closed
7 months ago
0
Update dependencies
#96
NirantK
closed
7 months ago
3
AWQ-Bert / 4-bit Bert
#95
michaelfeil
opened
8 months ago
2
AMD ROCm docker images support (+ optimization)
#94
michaelfeil
opened
8 months ago
10
Update tensorrt, onnxruntime, cuda base
#93
michaelfeil
closed
8 months ago
1
Return actual token count on forward pass
#92
michaelfeil
closed
2 months ago
2
Update README.md contribution guidelines
#91
michaelfeil
closed
8 months ago
1
How is long text handled?
#88
YanDavKMS
closed
8 months ago
0
Adding max token budget per batch
#87
michaelfeil
opened
8 months ago
0
starting to deprecated fastembed and ctranslate2
#86
michaelfeil
closed
8 months ago
1
unexpected keyword argument 'trust_remote_code'
#85
BSVogler
closed
8 months ago
3
adding revision
#84
michaelfeil
closed
8 months ago
1
Update Dockerfile to python 3.11 + CI fix
#83
michaelfeil
closed
8 months ago
1
support for `revision`
#82
michaelfeil
closed
8 months ago
1
support hf_transfer
#81
michaelfeil
closed
8 months ago
1
update torch 2.2+cu121
#80
michaelfeil
closed
8 months ago
1
update dstack support
#79
deep-diver
closed
8 months ago
1
rename device name
#78
michaelfeil
closed
8 months ago
0
bump sentence-transformers
#76
michaelfeil
closed
8 months ago
1
update dockerfile and tensorrt
#75
michaelfeil
closed
8 months ago
0
improvements optimum
#74
michaelfeil
closed
8 months ago
1
update docker
#73
michaelfeil
closed
8 months ago
1
update optimum_utils
#72
michaelfeil
closed
8 months ago
0
Arm poetry lock
#71
michaelfeil
closed
8 months ago
0
bum 0.0.18
#70
michaelfeil
closed
8 months ago
0
Revert "Add optimum[onnx]"
#69
michaelfeil
closed
8 months ago
0
Add optimum[onnx]
#68
michaelfeil
closed
8 months ago
1
Launch via Dstack
#66
michaelfeil
closed
8 months ago
1
Idea: add a parameter to configure number of decimals in JSON output
#64
lasttero
closed
3 days ago
3
Update README.md
#63
michaelfeil
closed
8 months ago
0
Support e5-mistral-7b-instruct
#62
yuhon0528
closed
7 months ago
11
Update README.md
#61
michaelfeil
closed
8 months ago
0
Update to self-built sentence_transformers 2.2.3
#60
michaelfeil
closed
8 months ago
1
support mps backend.
#59
ninehills
closed
8 months ago
4
Support for Optimum Inference?
#58
jens-ghc
closed
8 months ago
13
Refactor batching to os.fork / multiprocessing
#57
michaelfeil
opened
9 months ago
0
Refactor: async usage in BatchHandler
#56
michaelfeil
opened
9 months ago
1
Support Apple Metal
#55
michaelfeil
closed
8 months ago
4
update dockerfile and tests
#54
michaelfeil
closed
9 months ago
1