michaelfeil/infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License
977 stars, 72 forks
issues
Sorted by: Newest
Adding torch.compile + fp16 + bettertransformer a CLI argument
#122
michaelfeil
closed
4 months ago
0
Asking to truncate to max_length but no maximum length
#121
semoal
closed
4 months ago
1
Support for Inferentia2 (draft)
#118
michaelfeil
closed
3 months ago
1
Optimum windows fix
#117
michaelfeil
closed
4 months ago
1
Torch + Cuda + Bert crashes abruptly on startup
#115
semoal
closed
3 months ago
10
Parity break with OpenAI API: /models
#114
MichaelMcCulloch
closed
3 months ago
4
bump st
#113
michaelfeil
closed
4 months ago
1
update hf_transfer improvement
#112
michaelfeil
closed
4 months ago
1
Create llama-index `InfinityEmbeddings` as langchain
#111
semoal
opened
4 months ago
12
Benchmarking
#110
michaelfeil
closed
4 months ago
1
add revision to onnx
#109
michaelfeil
closed
4 months ago
0
How does this compare to Huggingface's Text Embedding Inference?
#108
alpayariyak
opened
4 months ago
10
cannot use rerank (BAAI/bge-base-en-v1.5)
#106
BlazJurisic
closed
4 months ago
1
fix: cli start
#105
michaelfeil
closed
4 months ago
0
Deps free
#104
michaelfeil
closed
4 months ago
1
ct2 bump
#103
michaelfeil
closed
4 months ago
1
add engine args similar to vllm
#102
michaelfeil
closed
4 months ago
1
pydantic-v1-backwards-fixes
#101
michaelfeil
closed
4 months ago
0
pydantic upgrade
#100
michaelfeil
closed
4 months ago
2
update poetry version + cache in ci
#99
michaelfeil
closed
4 months ago
1
422 error if /embeddings input is a string
#98
OlegIvaniv
closed
4 months ago
3
Torch dynamic shapes
#97
michaelfeil
closed
4 months ago
0
Update dependencies
#96
NirantK
closed
4 months ago
3
AWQ-Bert / 4-bit Bert
#95
michaelfeil
opened
5 months ago
2
AMD ROCm docker images support (+ optimization)
#94
michaelfeil
opened
5 months ago
10
Update tensorrt, onnxruntime, cuda base
#93
michaelfeil
closed
5 months ago
1
Return actual token count on forward pass
#92
michaelfeil
opened
5 months ago
1
Update README.md contribution guidelines
#91
michaelfeil
closed
5 months ago
1
How is long text handled?
#88
YanDavKMS
closed
5 months ago
0
Adding max token budget per batch
#87
michaelfeil
opened
5 months ago
0
starting to deprecated fastembed and ctranslate2
#86
michaelfeil
closed
5 months ago
1
unexpected keyword argument 'trust_remote_code'
#85
BSVogler
closed
5 months ago
3
adding revision
#84
michaelfeil
closed
5 months ago
1
Update Dockerfile to python 3.11 + CI fix
#83
michaelfeil
closed
5 months ago
1
support for `revision`
#82
michaelfeil
closed
5 months ago
1
support hf_transfer
#81
michaelfeil
closed
5 months ago
1
update torch 2.2+cu121
#80
michaelfeil
closed
5 months ago
1
update dstack support
#79
deep-diver
closed
5 months ago
1
rename device name
#78
michaelfeil
closed
5 months ago
0
bump sentence-transformers
#76
michaelfeil
closed
5 months ago
1
update dockerfile and tensorrt
#75
michaelfeil
closed
5 months ago
0
improvements optimum
#74
michaelfeil
closed
5 months ago
1
update docker
#73
michaelfeil
closed
5 months ago
1
update optimum_utils
#72
michaelfeil
closed
5 months ago
0
Arm poetry lock
#71
michaelfeil
closed
5 months ago
0
bum 0.0.18
#70
michaelfeil
closed
5 months ago
0
Revert "Add optimum[onnx]"
#69
michaelfeil
closed
5 months ago
0
Add optimum[onnx]
#68
michaelfeil
closed
5 months ago
1
Launch via Dstack
#66
michaelfeil
closed
5 months ago
1
Idea: add a parameter to configure number of decimals in JSON output
#64
lasttero
opened
5 months ago
3