rhysdg/vision-at-a-clip
Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts
Bug/feat - SigLIP full model handling
#4
Closed
rhysdg
closed
2 months ago
rhysdg
commented
2 months ago
Reworking usage to automatically handle both models that use cosine similarity with softmax and models trained with a sigmoid loss
Exposing text and image encoders for all models for manual usage
Adding an .inference method allowing for automatic logits and probability handling per model
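To illustrate the difference between the two scenarios, here is a minimal sketch and not the repository's actual code. The function name `inference`, the `kind` argument, and the default `scale` and `bias` values are assumptions made up for illustration. It shows that softmax-style models normalise scaled cosine similarities across all prompts, while sigmoid-loss models such as SigLIP score each prompt independently after adding a bias.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def inference(image_emb, text_embs, kind, scale=100.0, bias=0.0):
    """Hypothetical helper: return (logits, probabilities) for one image.

    kind="softmax": probabilities compete across prompts and sum to 1.
    kind="sigmoid": each prompt gets an independent probability.
    scale and bias stand in for a model's learned logit scale and bias.
    """
    sims = [cosine_similarity(image_emb, t) for t in text_embs]
    if kind == "softmax":
        logits = [scale * s for s in sims]
        return logits, softmax(logits)
    if kind == "sigmoid":
        logits = [scale * s + bias for s in sims]
        return logits, [sigmoid(v) for v in logits]
    raise ValueError(f"unknown kind: {kind!r}")


if __name__ == "__main__":
    image = [1.0, 0.0]
    prompts = [[1.0, 0.0], [0.0, 1.0]]
    print(inference(image, prompts, "softmax"))
    print(inference(image, prompts, "sigmoid", scale=10.0, bias=-5.0))
```

In the sigmoid case the probabilities need not sum to 1, so several prompts (or none) can score highly for the same image, which is why the two model families need different post-processing.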
Environment
Ubuntu 22.04 - RTX 3080, 8-core
Incoming Changes:
Gradio example
Model warmup and benchmarks
Deprecating Transformers library