Strip out reliance on the transformers library

rhysdg / vision-at-a-clip

Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts


Closed rhysdg closed 2 months ago

rhysdg commented 2 months ago

Stripping back to a 'lite' version of Hugging Face transformers, so to speak, with no hub and only the necessary/adjusted modules.
Work is underway on the following branch: https://github.com/rhysdg/vision-at-a-clip/tree/feat-onnx-tokenizer
Inference is passing after stripping out all transformers imports; reinstating pytests as we speak.
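One way to keep the dependency from creeping back in while the pytests are reinstated is a guard test that blocks `transformers` and checks that the package still imports. The following is only a minimal sketch, not code from the branch: `block_import`, `imports_cleanly_without`, and `TARGET_MODULE` are hypothetical names, and `json` stands in for the real package so the snippet runs anywhere.

```python
import importlib
import sys
from contextlib import contextmanager

# Placeholder: swap in the package under test (name assumed, not taken from the repo).
TARGET_MODULE = "json"


@contextmanager
def block_import(name):
    """Make `import name` raise ImportError inside the with-block."""
    # Remember and remove any cached copies of the blocked package.
    saved = {k: v for k, v in sys.modules.items()
             if k == name or k.startswith(name + ".")}
    for key in saved:
        del sys.modules[key]
    # A None entry in sys.modules makes the import system raise ImportError.
    sys.modules[name] = None
    try:
        yield
    finally:
        sys.modules.pop(name, None)
        sys.modules.update(saved)


def imports_cleanly_without(module_name, blocked="transformers"):
    """Return True if module_name imports while `blocked` is unavailable."""
    sys.modules.pop(module_name, None)  # force a fresh top-level import
    with block_import(blocked):
        try:
            importlib.import_module(module_name)
        except ImportError:
            return False
    return True


def test_target_imports_without_transformers():
    # pytest collects this; it fails if the target imports transformers at import time.
    assert imports_cleanly_without(TARGET_MODULE)
```

This only catches imports that run at module import time; a lazy `import transformers` inside a function would need an inference-level test to surface.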