rom1504 / embedding-reader

Efficiently read embeddings in streaming from any filesystem
MIT License

build dedicated package, depending on this to create clip subset #27

Open rom1504 opened 2 years ago

rom1504 commented 2 years ago

same idea as https://github.com/rom1504/embedding-reader/blob/main/examples/clip_zero_shot_inference.py

API: getdataset <list of prompts> <output_folder> <embedding> <meta>

In 15h it gets you up to 5B samples of whatever you want.
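
A minimal sketch of how the proposed `getdataset` command could be structured, to make the API above concrete. Everything here is an assumption for illustration: the flag names, the `--threshold` option, and the `parse_args` and `select_matches` helpers are hypothetical and not part of embedding-reader. The sketch uses plain Python lists so it runs on its own; in a real package the batches would presumably be streamed with embedding-reader and the prompt embeddings computed with a CLIP text encoder, as in the linked zero-shot example.

```python
import argparse
from typing import Iterable, Iterator, List, Sequence, Tuple

Vector = Sequence[float]
# One batch = (embeddings, metadata rows), aligned by position.
Batch = Tuple[List[Vector], List[dict]]


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse a hypothetical CLI mirroring the proposed arguments."""
    parser = argparse.ArgumentParser(prog="getdataset")
    parser.add_argument("--prompts", nargs="+", required=True,
                        help="text prompts describing the wanted subset")
    parser.add_argument("--output_folder", required=True)
    parser.add_argument("--embedding", required=True,
                        help="location of the embedding files")
    parser.add_argument("--meta", required=True,
                        help="location of the metadata files")
    # Hypothetical knob: minimum similarity needed to keep a sample.
    parser.add_argument("--threshold", type=float, default=0.3)
    return parser.parse_args(argv)


def dot(a: Vector, b: Vector) -> float:
    """Dot product; equals cosine similarity when both vectors are unit-norm."""
    return sum(x * y for x, y in zip(a, b))


def select_matches(
    batches: Iterable[Batch],
    prompt_embeddings: Sequence[Vector],
    threshold: float,
) -> Iterator[dict]:
    """Yield metadata rows whose embedding matches at least one prompt."""
    for embeddings, metadata in batches:
        for emb, meta in zip(embeddings, metadata):
            best = max(dot(emb, p) for p in prompt_embeddings)
            if best >= threshold:
                yield meta
```

Because `select_matches` consumes batches lazily, memory use stays bounded by the batch size regardless of how many embeddings are scanned. For scale, scanning 5B samples in 15 hours would correspond to roughly 93k samples per second (5e9 / 54,000 s).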