koaning / embetter

just a bunch of useful embeddings
https://koaning.github.io/embetter/
MIT License
469 stars, 15 forks

Finally start work on `prodigy-embetter` #66

Closed by koaning 1 year ago

koaning commented 1 year ago
python -m prodigy textcat.emb.manual <dataset> <examples.jsonl> --labels --loader --anchors --exclusive
python -m prodigy image.clip.by_text <dataset> <examples.jsonl> --labels --loader --anchors --exclusive --remove-base64
python -m prodigy image.clip.by_image <dataset> <examples.jsonl> --labels --loader --anchors --exclusive --remove-base64
koaning commented 1 year ago

After working on the "frontpage" project, I no longer think this is the best approach. Calculating the embeddings on the fly is expensive, and it may be better to compute them once and query a simple ANN (approximate nearest neighbour) index instead.
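
To illustrate the trade-off, here is a minimal sketch of the "compute once, then query an index" idea, using only the Python standard library. It is not code from embetter or Prodigy: `HyperplaneLSH` is a hypothetical toy class, and the random vectors stand in for embeddings that would really come from a text or image encoder. A real setup would more likely use a dedicated ANN library.

```python
import random
from collections import defaultdict


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class HyperplaneLSH:
    """Toy ANN index (hypothetical): random-hyperplane locality-sensitive hashing."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit to a vector's bucket key.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _key(self, vec):
        # The side of each hyperplane the vector falls on forms the hash key.
        return tuple(dot(plane, vec) >= 0 for plane in self.planes)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        self.buckets[self._key(vec)].append(item_id)

    def query(self, vec, k=5):
        # Only items in the query's bucket are scored, which is what makes
        # the search approximate: true neighbours in other buckets are missed.
        candidates = self.buckets.get(self._key(vec), [])
        query_norm = dot(vec, vec) ** 0.5 or 1.0

        def cosine(item_id):
            stored = self.vectors[item_id]
            stored_norm = dot(stored, stored) ** 0.5 or 1.0
            return dot(stored, vec) / (stored_norm * query_norm)

        return sorted(candidates, key=cosine, reverse=True)[:k]


# Stand-in for precomputed embeddings (assumption: random 16-dim vectors).
rng = random.Random(42)
embeddings = {f"doc-{i}": [rng.gauss(0, 1) for _ in range(16)] for i in range(200)}

# Build the index once, up front.
index = HyperplaneLSH(dim=16)
for item_id, vec in embeddings.items():
    index.add(item_id, vec)

# Querying is then cheap and needs no further encoder calls.
neighbours = index.query(embeddings["doc-3"], k=5)
```

The expensive step (embedding every example) happens once when the index is built; each later lookup only hashes the query and scores a single bucket. The cost is recall, since a query never sees items that landed in a different bucket.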