Open Ekhorn opened 7 months ago
The tch crate can be used; see the examples: https://github.com/LaurentMazare/tch-rs/tree/main/examples
The mnist example seems like what is needed here.
The mnist example contains training code rather than basic inference code; see https://github.com/LaurentMazare/tch-rs/blob/main/examples/pretrained-models/main.rs instead.
Pre-trained models can be found here: https://github.com/onnx/models/tree/main/validated/vision/classification/mnist
Otherwise https://huggingface.co/models?sort=trending&search=mnist might also have something interesting.
For now, https://github.com/huggingface/candle/blob/main/candle-examples/examples/onnx/main.rs may be simpler to use if only inference is needed.
After trying to work with candle, it seems that the library is not as flexible.
There are some more libraries listed on https://www.arewelearningyet.com/neural-networks/.
There is now also Burn, which has a simple example of how to run inference with an mnist.onnx model: https://github.com/tracel-ai/burn/blob/main/examples/onnx-inference/build.rs.
Another interesting one could be: https://github.com/sonos/tract
To run inference through WGPU, https://github.com/webonnx/wonnx might be interesting.
Description
Spaced would benefit from an image-to-text recognition feature that lets the user select any part of the screen and collect the text present in it.
The model can be loaded on the backend, and inference can be triggered by calling a Tauri command. Taking a screenshot of any part of the screen could be part of Spaced, but screenshot tools often suffice, and just pasting from the clipboard is good enough.
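A rough sketch of how the backend side could be structured, using only the standard library so it stays independent of whichever inference crate is chosen. The names `TextRecognizer`, `OcrState`, `recognize_text` and `DummyRecognizer` are hypothetical and are not part of Spaced, Tauri, or any of the crates above. In a real Tauri app the function would presumably carry `#[tauri::command]` and receive the state through `tauri::State`; that wiring is left out so the sketch compiles on its own.

```rust
use std::sync::Mutex;

/// Hypothetical abstraction over the chosen inference crate
/// (tch, candle, Burn, tract, wonnx). Not an API of any of them.
pub trait TextRecognizer: Send {
    /// Takes raw 8-bit grayscale pixels and returns the recognized text.
    fn recognize(&self, width: u32, height: u32, pixels: &[u8]) -> Result<String, String>;
}

/// Backend state: the model is loaded once and shared between calls.
/// Assumption: in a Tauri app this would be registered as managed state.
pub struct OcrState {
    model: Mutex<Box<dyn TextRecognizer>>,
}

impl OcrState {
    pub fn new(model: Box<dyn TextRecognizer>) -> Self {
        Self { model: Mutex::new(model) }
    }
}

/// Body of the command the frontend would invoke.
/// Errors are plain strings so they can be serialized back to the frontend.
pub fn recognize_text(
    state: &OcrState,
    width: u32,
    height: u32,
    pixels: Vec<u8>,
) -> Result<String, String> {
    // Reject buffers whose size does not match the stated dimensions.
    let expected = (width as usize) * (height as usize);
    if pixels.len() != expected {
        return Err(format!("expected {expected} pixels, got {}", pixels.len()));
    }
    let model = state.model.lock().map_err(|e| e.to_string())?;
    model.recognize(width, height, &pixels)
}

/// Stand-in model used only to exercise the plumbing; it does no real OCR.
pub struct DummyRecognizer;

impl TextRecognizer for DummyRecognizer {
    fn recognize(&self, width: u32, height: u32, _pixels: &[u8]) -> Result<String, String> {
        Ok(format!("{width}x{height} image"))
    }
}
```

Keeping the model behind a trait means the command body does not change if the inference crate is swapped later; only the type passed to `OcrState::new` would differ.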
Requirements