ggerganov / ggml

Tensor library for machine learning
MIT License

ggml : implement a spellcheck model (xfspell, t5-spellchecker, etc) #233

Open · walking-octopus opened this issue 1 year ago

walking-octopus commented 1 year ago

Apple recently announced a new transformer-based keyboard auto-correct and prediction feature.

xfspell seems to be an existing model that attempts this, so why not investigate whether it can be ported to ggml. If anyone knows of other models for predictive keyboards or auto-correct, please drop your suggestions here.

Perhaps this may even be a good test case for on-device QLoRA fine-tuning.

High-quality predictive keyboards and auto-correct in pure C++ would be useful for open-source mobile operating systems like Ubuntu Touch and privacy-focused Android ROMs; traditionally, such proposals have been rejected because of the excessive dependencies required for ML inference.

ggerganov commented 1 year ago

Great idea - we should do that!

walking-octopus commented 1 year ago

It seems there are other, less niche models for spelling correction, like t5-spellchecker or other BERT-based models. Since there's already been some work on T5, and bert.cpp exists (though it does not yet support decoding), efforts could be directed at those two instead, unless xfspell outperforms them in quality, ease of implementation, or resource usage.
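
For reference, here is a minimal sketch (not from the thread) of running such a T5-based spellchecker through Hugging Face transformers, e.g. to get a quality and latency baseline to compare a future ggml port against. The checkpoint id "Bhuvana/t5-base-spellchecker" is an assumption; substitute whichever model is actually being evaluated.

# Minimal reference run of a T5-based spellchecker via transformers.
# Assumption: checkpoint id "Bhuvana/t5-base-spellchecker"; some spellcheck
# checkpoints also expect a task prefix in the prompt.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "Bhuvana/t5-base-spellchecker"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Correct a misspelled sentence and print the model's suggestion.
text = "christmas is celbrated on decembr 25 evry ear"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))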

ggerganov commented 1 year ago

Ok, will add this to the roadmap to get some extra attention

gessha commented 1 year ago

I would like to give this a try.

SolsticeProjekt commented 11 months ago

While trying to figure out how to convert a small PyTorch-based model to ggml, I found this thread.

I wanted to emphasize that small models (under 1 GB) exist that provide great results for their specific tasks, without requiring multiple gigabytes of storage space and memory.

Thank you.

Ferruolo commented 3 months ago

I would like to finish this implementation. Do any of the people who have already attempted it have any recommendations?

lin72h commented 3 months ago

@Ferruolo please go ahead!

Ferruolo commented 3 months ago

Should changes go to llama.cpp or ggml?

ggerganov commented 3 months ago

Depends on the interface that will be exposed, but I suppose the ggml repo would be more suitable

fairydreaming commented 3 weeks ago

I checked t5-base-spellchecker and it works with #8141:

./llama-cli -m /mnt/md0/models/t5-base-spellchecker.gguf -p 'christmas is celbrated on decembr 25 evry ear'

...
llama_output_reserve: reallocating output buffer from size 0.13 MiB to 2.13 MiB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
 christmas is celebrated on december 25 every year [end of text]

llama_print_timings:        load time =      44.46 ms
llama_print_timings:      sample time =       1.22 ms /    11 runs   (    0.11 ms per token,  9001.64 tokens per second)
llama_print_timings: prompt eval time =      59.88 ms /    18 tokens (    3.33 ms per token,   300.58 tokens per second)
llama_print_timings:        eval time =     140.48 ms /    10 runs   (   14.05 ms per token,    71.18 tokens per second)
llama_print_timings:       total time =     255.65 ms /    28 tokens
Log end
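
As a hedged aside, a GGUF file like the one used above might be produced from the Hugging Face checkpoint with llama.cpp's convert_hf_to_gguf.py, assuming a local llama.cpp checkout with the T5 support from #8141. The checkpoint id "Bhuvana/t5-base-spellchecker" and the converter's name and flags are assumptions and may differ between llama.cpp versions.

# Sketch of producing t5-base-spellchecker.gguf for llama-cli.
# Assumptions: run from a llama.cpp checkout with T5 support; repo id and
# converter flags may differ by version.
import subprocess
from huggingface_hub import snapshot_download

# Fetch the checkpoint locally (repo id is an assumption).
model_dir = snapshot_download("Bhuvana/t5-base-spellchecker")

# Convert the checkpoint to GGUF with llama.cpp's converter script.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", model_dir,
     "--outfile", "t5-base-spellchecker.gguf"],
    check=True,
)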

Green-Sky commented 2 weeks ago

Just posted it here: https://github.com/ggerganov/llama.cpp/issues/8204, but there is now an example of deployed ggml spellchecking AND on-device finetuning!