wlinds opened 1 year ago
Orca Mini 3B implemented with GPT4All
https://huggingface.co/TheBloke/orca_mini_3B-GGML
Takes ~3 min to process ONE article on an M1 processor.
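For reference, the current setup probably looks something like the sketch below, using the GPT4All Python bindings. The model filename and the `build_summary_prompt` helper are assumptions for illustration; substitute the actual GGML file from the repo linked above.

```python
def build_summary_prompt(article: str) -> str:
    # Hypothetical helper: wraps the article in an Orca-style instruction prompt.
    return (
        "### Instruction:\nSummarize the following article in a few sentences.\n"
        f"### Input:\n{article}\n"
        "### Response:\n"
    )

def summarize(article: str, model_file: str = "orca-mini-3b.ggmlv3.q4_0.bin") -> str:
    # Lazy import so the prompt helper works without gpt4all installed.
    # model_file is an assumed name; use the file you actually downloaded.
    from gpt4all import GPT4All

    model = GPT4All(model_file)
    return model.generate(build_summary_prompt(article), max_tokens=256)
```

Running a 3B general-purpose instruction model through this path is what costs the ~3 minutes per article.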
This model is too general; it should be replaced with a model fine-tuned for token completion or text summarization. We can browse Hugging Face for candidates.
Use a pre-trained summarization model to create summaries "offline" on your own machine, eliminating API costs. However, researching and implementing this, and getting it to work, could be challenging.

There are various summarization models with different capabilities. Some are specialized but may not produce different styles of summary, such as technical vs. non-technical. Large language models (LLMs) can produce various summary styles, but they might be too big to run on a laptop, while smaller LLMs may not produce high-quality summaries. If we can prove the concept with a small model first, we can later upgrade to a larger model to improve summary quality.
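A minimal sketch of the offline-summarization idea, assuming the `transformers` library and `sshleifer/distilbart-cnn-12-6` (one small, widely used summarization checkpoint; any other would do). The `chunk_text` helper is an assumption to work around the model's input-length limit:

```python
def chunk_text(text: str, max_words: int = 400) -> list[str]:
    # Split the article into word-bounded chunks that fit the model's context.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def summarize_offline(article: str) -> str:
    # Lazy import so the chunking helper is usable without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
    pieces = [
        summarizer(chunk, max_length=130, min_length=30, do_sample=False)[0]["summary_text"]
        for chunk in chunk_text(article)
    ]
    return " ".join(pieces)
```

The first call downloads the checkpoint; after that everything runs locally, so we can benchmark it against the ~3 min/article GPT4All figure before deciding whether to step up to a larger model.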