argilla-io / argilla

Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
https://argilla-io.github.io/argilla/latest/

[DOCS] Add tutorials pages to new documentation #4951

Open davidberenstein1957 opened 2 months ago

davidberenstein1957 commented 2 months ago

Feel free to reuse as much as possible of these tutorials, but this is also a good excuse to review and rewrite them.

Some things to keep in mind:

Good examples: https://haystack.deepset.ai/tutorials/27_first_rag_pipeline

### Tasks
- [ ] Bootstrapping textcat with few-shot SetFit and potentially sentence-transformers for semantic search.
- [ ] Bootstrapping spancat with zero-shot and few-shot GLiNER and NER.
- [ ] Bootstrapping a project with LLMs using spacy-llm or other methods such as llama-index or prompt engineering (feel free to choose what you want).
- [ ] Multi-modal project with sentence-transformers and bulk labelling of images, PDFs, etc.
- [ ] Monitor for data, model, and annotator drift with BERTopic and TextDescriptives.
- [ ] RAG: optimize retrievers and rerankers with haystack and sentence-transformers.
- [ ] RAG: optimize LLMs with haystack and trl.
- [ ] Instruction-tuning an LLM: SFT with TRL
- [ ] Preference tuning an LLM: DPO with TRL

nataliaElv commented 1 month ago

@davidberenstein1957 What should we do with this issue regarding the transfer?

burtenshaw commented 1 month ago

@nataliaElv move this to 2.1.