Argilla is a collaboration platform for AI engineers and domain experts who require high-quality outputs, full data ownership, and overall efficiency.
Feel free to reuse as much of these tutorials as possible, but this is also a good opportunity to review and rewrite them.
Some things to keep in mind:
Start by identifying a real-world problem and/or dataset. It shouldn't be a toy example but something someone might actually search for.
Explain the data and the model, and emphasize why they are used. E.g., GLiNER works for zero-shot NER but is costly at inference time, so we can start with it and then move over to SpanMarker as a cost-efficient few-shot technique.
Evaluate and show results! E.g., we've optimized the RAG pipeline so we get better retrieval results, or we've optimized a NER model so it now classifies X.
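"Show results" can be as concrete as span-level precision/recall for the NER case. A minimal, library-free sketch of that metric (the gold/predicted spans below are made-up toy data, not from any real model):

```python
# Span-level NER evaluation: an entity counts as correct only on an
# exact (start, end, label) match.

def span_f1(gold, pred):
    """Return (precision, recall, f1) over sets of (start, end, label) spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                      # exact matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: the model got one span right and mislabelled the other.
gold = [(0, 6, "PERSON"), (11, 17, "ORG")]
pred = [(0, 6, "PERSON"), (11, 17, "LOC")]
p, r, f = span_f1(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # precision=0.50 recall=0.50 f1=0.50
```

Reporting before/after numbers like these for the zero-shot baseline vs. the few-shot model makes the cost/quality trade-off in the tutorial tangible.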
### Tasks
- [ ] Bootstrapping textcat with few-shot SetFit and potentially sentence-transformers for semantic search.
- [ ] Bootstrapping spancat/NER with zero-shot and few-shot GLiNER.
- [ ] Bootstrapping a project with LLMs using spacy-llm or other methods like llama-index, prompt engineering, etc. (feel free to choose what you want).
- [ ] Multi-modal project with sentence-transformers and bulk labelling of images/PDFs, etc.
- [ ] Monitor data, model, and annotator drift with BERTopic and TextDescriptives.
- [ ] RAG: optimize retrievers and rerankers with Haystack and sentence-transformers.
- [ ] RAG: optimize LLMs with Haystack and TRL.
- [ ] Instruction-tuning an LLM: SFT with TRL
- [ ] Preference tuning an LLM: DPO with TRL
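For the first task above, the semantic-search idea can be sketched without any model download: embed a handful of labelled examples plus the unlabelled pool, then propagate each unlabelled item's label from its nearest labelled neighbour. The 2-d vectors below are toy stand-ins for real sentence-transformers embeddings, purely for illustration:

```python
import math

# Few-shot label propagation via nearest-neighbour search: the core idea
# behind bootstrapping textcat annotations with semantic search.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def propagate_labels(labelled, unlabelled):
    """Give each unlabelled vector the label of its most similar labelled one."""
    suggestions = []
    for vec in unlabelled:
        best_vec, best_label = max(labelled, key=lambda item: cosine(vec, item[0]))
        suggestions.append(best_label)
    return suggestions

# Toy embeddings: two labelled seeds, two unlabelled candidates.
labelled = [([1.0, 0.1], "positive"), ([0.1, 1.0], "negative")]
unlabelled = [[0.9, 0.2], [0.0, 0.8]]
print(propagate_labels(labelled, unlabelled))  # ['positive', 'negative']
```

In the real tutorial these suggestions would be pushed to Argilla as pre-annotations for reviewers to accept or correct, rather than trusted blindly.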
A good example: https://haystack.deepset.ai/tutorials/27_first_rag_pipeline