iryna-kondr / scikit-llm

Seamlessly integrate LLMs into scikit-learn.
https://beastbyte.ai/
MIT License
3.29k stars, 269 forks

Long Documents for Summarization -> Zero-Shot Multi-Label Classification #73

Closed: reversingentropy closed this issue 11 months ago

reversingentropy commented 11 months ago

1) How do I summarize an exceptionally long, book-sized document? I want to create a summary and then run the LLM classifier on it. The "summary of summaries" approach takes too much time.
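
For reference, here is a minimal sketch of what the "summary of summaries" approach looks like. It is not scikit-llm code: `split_into_chunks` and `summarize_long_text` are hypothetical helpers, `summarize` stands in for whatever function sends one chunk to the LLM, and the limit is counted in characters rather than tokens to keep the example simple.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    length = 0
    for word in text.split():
        # One extra character for the joining space, except for the first word.
        added = len(word) + (1 if current else 0)
        if current and length + added > max_chars:
            chunks.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += added
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long_text(
    text: str,
    summarize: Callable[[str], str],  # hypothetical: one LLM call per chunk
    max_chars: int = 8000,            # assumed budget, in characters
) -> str:
    """Summarize chunks, join the partial summaries, repeat until one call fits."""
    while len(text) > max_chars:
        partial = [summarize(chunk) for chunk in split_into_chunks(text, max_chars)]
        combined = " ".join(partial)
        if len(combined) >= len(text):
            # Guard against looping forever if the summaries do not shrink.
            raise ValueError("summaries are not getting shorter")
        text = combined
    return summarize(text)
```

Each pass makes one call per chunk, and the calls within a pass are independent of each other, which is why the sequential version is slow on a book-sized input.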

2) When working with a zero-shot multi-label classifier, is each text I want to classify sent to the LLM API as a separate request, or are multiple texts combined into one request? Specifically, when using a list of lists, would all the text inputs be included in a single prompt, or would they remain separate? If they are all aggregated, how do we manage text limits?