CambioML / uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering

LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
https://www.cambioml.com
Apache License 2.0
187 stars, 56 forks

Add auto splitter advanced for huggingface config #220

Closed. ZHIHANCHEN03 closed this 8 months ago

goldmermaid commented 8 months ago

Great work, @ZHIHANCHEN03. One small comment: in the future, you can remove load_dotenv() from any huggingface notebook.