Glavin001 / Data2AITextbook

🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)
MIT License

Paraphrasing / Question Rewriting #4

Open Glavin001 opened 1 year ago

Glavin001 commented 1 year ago

Paraphrasing

Question Rewriting

Glavin001 commented 1 year ago

This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, and word order information for the problem of paraphrase identification.
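To make the word-order point concrete, here is a minimal sketch (not part of this repo, and not the dataset's own evaluation code). The helper names `bag_of_words_overlap` and `bigram_overlap` and the sentence pair are made up for illustration. It shows how an order-blind similarity score treats two sentences with different meanings as identical, while even a crude order-aware score separates them.

```python
from collections import Counter


def bag_of_words_overlap(a: str, b: str) -> float:
    """Multiset Jaccard overlap on lowercased words; ignores word order."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    shared = sum((ca & cb).values())  # per-word minimum counts
    total = sum((ca | cb).values())   # per-word maximum counts
    return shared / total if total else 1.0


def bigram_overlap(a: str, b: str) -> float:
    """Multiset Jaccard overlap on adjacent word pairs; sensitive to order."""
    def bigrams(text: str) -> Counter:
        words = text.lower().split()
        return Counter(zip(words, words[1:]))

    ca, cb = bigrams(a), bigrams(b)
    shared = sum((ca & cb).values())
    total = sum((ca | cb).values())
    return shared / total if total else 1.0


# Hypothetical pair: same words, different meaning.
s1 = "flights from New York to Florida"
s2 = "flights from Florida to New York"

print(bag_of_words_overlap(s1, s2))  # 1.0: looks like a perfect paraphrase
print(bigram_overlap(s1, s2))        # 0.25: the reordering is visible
```

Any paraphrase or question-rewriting step that is filtered with a purely lexical overlap score would accept pairs like this one, so an order-aware check (or a model trained on such adversarial pairs) seems worth considering.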

Glavin001 commented 1 year ago

Evaluation