jupyter-book / mystmd

Command line tools for working with MyST Markdown.
https://mystmd.org/guide
MIT License

Create a site content file designed for reading by LLMs #1647

Open choldgraf opened 1 week ago

LLMs read some kinds of content better than others, and they handle Markdown particularly well. Making MyST documents easier for LLMs to read could let our authors and readers use MyST content more effectively in their workflows. Currently, there is no easy way to do this short of manually bundling a book's content into a single Markdown file.

User story

As an author, I want to expose (and select) my MyST content so that it can be easily discovered and used in LLM workflows by readers.

An effort to standardize LLM text files for static sites

The Answer.AI (and FastAI) team has a proposal to standardize the use of an llms.txt file to signal to web scrapers which content is "meant" for reading by LLMs. We could follow this convention, or something similar to it, as part of the MyST build process.
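
To make the idea concrete, here is a rough sketch of what generating such a file might look like. It assumes the layout described in the proposal (published as `/llms.txt`): an H1 title, a blockquote summary, then H2 sections containing lists of Markdown links with optional notes. The `Page` class and `build_llms_txt` function are hypothetical names for illustration only; they are not part of MyST or any existing tool, and a real implementation would live in the MyST build pipeline rather than in a standalone script.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Page:
    """One page of a built site (hypothetical structure, not a MyST type)."""

    title: str
    url: str
    description: str = ""


def build_llms_txt(site_title: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt-style Markdown index from a site's pages.

    Assumed layout: H1 title, blockquote summary, then one H2 per section
    with a bullet list of links, each optionally followed by ": note".
    """
    lines = [f"# {site_title}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.description:
                entry += f": {page.description}"
            lines.append(entry)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Example data is made up; URLs point at a placeholder domain.
    pages = {
        "Docs": [
            Page("Intro", "https://example.com/intro.md", "Overview of the book"),
            Page("Reference", "https://example.com/reference.md"),
        ]
    }
    print(build_llms_txt("My Book", "A short example site.", pages))
```

Authors could then choose which pages go into `sections`, which would cover the "select" part of the user story.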