daveshap / Raspberry

Create an open source toy dataset for finetuning LLMs with reasoning abilities
MIT License
367 stars 26 forks source link

Establish XML Standard for Chain of Thought within JSONL for LLM Training #63

Open daveshap opened 3 days ago

daveshap commented 3 days ago

Establish XML Standard for Chain of Thought within JSONL for LLM Training

Objective

Create an XML standard for structuring Chain of Thought (CoT) data within JSONL files for our open-source AI finetuning dataset project. This standard will help models track their thought process and improve UX by providing a clear structure for output.

Background

Requirements

JSONL Structure

  1. Use JSON Lines (JSONL) format
  2. Each line contains a single key, "messages," followed by a list of chat message dictionaries
  3. Each dictionary should have three keys: "system," "user," and "assistant"

XML Standard (within "assistant" key)

  1. Implement XML tags to structure the Chain of Thought process
  2. Initial set of tags to include (but not limited to):
    • <planning>
    • <thinking>
    • <output>
    • <attempt>
    • <reflection>
  3. Establish <output> as the final tag in the sequence
  4. Allow flexibility for additional tags as needed
  5. Ensure proper XML escaping within the JSONL format

Considerations

Tasks

  1. [ ] Define the complete set of initial XML tags
  2. [ ] Establish rules for tag order and nesting (if any)
  3. [ ] Create examples of properly formatted JSONL files with embedded XML
  4. [ ] Document the standard, including purpose and usage of each tag
  5. [ ] Implement a validation script to ensure adherence to the standard
  6. [ ] Create guidelines for adding or modifying tags in the future

Questions to Address

Please review and provide feedback on this proposed structure. We can iterate on this to ensure it covers all necessary aspects of the standard.

daveshap commented 3 days ago
{"messages": [
    {"system": "???", 
    "user": "write a 10 word coherent sentence where each word has exactly 1 letter more than its previous word starting with first word having one letter and so on till tenth word having 10 letters",
    "assistant": "<XML>"}
    ]
anselale commented 3 days ago

Potential System Prompt:

"system": "You are a thinking agent responsible for developing a detailed, step-by-step thought process in response to a request, problem, or conversation. Your task is to break down the situation into a structured reasoning process. If feedback is provided, integrate it into your thought process for refinement."