instructlab / taxonomy

Taxonomy tree that will allow you to create models tuned with your data
Apache License 2.0

Update knowledge yaml docs #635

Closed shivchander closed 5 months ago

shivchander commented 6 months ago

As per the discussion in https://github.com/instruct-lab/cli/issues/702, we have a new model for how knowledge contributions should be formatted. Could we update the current knowledge branch of the taxonomy to reflect this?

shivchander commented 6 months ago

Sample YAML structure:

created_by: ...
domain: ...
seed_examples:
- answer: |
    chat, download, generate, init, diff, serve, test and train
  question: |
    What sub-commands does `lab` have?
- answer: |
    `lab train` works locally on Linux (with or without GPUs) and on MacBooks with M chips. There is also a notebook that can be used to train and test the model in the cloud.
  question: |
    What options are there for training using the InstructLab CLI?
- answer: |
    Run `lab download`
  question: |
    How do I download the model using the InstructLab CLI?
- answer: |
    Run `lab init`
  question: |
    How do I initialize the workspace using the InstructLab CLI?
task_description: 'How to use the `lab` command for the InstructLab CLI'
document:
  repo: https://github.com/instruct-lab/cli
  commit: 951999a
  pattern: README.md

bjhargrave commented 6 months ago

document:
  repo: https://github.com/instruct-lab/cli
  commit: 951999a
  pattern: README.md

Does `pattern` allow globbing, e.g. `docs/**/*.md`? Or multiple values, e.g. `['file1.md', 'file2.md']`?

xukai92 commented 6 months ago

Yes, `pattern` will allow globbing. The intention is to have a consistent way to point to either a single file or multiple files under a folder.
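
For illustration only, here is a rough sketch of how a glob-style `pattern` could resolve to either one file or several files under a folder. It uses Python's standard `pathlib` globbing as a stand-in; the thread does not say how the InstructLab tooling actually matches patterns, so the helper name `resolve_pattern` and the sample directory layout are assumptions.

```python
import tempfile
from pathlib import Path


def resolve_pattern(repo_root: Path, pattern: str) -> list[str]:
    """Hypothetical helper: return repo-relative paths of files matching a glob pattern."""
    return sorted(
        p.relative_to(repo_root).as_posix()
        for p in repo_root.glob(pattern)
        if p.is_file()
    )


# Build a throwaway directory tree that stands in for a checked-out repo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs" / "guide").mkdir(parents=True)
    (root / "README.md").write_text("readme")
    (root / "docs" / "index.md").write_text("index")
    (root / "docs" / "guide" / "train.md").write_text("train")

    # A plain file name selects a single file.
    single = resolve_pattern(root, "README.md")
    # A recursive glob selects every Markdown file under docs/.
    nested = resolve_pattern(root, "docs/**/*.md")

print(single)
print(nested)
```

With `pathlib` semantics, `**` matches zero or more directory levels, so `docs/**/*.md` picks up Markdown files directly under `docs/` as well as those in nested folders. Whether the real `pattern` field behaves the same way would need to be confirmed against the CLI.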