instructlab / instructlab

InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.
https://instructlab.ai
Apache License 2.0

`lab generate` keeps repeating the same data #443

Closed shaneboulden closed 3 months ago

shaneboulden commented 4 months ago

Describe the bug

I created a seed_examples YAML file for a new grounded, compositional skill. The `lab generate` command keeps outputting the same data.

Input given at the prompt

$ lab list
compositional_skills/programming/stackrox/generate_policy/qna.yaml

$ cat taxonomy/compositional_skills/programming/stackrox/generate_policy/qna.yaml
seed_examples:
  - question: |
      Generate me a StackRox policy that detects CVE-2021-44228 in a container image at build time
    answer:  |
      Here is a StackRox policy that detects CVE-2021-44228 in a container image at build-time:

      {
        "policies": [
            {
                "name": "Log4Shell: CVE-2021-44228 - log4j Remote Code Execution vulnerability",
                "description": "Alert on deployments with images containing the Log4Shell vulnerability (CVE-2021-44228). This is a flaw in the Java logging library Apache Log4j in versions from 2.0.0 and before as well as version 2.14.1.",
                "rationale": "This vulnerability allows a remote attacker to execute code on the server if the system logs an attacker-controlled string value with the attacker's JNDI LDAP server lookup.",
                "remediation": "Update the log4j libary to version 2.15.0 (which disables the feature by default), 2.16.0 (which removes the functionality) or later. If not possible to upgrade, there are two possible mitigations for this flaw in versions from 2.10 to 2.14.1: Set the system property log4j2.formatMsgNoLookups to true, or remove the JndiLookup class from the classpath.",
                "disabled": false,
                "categories": [
                    "Vulnerability Management"
                ],
                "fields": null,
                "lifecycleStages": [
                    "BUILD"
                ],
                "eventSource": "NOT_APPLICABLE",
                "whitelists": [],
                "exclusions": [],
                "scope": [],
                "severity": "CRITICAL_SEVERITY",
                "enforcementActions": [],
                "notifiers": [],
                "lastUpdated": null,
                "SORTName": "",
                "SORTLifecycleStage": "",
                "SORTEnforcement": false,
                "policyVersion": "1.1",
                "policySections": [
                    {
                        "sectionName": "",
                        "policyGroups": [
                            {
                                "fieldName": "CVE",
                                "booleanOperator": "OR",
                                "negate": false,
                                "values": [
                                    {
                                        "value": "CVE-2021-44228"
                                    }
                                ]
                            }
                        ]
                    }
                ],
                "mitreAttackVectors": [],
                "criteriaLocked": true,
                "mitreVectorsLocked": true,
                "isDefault": true
            }
        ]
      }

  - question: |
      Generate me a StackRox policy that detects CVE-2021-4034 in a container image at build and deploy-time
    answer:  |
      Here is a StackRox policy that detects CVE-2021-4034 in a container image at build and deploy time:

      {
        "policies": [
          {
              "name": "PwnKit: CVE-2021-4034 - Polkit local privilege escalation vulnerability",
              "description": "Alert on deployments with images containing the PwnKit vulnerability (CVE-2021-4034). This is a local privilege escalation vulnerability impacting the pkexec binary.",
              "rationale": "This vulnerability allows an unprivileged local attacker to escalate privileges, bypassing any authentication and policies due to incorrect handling of the process argument vector.",
              "remediation": "Update the polkit package to 0.112-26.el7_9.1 (for RHEL 7-based container images) or polkit-0.115-13.el8_5.1 (for RHEL 8-based container images).",
              "disabled": false,
              "categories": [
                  "Vulnerability Management"
              ],
              "fields": null,
              "lifecycleStages": [
                  "BUILD",
                  "DEPLOY"
              ],
              "eventSource": "NOT_APPLICABLE",
              "whitelists": [],
              "exclusions": [],
              "scope": [],
              "severity": "CRITICAL_SEVERITY",
              "enforcementActions": [],
              "notifiers": [],
              "lastUpdated": null,
              "SORTName": "",
              "SORTLifecycleStage": "",
              "SORTEnforcement": false,
              "policyVersion": "1.1",
              "policySections": [
                  {
                      "sectionName": "",
                      "policyGroups": [
                          {
                              "fieldName": "CVE",
                              "booleanOperator": "OR",
                              "negate": false,
                              "values": [
                                  {
                                      "value": "CVE-2021-4034"
                                  }
                              ]
                          }
                      ]
                  }
              ],
              "mitreAttackVectors": [],
              "criteriaLocked": true,
              "mitreVectorsLocked": true,
              "isDefault": false
            }
          ]
      }

  - question: |
      Generate me a StackRox policy that detects the polkit package in a container image at build or deployment
    answer:  |
      Here is a StackRox policy that detects the polkit package in a container image at deployment:

      {
          "policies": [
              {
                  "name": "Polkit in Image",
                  "description": "Alert on deployments with Polkit present",
                  "rationale": "Leaving privileged administration tools like Polkit in an image potentially allows attackers to escalate privileges within the container.",
                  "remediation": "Use your package manager's \"remove\" command to remove polkit packages from the image build for production containers.",
                  "disabled": false,
                  "categories": [
                      "Security Best Practices"
                  ],
                  "fields": null,
                  "lifecycleStages": [
                      "DEPLOY"
                  ],
                  "eventSource": "NOT_APPLICABLE",
                  "whitelists": [],
                  "exclusions": [],
                  "scope": [],
                  "severity": "LOW_SEVERITY",
                  "enforcementActions": [],
                  "notifiers": [],
                  "SORTName": "",
                  "SORTLifecycleStage": "",
                  "SORTEnforcement": false,
                  "policyVersion": "1.1",
                  "policySections": [
                      {
                          "sectionName": "",
                          "policyGroups": [
                              {
                                  "fieldName": "Image Component",
                                  "booleanOperator": "OR",
                                  "negate": false,
                                  "values": [
                                      {
                                          "value": "polkit="
                                      },
                                      {
                                          "value": "policykit-1="
                                      }
                                  ]
                              }
                          ]
                      }
                  ],
                  "mitreAttackVectors": [],
                  "criteriaLocked": false,
                  "mitreVectorsLocked": false,
                  "isDefault": false
              }
          ]
      }

Then I ran `lab generate`.
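Before generating, it can help to confirm that the JSON embedded in each seed answer actually parses, since a stray bracket is easy to miss in examples this long. The following is a minimal sketch using only the Python standard library. The helper name `extract_policy_json` and the abbreviated `sample_answer` are hypothetical and are not part of the InstructLab CLI; the sketch assumes each answer is free text followed by a single JSON object, as in the seed examples above.

```python
import json


def extract_policy_json(answer: str) -> dict:
    """Parse the JSON object embedded in a seed answer.

    Assumes the answer is free text followed by one JSON object.
    Raises ValueError if no object is found or the JSON is malformed.
    """
    start = answer.find("{")
    if start == -1:
        raise ValueError("no JSON object found in answer")
    # raw_decode parses one JSON value and ignores any trailing text.
    policy, _end = json.JSONDecoder().raw_decode(answer[start:])
    return policy


# Abbreviated, hypothetical stand-in for one of the seed answers.
sample_answer = """Here is a StackRox policy that detects the polkit package:

{
    "policies": [
        {"name": "Polkit in Image", "lifecycleStages": ["DEPLOY"]}
    ]
}
"""

policy = extract_policy_json(sample_answer)
print(policy["policies"][0]["lifecycleStages"])
```

A malformed answer raises `json.JSONDecodeError` (a subclass of `ValueError`), so both failure cases can be caught the same way.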

Response that was received

$ cat taxonomy/train_merlinite-7b-Q4_K_M_2024-03-08T09\:42\:31.jsonl
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}
{"system": "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", "user": "3. Question: What is the maximum size of a single container in the cluster?", "assistant": "The maximum size of a single container in the cluster is 16 GiB."}

Response that was expected

Different permutations of the example JSON policies
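The repetition in the generated file can be quantified with a short script. This is a rough sketch, not part of the `lab` CLI: the function name `count_repeated_samples` is hypothetical, and it assumes each line is a JSON object with `user` and `assistant` keys, as in the output shown above. The sample lines are shortened placeholders rather than real generated data.

```python
import json
from collections import Counter


def count_repeated_samples(jsonl_lines):
    """Count how often each (user, assistant) pair occurs in JSONL records."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[(record["user"], record["assistant"])] += 1
    return counts


# Shortened placeholder records in the same shape as the generated file.
sample_lines = [
    '{"system": "s", "user": "q1", "assistant": "a1"}',
    '{"system": "s", "user": "q1", "assistant": "a1"}',
    '{"system": "s", "user": "q2", "assistant": "a2"}',
]

counts = count_repeated_samples(sample_lines)
unique, total = len(counts), sum(counts.values())
print(f"{unique} unique of {total} samples")
```

To run it against a real output file, pass an open file handle (for example `open(path, encoding="utf-8")`) in place of `sample_lines`. A file like the one above would report a single unique sample.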

xukai92 commented 3 months ago

looks like a bug to me, but this sounds like knowledge, not a skill

shaneboulden commented 3 months ago

Thanks @xukai92!

I assumed that the knowledge to create formatted JSON files from example input would already be supported by the model. So, the skill then would be building new StackRox policies (simply JSON files) from the example policies provided.

Or is there more foundational knowledge required to support this skill?

xukai92 commented 3 months ago

> the knowledge to create formatted JSON files from example input

creating formatted JSON is more like a skill

if the model does not know what those policies are (e.g. "a StackRox policy that detects XXX in a container image"), it cannot invoke such a skill to create a JSON.

seems to me such policies are not common sense (correct me if I'm wrong) but (domain-specific) knowledge. so to achieve what you want, you will probably need to provide a StackRox manual on these policies and add them via knowledge support. there is experimental support on the CLI side you can play with (see https://github.com/instruct-lab/cli/pull/429#issuecomment-1985103225 but you don't need to pass --document any more)

xukai92 commented 3 months ago

we have official knowledge support on the CLI side now: https://github.com/instruct-lab/cli/releases/tag/v0.12.0

please give it a try and reopen this issue if the problem persists.