[Feature request]: AI help + prompt search

stoerr commented 2 weeks ago

What do you need?

Since there are quite a lot of prompts and the functionality isn't trivial, I suggest building in AI-supported help. For instance, in my https://aigenpipeline.stoerr.net/ I bundled the documentation into the executable and added a --helpai argument that sends the documentation together with the user's question to ChatGPT (or whatever backend is configured) and lets it recommend usage and command lines. Also, there are so many prompts that it's difficult to find the one you're looking for. Each prompt could be summarized in one sentence, and a command line option could let the user describe what they need, send all the prompt summaries together with that description to an LLM, and let it recommend which prompt to use. (One might also use embeddings etc. if the number of prompts is too large, or pick a shortlist of the best prompts and choose the right one in a second step, but I guess that's a good start as it is.)
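
The embeddings variant mentioned in parentheses could look roughly like the sketch below. It assumes each prompt summary has already been turned into a vector by some embedding backend; the `PatternEmbedding` type, the `TopK` helper, and the toy three-dimensional vectors are all hypothetical and only illustrate the ranking step, not any existing fabric API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// PatternEmbedding pairs a pattern name with a precomputed embedding vector.
// The type is hypothetical; real vectors would come from an embedding backend.
type PatternEmbedding struct {
	Name   string
	Vector []float64
}

// cosine returns the cosine similarity of a and b, or 0 when the lengths
// differ or either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK returns the names of the k patterns most similar to the query vector.
func TopK(query []float64, patterns []PatternEmbedding, k int) []string {
	type scored struct {
		name  string
		score float64
	}
	scores := make([]scored, 0, len(patterns))
	for _, p := range patterns {
		scores = append(scores, scored{p.Name, cosine(query, p.Vector)})
	}
	sort.SliceStable(scores, func(i, j int) bool { return scores[i].score > scores[j].score })
	if k < 0 {
		k = 0
	}
	if k > len(scores) {
		k = len(scores)
	}
	names := make([]string, 0, k)
	for _, s := range scores[:k] {
		names = append(names, s.name)
	}
	return names
}

func main() {
	// Toy vectors, made up for illustration only.
	patterns := []PatternEmbedding{
		{"summarize", []float64{1, 0, 0}},
		{"extract_wisdom", []float64{0, 1, 0}},
		{"write_essay", []float64{0.9, 0.1, 0}},
	}
	fmt.Println(TopK([]float64{1, 0, 0}, patterns, 2)) // [summarize write_essay]
}
```

The shortlist returned by `TopK` could then be handed to the LLM as the second filtering step, which keeps the request small even when the prompt library grows.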

BTW: such prompts / questions are something that could easily be dictated as well. But that can admittedly also be done by an external tool like chatgptdictate.

atljoseph commented 2 weeks ago

I’m actually playing around with a way to get the main cli to suggest something today. Writing a bash script to have fabric output a summary/description markdown file for each pattern. Then using that file as metadata to send to ai to suggest a pattern name to use. Looking at the Pattern struct, it seems like there is an intended Description field for each pattern, but nothing populates it. Thinking such a markdown file could fill the void. FYI, all the text files from all the patterns ended up being 160,000+ tokens, so boiling it down to a description is gonna be key.

atljoseph commented 2 weeks ago

Try this. I spent a few hours on it today.

Add OptionalSummaryFile to db/db.go:

db.Patterns = &Patterns{
    Storage:                &Storage{Label: "Patterns", Dir: db.FilePath("patterns"), ItemIsDir: true},
    OptionalSummaryFile:    "summary.md",
    SystemPatternFile:      "system.md",
    UniquePatternsFilePath: db.FilePath("unique_patterns.txt"),
}

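Add/update this code in db/patterns:

import (
    // Standard-library packages used by GetPattern below (add any that are missing).
    "context"
    "errors"
    "io/fs"
    "os"
    "path/filepath"

    "github.com/pieterclaerhout/go-waitgroup"
)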
...
// GetPattern finds a pattern by name and returns the pattern as an entry or an error
func (o *Patterns) GetPattern(name string) (ret *Pattern, err error) {
    ret = &Pattern{
        Name: name,
    }

    // Read the system prompt and the optional summary concurrently.
    // The second return value of NewErrorGroup is discarded and err is still
    // nil at this point, so there is nothing to check before adding work.
    ctx := context.Background()
    wg, _ := waitgroup.NewErrorGroup(ctx, -1)
    wg.Add(func() error {
        content, err := os.ReadFile(filepath.Join(o.Dir, name, o.SystemPatternFile))
        if err != nil {
            return err
        }
        ret.Pattern = string(content)
        return nil
    })
    wg.Add(func() error {
        content, err := os.ReadFile(filepath.Join(o.Dir, name, o.OptionalSummaryFile))
        if errors.Is(err, fs.ErrNotExist) {
            // The summary is optional, so a missing file is not an error.
            return nil
        }
        if err != nil {
            return err
        }
        ret.Description = string(content)
        return nil
    })
    if err := wg.Wait(); err != nil {
        return nil, err
    }
    return ret, nil
}
...
// GetAll loads every pattern, keeping the order returned by GetNames
func (o *Patterns) GetAll() (ret []Pattern, err error) {
    names, err := o.Storage.GetNames()
    if err != nil {
        return nil, err
    }
    ctx := context.Background()
    wg, _ := waitgroup.NewErrorGroup(ctx, 5)
    ret = make([]Pattern, len(names))
    for i, name := range names {
        i, name := i, name // per-iteration copies for the closure (needed before Go 1.22)
        wg.Add(func() error {
            pattern, err := o.GetPattern(name)
            if err != nil {
                // pattern is nil on error, so return before dereferencing it
                return err
            }
            // Each goroutine writes only to its own index, so no mutex is required.
            ret[i] = *pattern
            return nil
        })
    }
    if err := wg.Wait(); err != nil {
        return nil, err
    }
    return ret, nil
}
...

Add this flag to cli/flags.go in the Flags struct:

Suggest                 string  `long:"suggest" description:"Provide a question or request so that a pattern may be suggested"`

Add this code to cli/cli.go:

    // if the suggest flag is set, build a message asking the model to suggest patterns
    if len(currentFlags.Suggest) > 0 {
        suggestionRequest := currentFlags.Suggest
        patterns, err := fabricDb.Patterns.GetAll()
        if err != nil {
            return message, err
        }
        patterndata := make([]string, len(patterns))
        for i, p := range patterns {
            info := fmt.Sprintf("\n\nPATTERN_NAME: %s", p.Name)
            if len(p.Description) > 0 {
                info += fmt.Sprintf("\nPATTERN_DESCRIPTION:\n```\n%s\n```", p.Description)

            }
            patterndata[i] = info
        }

        currentFlags.Temperature = 0
        currentFlags.Message = fmt.Sprintf(`

==============================
# GOAL
==============================
You are a minimalist, but you listen very closely to small details.
Your main mantra is to provide the least amount of information to the user.
You value communicating through visual structure.
You are knowledgeable on many topics, but you don't know everything.
As a natural researcher, you never tire of people asking your honest opinion.
Your job is to help the user to quickly land on a pattern to utilize.

==============================
# USER REQUEST FOR SUGGESTION
==============================
%[2]s

==============================
# PATTERNS - Do not suggest any pattern that is not provided below.
==============================
%[1]s

==============================
# FORMAT
==============================
- List a MAXIMUM of 5 suggestions.
- An entire suggestion should fit into a SINGLE line of text.
- Each suggestion should be PRECEDED by a confidence percentage score.
- Suggestions with a score lower than 80%% should be discarded.
- Minimal output text. Percent and pattern name only.

        `, strings.Join(patterndata, "\n"), suggestionRequest)
    }

Generate a summary.md file for each pattern with this bash script:

set -e

# EXAMPLE USAGE: bash ./generate_pattern_summaries.sh --custom
# EXAMPLE USAGE: bash ./generate_pattern_summaries.sh

# By default, assume we are in the repo, updating `summary.md` files in the `patterns` dir.
patternsdir=${1:-./patterns}

# Parse options, and possibly override patternsdir.
# (No `shift` here: it has no effect on a `for` loop over "$@".)
for arg in "$@"; do
    case $arg in
        --custom)
            patternsdir="${HOME}/.config/fabric/patterns"
            echo "Will update user config dir: $patternsdir"
            ;;
        *)
            patternsdir="$arg"
            ;;
    esac
done

# Get summaries from AI.
outfile_basename="summary.md"
for dir in "$patternsdir"/*; do
    if [ ! -d "$dir" ]; then
        continue
    fi
    patternname="$(basename "$dir")"
    patternbanner="$(echo -e "\n\n\n=======================\nPATTERN: $patternname\n=======================")"
    echo "$patternbanner"
    echo "DIR: $dir"
    patterncontent="$(echo -e "$patternbanner")"
    count=0
    for file in "$dir"/*; do
        # if [ $count -ge 2 ]; then
        #     break
        # fi
        file_basename="$(basename "$file")"
        if [[ ! -s "$file" ]] || ! grep -q '[^[:space:]]' "$file"; then
            echo -e "SKIPPED: $file is empty."
            continue
        fi
        if [ "$(echo "$file_basename" | tr '[:upper:]' '[:lower:]')" == "$outfile_basename" ]; then
            echo -e "SKIPPED: $file will be overwritten."
            continue
        fi
        echo -e "IN_FILE: $file"
        patterncontent+="\n\n$(echo -e "<<<PATTERN_FILE_START pattern=\"$patternname\" name=\"$file_basename\">>>\n$(cat "$file")\n<<<PATTERN_FILE_END>>>")"
        count=$((count + 1))
        sleep 0.5
    done
    outfile="$dir/$outfile_basename"
    echo "OUT_FILE: $outfile"
    echo -e "$patterncontent"
    # summarize_prompt
    patternsummary="$(fabric --pattern summarize_prompt "$patterncontent")"
    echo -e "OUT_SUMMARY:\n$patternsummary"
    echo -e "# $patternname\n\n$patternsummary" > "$outfile"
done

Then, run one of these:

go run . -s --suggest "how do i find out about batman"

fabric -s --suggest "how do i find out about batman"

Example output:

Here are five suggestions based on the provided patterns:

95% create_academic_paper
90% extract_wisdom_dm
85% summarize_git_diff
92% write_nuclei_template_rule
88% improve_report_finding

It isn't perfect. Needs improved prompting. But it's something to build on.
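
Until the prompting improves, the model's reply could be checked on the client side. The following is a rough sketch, assuming the reply uses the "NN% pattern_name" line format shown in the example output; `Suggestion` and `ParseSuggestions` are hypothetical names, not part of fabric. It drops lines that are malformed, below the threshold, or name a pattern that does not exist, and sorts the rest by score.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Suggestion is one parsed "NN% pattern_name" line.
type Suggestion struct {
	Score int
	Name  string
}

// ParseSuggestions reads lines like "95% create_academic_paper", drops anything
// malformed, outside minScore..100, or not in known, and sorts by score descending.
func ParseSuggestions(output string, known map[string]bool, minScore int) []Suggestion {
	var out []Suggestion
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 || !strings.HasSuffix(fields[0], "%") {
			continue // intro sentences, blank lines, and other noise
		}
		score, err := strconv.Atoi(strings.TrimSuffix(fields[0], "%"))
		if err != nil || score < minScore || score > 100 || !known[fields[1]] {
			continue
		}
		out = append(out, Suggestion{Score: score, Name: fields[1]})
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	raw := "Here are five suggestions based on the provided patterns:\n\n" +
		"85% summarize_git_diff\n95% create_academic_paper\n" +
		"92% not_a_real_pattern\n70% extract_wisdom_dm\n"
	known := map[string]bool{
		"create_academic_paper": true,
		"summarize_git_diff":    true,
		"extract_wisdom_dm":     true,
	}
	for _, s := range ParseSuggestions(raw, known, 80) {
		fmt.Printf("%d%% %s\n", s.Score, s.Name)
	}
}
```

In a real integration the `known` set would presumably be built from the names returned by GetAll, so hallucinated pattern names never reach the user.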

Sorry, wasn't feeling up to doing all the committing... This is copy-pasta-bowl, though :)

@danielmiessler or @johnconnor-sec

stoerr commented 1 week ago

Nice! Some more ideas:

Another way to include such metadata about prompts would be to use markdown front matter, where you could also include other stuff like tags, creation date, whatnot. That could look like this:
```
---
title: "someprompt"
date: 2024-09-03
tags:
- Markdown
- Front Matter
---
here comes the actual markdown for the file
```
That'd only work for future prompts, though, and makes most sense if the prompt writer wants to specify something there and everything is mostly in one file. There could also be a one-line description and a paragraph about what the prompt is for (usable both for manual inspection and for the prompt suggestion itself). But of course that can also be in a separate file. If you introduce that, you might also consider whether there will be more metadata in the future (e.g. tags might be a nice idea, too) and prepare a place for such future extensions.
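
Reading such front matter back out is straightforward. The sketch below is a minimal, standard-library-only illustration: `SplitFrontMatter` and `Field` are hypothetical helpers, and `Field` only understands flat top-level `key: value` lines, so lists such as `tags` would need a real YAML parser.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitFrontMatter separates a leading "---" delimited block from the body.
// If the text has no complete front matter block, meta is empty and body is
// the whole text.
func SplitFrontMatter(text string) (meta, body string) {
	const delim = "---"
	lines := strings.Split(text, "\n")
	if strings.TrimSpace(lines[0]) != delim {
		return "", text
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == delim {
			return strings.Join(lines[1:i], "\n"), strings.Join(lines[i+1:], "\n")
		}
	}
	return "", text // opening delimiter without a closing one
}

// Field returns the value of a top-level "key: value" line with surrounding
// whitespace and double quotes removed, or "" if the key is absent.
func Field(meta, key string) string {
	for _, line := range strings.Split(meta, "\n") {
		k, v, ok := strings.Cut(line, ":")
		if ok && k == key {
			return strings.Trim(strings.TrimSpace(v), `"`)
		}
	}
	return ""
}

func main() {
	doc := "---\ntitle: \"someprompt\"\ndate: 2024-09-03\n---\nhere comes the actual markdown for the file\n"
	meta, body := SplitFrontMatter(doc)
	fmt.Println(Field(meta, "title")) // someprompt
	fmt.Print(body)                   // here comes the actual markdown for the file
}
```

Because files without front matter come back unchanged, existing prompts would keep working while new ones opt in.
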
Personally, I'd include the description, or at least a one-sentence / one-line description of the prompt, in the file, or maybe a reason why the prompt seems to fit instead. The directory name alone doesn't carry much information. Generating a reason might even improve the rating (CoT).
If such summaries are AI generated from changing content (in case the prompt evolves significantly), then regenerating the summary can be either a manual operation or automatic. That's something I wrote the mentioned AI Generation Pipeline for: the SHA256 hash of the input file(s) is saved in a comment in the AI-generated output. That's an idea you could use here, too, but if prompts don't evolve that much, it doesn't seem too necessary.
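
A rough sketch of that hash-in-a-comment idea, applied to a generated summary.md, might look like this. The comment marker and the helper names (`HashInputs`, `Stamp`, `NeedsRegeneration`) are made up for illustration and are not the format aigenpipeline actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashPrefix is a hypothetical marker, not the format aigenpipeline uses.
const hashPrefix = "<!-- source-sha256: "

// HashInputs returns the hex SHA-256 over all inputs. Each input is written
// with a length prefix so that ("a", "bc") and ("ab", "c") hash differently.
func HashInputs(inputs ...string) string {
	h := sha256.New()
	for _, in := range inputs {
		fmt.Fprintf(h, "%d:%s", len(in), in)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Stamp appends a comment holding the input hash to a generated summary.
func Stamp(summary, hash string) string {
	return summary + "\n" + hashPrefix + hash + " -->\n"
}

// NeedsRegeneration reports whether existing lacks a stamp matching hash.
func NeedsRegeneration(existing, hash string) bool {
	return !strings.Contains(existing, hashPrefix+hash+" -->")
}

func main() {
	system := "# IDENTITY\nYou summarize text."
	hash := HashInputs(system)
	summary := Stamp("# summarize\n\nSummarizes text.", hash)

	fmt.Println(NeedsRegeneration(summary, hash))                        // false
	fmt.Println(NeedsRegeneration(summary, HashInputs(system+" edit"))) // true
}
```

With a check like this, a summary script would only need to call the LLM for patterns whose input files changed since the last run.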

atljoseph commented 1 week ago

I didn’t focus on the prompt. Just on the doing part. I also didn’t do it with your project in mind, sorry. Not familiar with it. Not enough hours in the day. Did it the simplest way based on the fabric code that was already written.

Tools like this deserve to have users not sift through all the patterns manually.

Also the tool you mentioned seems to be a competing tool to fabric… is that an accurate statement?

atljoseph commented 1 week ago

Checked it out. Pretty cool tool. Written in Java, right? Somewhere between fabric and aider, but that comparison doesn’t do it justice for sure.

stoerr commented 1 week ago

Well, it's a competing tool only in the sense of being a tool that uses an LLM API for something, which is a pretty large space. There are quite a few tools, each with its own focus. Besides the LLM backend access, fabric provides an extensive library of excellent prompts, which none of my tools do. The aigenpipeline focuses on processing a number of files with an LLM using a prompt, or going through a chain of AI-generated files that depend on each other, while making sure reprocessing only occurs when necessary because of input or prompt changes. There are others, like the Swiss-army-knife-type toolsuite llm, my toolsuite for various OpenAI services, aider, and my Co-Developer GPT engine (which is conceptually pretty close to aider but done as a GPT for OpenAI's ChatGPT), and that's just naming what I personally use regularly. Those can even interoperate: if you add an option to just output a prompt to stdout or a file, you could easily plug fabric into any of the other tools I mentioned. Come to think of it, I'll write a suggestion for that, as it'd be really nice to have aigenpipeline run prompts from fabric or pull a fabric prompt into a Co-Developer GPT session in ChatGPT.

Thanks so much for providing this super nice tool!

danielmiessler / fabric

[Feature request]: AI help + prompt search #904
