tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.
https://crfm.stanford.edu/2023/03/13/alpaca.html
Apache License 2.0

prompt-less is better #143

Open graylan0 opened 1 year ago

graylan0 commented 1 year ago

I think this is a bug. The following keyword blacklist


```python
blacklist = [
    "image",
    "images",
    "graph",
    "graphs",
    "picture",
    "pictures",
    "file",
    "files",
    "map",
    "maps",
    "draw",
    "plot",
    "go to",
    "video",
    "audio",
    "music",
    "flowchart",
    "diagram",
]
```

should not filter out these instructions, because the model can represent and simulate this kind of visual data in text.

This is demonstrated by prompts such as:

"Can you please copy my output and draw ASCII art"

or

"create a matrix thought experiment"

"create a logic tree, create a flow chart in markdown"
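
To make the effect concrete, here is a minimal sketch of how a keyword blacklist like the one above drops instructions. The helper names `contains_blacklisted_word` and `filter_instructions` are hypothetical, and whole-word, case-insensitive matching is an assumption for illustration, not necessarily what the repository does.

```python
import re

# Same keywords as the blacklist quoted above.
BLACKLIST = [
    "image", "images", "graph", "graphs", "picture", "pictures",
    "file", "files", "map", "maps", "draw", "plot", "go to",
    "video", "audio", "music", "flowchart", "diagram",
]


def contains_blacklisted_word(instruction, blacklist=BLACKLIST):
    """Return True if any blacklisted keyword occurs as a whole word.

    Assumption: matching is case-insensitive and bounded by \\b.
    """
    return any(
        re.search(r"\b" + re.escape(word) + r"\b", instruction, flags=re.IGNORECASE)
        for word in blacklist
    )


def filter_instructions(instructions, blacklist=BLACKLIST):
    """Keep only the instructions that contain no blacklisted keyword."""
    return [i for i in instructions if not contains_blacklisted_word(i, blacklist)]


examples = [
    "Can you please copy my output and draw ASCII art",
    "create a matrix thought experiment",
    "create a logic tree, create a flow chart in markdown",
    "Summarize the following paragraph in one sentence",
]

kept = filter_instructions(examples)
for instruction in examples:
    status = "kept" if instruction in kept else "dropped"
    print(f"{status}: {instruction}")
```

Under these assumptions the ASCII art prompt is dropped because it contains "draw", even though the task can be answered entirely in text.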

claysauruswrecks commented 1 year ago

Yep, we have cleaned up this data and removed these limitations in the first pass: https://github.com/gururise/AlpacaDataCleaned/pull/9