tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.
https://crfm.stanford.edu/2023/03/13/alpaca.html
Apache License 2.0
29.56k stars 4.06k forks

What about factual errors in `alpaca_data.json`? #114

Open jowagner opened 1 year ago

jowagner commented 1 year ago

Would PRs to main fixing errors in the output field of data set items be welcome? If so, the readme should clearly state that the data set is a work in progress and that researchers should refer to a specific commit or release in their publications. Or would you prefer to collect corrections in a special branch?

Here is an example of an error:

    {
        "instruction": "Find the second derivative of the following equation: y = 4x^2 + 3x \u2013 5",
        "input": "",
        "output": "The second derivative of the given equation is 8x + 3."
    },

That's the first derivative. The second one would be just 8.
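A quick way to double-check this by hand is to differentiate the coefficient list twice. This is only a minimal sketch for polynomials; the helper name `derivative` is made up for illustration and is not part of the repository.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as coefficients, lowest degree first."""
    # d/dx of c * x^k is k * c * x^(k-1); the constant term (k = 0) drops out.
    return [k * c for k, c in enumerate(coeffs)][1:]


# y = 4x^2 + 3x - 5, written as coefficients of x^0, x^1, x^2
y = [-5, 3, 4]

first = derivative(y)       # coefficients of 8x + 3
second = derivative(first)  # coefficients of the constant 8

print("first derivative coefficients:", first)
print("second derivative coefficients:", second)
```

The first list corresponds to 8x + 3 (the value currently in the output field) and the second to the constant 8.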

(Also, one could argue that the use of EN DASH (U+2013) as a minus sign in the instruction field is an error. However, it's a good thing to train the model to be able to deal with such errors, as they can be expected to occur in real-life user inputs.)
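For reference, the character can be identified with Python's standard `unicodedata` module, and a variant of the instruction with a plain hyphen-minus can be produced with `str.translate`. This is just a sketch: the `MINUS_LIKE` table is an assumption about which characters one might want to map, not something defined by the data set.

```python
import unicodedata

instruction = (
    "Find the second derivative of the following equation: "
    "y = 4x^2 + 3x \u2013 5"
)

# Look up the official Unicode name of the character used as a minus sign.
dash = "\u2013"
dash_name = unicodedata.name(dash)
print(f"U+{ord(dash):04X} is {dash_name}")

# Hypothetical mapping of dash-like characters to ASCII hyphen-minus.
MINUS_LIKE = {
    "\u2013": "-",  # EN DASH
    "\u2014": "-",  # EM DASH
    "\u2212": "-",  # MINUS SIGN
}
normalized = instruction.translate(str.maketrans(MINUS_LIKE))
print(normalized)
```

Keeping both the original and the normalized wording as separate items would preserve the noisy input for training while also covering the clean form.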

In addition to, or as an alternative to, fixing entries in place, one could extend the data format so that each item has an `id` field, an `author` field (set to `self-instruct-on-text-davinci-003` for the initial data), a `date` field, an optional `derived_from` field referencing the ID of the item it is based on, and a `derived_by` field that specifies how the new item was derived, e.g. by fixing an error or by varying some detail such as the minus sign in the above example.

Building a simple web app for users who do not have the git/GitHub skills but would like to contribute corrections or new/derived data set items may also pay off, given the wide interest this technology currently receives. Or users can create a new issue (or comment here while this issue is still open).
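To make the proposal concrete, here is a rough sketch of what such an extended item could look like, using a Python dataclass. The field names follow the suggestion above; the ID scheme, the dates, the contributor name and the `derived_by` value are placeholders invented for the example.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass
class Item:
    # Existing fields of alpaca_data.json
    instruction: str
    input: str
    output: str
    # Proposed additional fields
    id: str
    author: str
    date: str
    derived_from: Optional[str] = None  # ID of the item this one is based on
    derived_by: Optional[str] = None    # how it was derived, e.g. an error fix


original = Item(
    instruction="Find the second derivative of the following equation: "
                "y = 4x^2 + 3x \u2013 5",
    input="",
    output="The second derivative of the given equation is 8x + 3.",
    id="item-0001",                      # hypothetical ID scheme
    author="self-instruct-on-text-davinci-003",
    date="1970-01-01",                   # placeholder date
)

# A corrected item that records where it came from and why it exists.
fixed = replace(
    original,
    output="The second derivative of the given equation is 8.",
    id="item-0001-fix1",                 # hypothetical ID scheme
    author="example-contributor",        # placeholder name
    date="1970-01-02",                   # placeholder date
    derived_from=original.id,
    derived_by="fix-error",              # hypothetical label
)

print(json.dumps([asdict(original), asdict(fixed)], indent=4))
```

With this layout the original item stays in the data untouched, so publications that used it remain reproducible, and a consumer can choose to keep only items that no other item lists in `derived_from`.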

Opinions?

Edit: suggested value for author field

Edit2: suggestion to post errors found in the data in an issue here

eccstartup commented 1 year ago

[image]

misterinterrupt commented 1 year ago

    {
        "instruction": "Illustrate the phrase \"no way out\".",
        "input": "No input required",
        "output": "![No Way Out](https://www.wallfoy.com/wallpaper/661/no-way-out.jpg)"
    },

URLs are not really guaranteed to be valid or safe, but maybe answering with random URLs is good enough for training this kind of thing.

The URL in this data leads to some unexpected and sketchy-looking redirects. Is this the intended illustration of the phrase? I can't imagine that the experience of being trapped in an endless redirect was the intention here. If it was, this artist's work may be a better / safer example: https://permanent-redirect.xyz/
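One way to find such entries for manual review would be to list every URL that appears in an output field. The sketch below does this offline with a simple regular expression and does not fetch anything; the pattern and the `find_urls` helper are illustrative assumptions, and in practice `items` would be loaded from `alpaca_data.json` with `json.load`.

```python
import re
from urllib.parse import urlparse

# Deliberately simple pattern: stops at whitespace, brackets and parentheses
# so that markdown links like ![alt](url) yield just the URL.
URL_RE = re.compile(r"https?://[^\s()<>\]]+")


def find_urls(items):
    """Return (item index, host, url) for every URL found in an output field."""
    hits = []
    for index, item in enumerate(items):
        for url in URL_RE.findall(item.get("output", "")):
            hits.append((index, urlparse(url).netloc, url))
    return hits


items = [
    {
        "instruction": 'Illustrate the phrase "no way out".',
        "input": "No input required",
        "output": "![No Way Out](https://www.wallfoy.com/wallpaper/661/no-way-out.jpg)",
    },
    {
        "instruction": "Find the second derivative of the following equation: "
                       "y = 4x^2 + 3x \u2013 5",
        "input": "",
        "output": "The second derivative of the given equation is 8x + 3.",
    },
]

hits = find_urls(items)
for index, host, url in hits:
    print(f"item {index}: {host} -> {url}")
```

Grouping the hits by host would give a short list of domains to check by hand before deciding whether to keep, replace or drop the affected items.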

Side note: what are the implications for theory of mind when a class of possible future 'intelligences' has been built on models trained with hyperlinks leading to phishing or other exploits? Perhaps this will lead to unique evolutionary/survival traits or hidden aspects of personality. It probably just means that some future Boston Dynamics dog weapon could have a 0.000000001% chance of evaluating this part of its language model. Does that mean that it would end up transferring all of its money to a prince if the dissidents that it is hunting yell 'no way out' before being assassinated? Careful.