latitudegames / AIDungeon

Infinite adventures await!
http://www.aidungeon.io/
MIT License

[BUG] Buffer overflow when text+remember_text gets too long? #251

Open Sporking opened 4 years ago

Sporking commented 4 years ago

🐛 Bug Report

I have been seeing signs of what appears to be a buffer overflow or similar correctness problem, which seems to occur when the length of the submitted text (the N most recent entries plus the 'remember' text) exceeds some threshold. The symptom is that the output starts out fine, but then suddenly degenerates into garbage that has nothing to do with either the submitted text or the remembered text. For instance, if my "remembered" text is very long, I can get responses like these:

> !"My will is set." you say, with resolution.

Welcome to... The world is full,...(A. Your life, the world is a...The world is tooThe world is. The world is The world is > The world is The world is The world Is The world Is Not.

If I revert the last response and try again, something very similar happens:

> !"I don't know how to leave." you respond.

The priest looks... The priest nods.

>

"Welcome to playfully, you say'The priest smiles. (...It's...The priest sighsThis is very:As—THE The priest nods. It"

Note that "Welcome to" seems to be a valid way to start a response to my statement, but that after that, the text devolves into gibberish. The type of gibberish involved often seems to involve strange characters like brackets or parentheses used in nongrammatical ways, and it often cuts phrases abruptly after a few words, only to start a different phrase, in ways that don't look like the kinds of mistakes that the AI tends to make. Some of the things I have seen seem possibly to result from either the AI trying to parse JSON-formatted data (as opposed to English text) directly (possibly due to reading past the length of submitted text into the JSON surrounding code that follows it), or from the Python code trying to read data out of a buffer past the point at which the buffer contains valid data, and hence getting whatever garbage might still be sitting in the memory there.

It is quite easy to reproduce this problem by using the "Remember" button and adding a large block of text. (In my case I added 70 lines of selected text from The King In Yellow, which, to be fair, has a reputation for producing insanity in the humans who read it and might be having a similar effect on AI Dungeon.) After entering this "Remember" text, I use the AI in the usual manner. After only a few submitted entries, AI Dungeon starts outputting garbled, ungrammatical text. The trigger appears to be the total length of the already-submitted text plus the length of the "remember" text: using "revert" a few times temporarily resolves the problem, until the text gets long again.

Some people have theorized that this problem is due to running out of memory, for instance in bug #68 or bug #138. These do not look like out-of-memory problems to me; at least some of them appear to be buffer overflow or underflow correctness problems.

Here are some hypotheses that someone more familiar with Python and/or TensorFlow could perhaps investigate. This sort of problem smells like a Python/C interfacing glitch, where the two sides disagree on the maximum allowed length of submitted text, on how to do bounds checking, or on how data should be truncated when it is too long. Out-of-bounds reads or writes are usually hard to achieve in a bounds-checked-by-default language like Python on its own (although invalid reads from circular buffers can certainly happen), but once native interfaces get involved, all bets are off. Roughly what I mean is the kind of explicit check sketched below.
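This is only a rough sketch, not the actual AIDungeon code: I am guessing at names like `MAX_CONTEXT_TOKENS` and `tokenize()`, and the real tokenizer and limit in this repo may differ (GPT-2 itself has a 1024-token context window). The point is just that any disagreement about the maximum allowed length should fail loudly on the Python side rather than silently in native code:

```python
MAX_CONTEXT_TOKENS = 1024  # assumption; the real limit belongs to the model

def tokenize(text):
    # Hypothetical stand-in for the real BPE tokenizer; a whitespace split is
    # only a rough approximation of the true token count.
    return text.split()

def check_prompt_length(remember_text, history_entries):
    # Combine the remembered text with the recent history, then verify the
    # total length before anything is handed to the model.
    prompt = remember_text + "\n" + "\n".join(history_entries)
    n_tokens = len(tokenize(prompt))
    if n_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError(
            f"Prompt is {n_tokens} tokens, exceeding the assumed "
            f"{MAX_CONTEXT_TOKENS}-token limit; it must be truncated first."
        )
    return prompt
```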

As you can see, I lack the specific expertise in the language and libraries used here to fully diagnose the problem (and I don't have a fancy graphics card, so I can't really build this from source anyway), but it is fairly clear to me that there is a serious code-correctness problem. If I submit more text than the AI can handle, the text should simply be truncated (from the front, along whitespace boundaries) so that the AI is never given more than it can handle at a time, and I should then get back somewhat reasonable output. Instead, when the submitted text is too long, I suddenly start getting total garbage that looks like the contents of a buffer which has been repeatedly overwritten with text and/or JSON constructs. A rough sketch of the truncation I have in mind follows.
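Something like this is all I am asking for (the character limit is a made-up placeholder; the real limit should be measured in model tokens):

```python
MAX_PROMPT_CHARS = 4000  # placeholder; the real limit should be in model tokens

def truncate_front(text, max_chars=MAX_PROMPT_CHARS):
    # Drop text from the FRONT, breaking only at whitespace, so the model
    # always sees the most recent context plus the remembered text and never
    # more than it can handle.
    if len(text) <= max_chars:
        return text
    truncated = text[-max_chars:]
    # Advance to the next whitespace boundary so we never start mid-word.
    first_space = truncated.find(" ")
    if first_space != -1:
        truncated = truncated[first_space + 1:]
    return truncated
```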

I am currently using the web interface, by the way, but I have seen similar things happen with the iPhone app.

Sporking commented 4 years ago

Here is another example:

"Yes, yes," the priest says. "If you would only leave this place, you could escape your fate." > !"I cannot. I must know." you respond. The priest looks... > The priest nods. "There is a:(The king is silent. [Still,the king will beImage"I amn'tThis is still...

Something interesting I noticed is that in all three of my examples there is some initial text that looks reasonable, followed by an ellipsis ("..."), followed by garbage. This is odd, because neither the text I entered normally nor my "remember" text contains an ellipsis. I wonder if there is some feature in TensorFlow and/or AI Dungeon and/or GPT-2 that truncates output when it exceeds an allowed length and places an ellipsis there to mark the truncation, and if the code that reads from the buffer is incorrectly reading past the ellipsis (perhaps because no null terminator was placed after it?) and displaying text that was never supposed to be part of the output.
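If such a marker really exists, I would expect the code that reads the buffer to stop there, roughly like this (purely illustrative; I don't know whether an ellipsis marker is actually used anywhere in the generation code):

```python
TRUNCATION_MARKER = "..."  # assumption: "..." marks truncated output

def trim_at_marker(generated_text):
    # Keep everything up to and including the first marker; never read past it.
    idx = generated_text.find(TRUNCATION_MARKER)
    if idx == -1:
        return generated_text
    return generated_text[:idx + len(TRUNCATION_MARKER)]
```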