latitudegames / AIDungeon

Infinite adventures await!
http://www.aidungeon.io/
MIT License

[BUG] Buffer overflow when text+remember_text gets too long? #251

Open Sporking opened 4 years ago

Sporking commented 4 years ago

🐛 Bug Report

I have been seeing signs of what appears to be a buffer overflow or similar correctness problem, which seems to occur when the length of the submitted text (the N most recent entries plus the 'remember' text) exceeds some threshold. The symptom is that the output starts out fine, but then suddenly degenerates into garbage that has nothing to do with either the submitted text or the remembered text. For instance, if my "remembered" text is very long, I can get responses like these:

> !"My will is set." you say, with resolution.

Welcome to... The world is full,...(A. Your life, the world is a...The world is tooThe world is. The world is The world is > The world is The world is The world Is The world Is Not.

If I revert the last response and try again, something very similar happens:

> !"I don't know how to leave." you respond.

The priest looks... The priest nods.

>

"Welcome to playfully, you say'The priest smiles. (...It's...The priest sighsThis is very:As—THE The priest nods. It"

Note that "Welcome to" seems to be a valid way to start a response to my statement, but that after that, the text devolves into gibberish. The type of gibberish involved often seems to involve strange characters like brackets or parentheses used in nongrammatical ways, and it often cuts phrases abruptly after a few words, only to start a different phrase, in ways that don't look like the kinds of mistakes that the AI tends to make. Some of the things I have seen seem possibly to result from either the AI trying to parse JSON-formatted data (as opposed to English text) directly (possibly due to reading past the length of submitted text into the JSON surrounding code that follows it), or from the Python code trying to read data out of a buffer past the point at which the buffer contains valid data, and hence getting whatever garbage might still be sitting in the memory there.

It is quite easy to reproduce this problem by using the "Remember" button and adding a large block of text. (In my case I added 70 lines of selected text from The King In Yellow, which, to be fair, has a reputation for producing insanity in the humans who read it and might be having a similar effect on AI Dungeon.) After entering this "Remember" text, I use the AI in the usual manner. After only a few submitted entries, AI Dungeon starts outputting garbled, ungrammatical text. The trigger appears to be the total length of the already-submitted text plus the length of the "remember" text: using "revert" a few times temporarily resolves the problem, until the text gets long again.

Some people have theorized that this problem is due to running out of memory, for instance in bug #68 or bug #138. These do not look like out-of-memory problems to me; at least some of them appear to be buffer overflow or underflow correctness problems.

Here are some hypotheses that someone more familiar with Python and/or TensorFlow could perhaps investigate. This sort of problem smells like a Python/C interfacing glitch, where the two sides disagree on the maximum allowed length of submitted text, on how to do bounds checking, or on how data should be truncated when it is too long. Out-of-bounds reads or writes are usually hard to achieve in a bounds-checked-by-default language like Python on its own (although invalid reads from circular buffers can certainly happen), but once native interfaces get involved, all bets are off. Roughly what I mean is the kind of explicit check sketched below.
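This is only a rough sketch, not the actual AIDungeon code: I am guessing at names like `MAX_CONTEXT_TOKENS` and `tokenize()`, and the real tokenizer and limit in this repo may differ (GPT-2 itself has a 1024-token context window). The point is just that any disagreement about the maximum allowed length should fail loudly on the Python side rather than silently in native code:

```python
MAX_CONTEXT_TOKENS = 1024  # assumption; the real limit belongs to the model

def tokenize(text):
    # Hypothetical stand-in for the real BPE tokenizer; a whitespace split is
    # only a rough approximation of the true token count.
    return text.split()

def check_prompt_length(remember_text, history_entries):
    # Combine the remembered text with the recent history, then verify the
    # total length before anything is handed to the model.
    prompt = remember_text + "\n" + "\n".join(history_entries)
    n_tokens = len(tokenize(prompt))
    if n_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError(
            f"Prompt is {n_tokens} tokens, exceeding the assumed "
            f"{MAX_CONTEXT_TOKENS}-token limit; it must be truncated first."
        )
    return prompt
```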

As you can see, I lack the specific expertise in the language and libraries used here to fully diagnose the problem (and I don't have a fancy graphics card, so I can't really build this from source anyway), but it is fairly clear to me that there is a serious code-correctness problem. If I submit more text than the AI can handle, the text should simply be truncated (from the front, along whitespace boundaries) so that the AI is never given more than it can handle at a time, and I should then get back somewhat reasonable output. Instead, when the submitted text is too long, I suddenly start getting total garbage that looks like the contents of a buffer which has been repeatedly overwritten with text and/or JSON constructs. A rough sketch of the truncation I have in mind follows.
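Something like this is all I am asking for (the character limit is a made-up placeholder; the real limit should be measured in model tokens):

```python
MAX_PROMPT_CHARS = 4000  # placeholder; the real limit should be in model tokens

def truncate_front(text, max_chars=MAX_PROMPT_CHARS):
    # Drop text from the FRONT, breaking only at whitespace, so the model
    # always sees the most recent context plus the remembered text and never
    # more than it can handle.
    if len(text) <= max_chars:
        return text
    truncated = text[-max_chars:]
    # Advance to the next whitespace boundary so we never start mid-word.
    first_space = truncated.find(" ")
    if first_space != -1:
        truncated = truncated[first_space + 1:]
    return truncated
```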

I am currently using the web interface, by the way, but I have seen similar things happen with the iPhone app.

Sporking commented 4 years ago

Here is another example:

"Yes, yes," the priest says. "If you would only leave this place, you could escape your fate." > !"I cannot. I must know." you respond. The priest looks... > The priest nods. "There is a:(The king is silent. [Still,the king will beImage"I amn'tThis is still...

Something interesting I noticed is that in all three of my examples there is some initial text that looks reasonable, followed by an ellipsis ("..."), followed by garbage. This is odd, because neither the text I entered normally nor my "remember" text contains an ellipsis. I wonder if there is some feature in TensorFlow and/or AI Dungeon and/or GPT-2 that truncates output when it exceeds an allowed length and places an ellipsis there to mark the truncation, and if the code that reads from the buffer is incorrectly reading past the ellipsis (perhaps because no null terminator was placed after it?) and displaying text that was never supposed to be part of the output.
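If such a marker really exists, I would expect the code that reads the buffer to stop there, roughly like this (purely illustrative; I don't know whether an ellipsis marker is actually used anywhere in the generation code):

```python
TRUNCATION_MARKER = "..."  # assumption: "..." marks truncated output

def trim_at_marker(generated_text):
    # Keep everything up to and including the first marker; never read past it.
    idx = generated_text.find(TRUNCATION_MARKER)
    if idx == -1:
        return generated_text
    return generated_text[:idx + len(TRUNCATION_MARKER)]
```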