That's very interesting. Does it matter which .gz file you try to open? Or is it one in particular? (If so, any chance you can somehow send me the offending file?) I'll take a look; often a std::bad_alloc on a 64-bit system means that something was given a negative size somewhere :-/
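(For illustration only, this is not code from nedit-ng: when a size calculation goes negative and the result is converted to an unsigned type, the allocator is asked for an absurdly large block and fails immediately, even on a 64-bit machine with plenty of RAM.)

#include <cstddef>
#include <iostream>
#include <new>

int main() {
    int computedSize = -1; // a size calculation that accidentally went negative
    // Converting the negative int to std::size_t yields a gigantic value
    // (about 1.8e19 on a 64-bit system), so the allocation fails at once.
    std::size_t bytes = static_cast<std::size_t>(computedSize);
    try {
        char *p = new char[bytes];
        delete[] p;
    } catch (const std::bad_alloc &e) {
        std::cout << "std::bad_alloc: " << e.what() << '\n';
    }
}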
I modified the content of the SPEF file (replaced all word characters with x) and it had no impact, so it looks like the only thing that matters is the size of the file. I also created a dummy file by duplicating the following sequence until the uncompressed file was >728 Mbytes, and with it I get the crash. After a couple of iterations it looks like the limit for the crash is very close to 715 Mbytes.
Here is the content that I used:
*xxx
*xxxxx *xxxxxxx x.xxxxxxxxxx //xxxxxx x.xxx xxxxxx x.xxxxxxxxxx xx
*xxxx
*x *xxxxxx:xx x *x x *x xx.xxx xxx.xxx
*x *xxxxxx:x x *x x.xxxxxxxxx *x xx.xxx xxx.xxx
*x *xxxxxxx:x *x xx.xxx xxx.xxx
*x *xxxxxxx:x *x xx.xxx xxx.xxx
*x *xxxxxxx:x *x xx.xxx xxx.xxx
*xxx
x *xxxxxx:xx xx-xx
x *xxxxxx:x x.xxxxxxxxxx
x *xxxxxxx:x xx-xx
x *xxxxxxx:x x.xxxxxxxxxx
x *xxxxxxx:x x.xxxxxxx-xx
xx *xxxxxx:x *xxxxxxx:xx x.xxxxxxx-xx
xx *xxxxxx:x *xxxxxxx:xx x.xxxxxx-xx
xx *xxxxxx:x *xxxxxx:x x.xxxxx-xx
xx *xxxxxx:x *xxxxxx:x x.xxxxxxx-xx
xx *xxxxxx:x *xxxxxxx:x x.xxxxxx-xx
So, I made a script like this:
#!/bin/bash
IN=test.txt
OUT=test
FILE=test.gz
MINSIZE=728000000
COUNT=0

# start from an empty output file
truncate -s 0 "$OUT"

while true; do
    echo "$COUNT"
    # append the source file 81920 times
    cat $(yes "$IN" | head -n 81920) >> "$OUT"
    # recompress the whole thing and check the size of the .gz
    gzip -c -f "$OUT" > "$FILE"
    SIZE=$(wc -c < "$FILE")
    if [ "$SIZE" -ge "$MINSIZE" ]; then
        echo "size is over $MINSIZE bytes"
        exit 0
    fi
    COUNT=$((COUNT + 1))
done
to try to test it, and I have a couple of questions:
OK, never mind that last comment. I misread it and thought that the .gz file needed to be a certain size, not the source file. I've replicated the issue and will see if I can fix it ASAP :-)
This is an interesting situation. It may not be obvious at first, but this is actually running into a circumstance where we are hitting the memory limit of what can be held in a QString.
I was able to reproduce this with a very trivial Qt application that looks like this:
#include <QByteArray>
#include <QFile>
#include <QString>
#include <QtDebug>

int main() {
    QFile file(QLatin1String("test.txt"));
    if (file.open(QIODevice::ReadOnly)) {
        // read the entire file into memory as raw bytes
        QByteArray bytes = file.readAll();
        // converting to UTF-16 needs roughly twice as many bytes and runs
        // into QString's ~2GB storage limit, triggering std::bad_alloc
        QString text = QString::fromLocal8Bit(bytes.data(), bytes.size());
        qDebug() << text;
    }
}
with a test.txt that is 1854668800 bytes big. Fundamentally, QString is limited to ~2GB of storage, and the number of characters is at best half of that because it uses UTF-16 (it can be less than half due to combining characters and similar). I know you triggered it with a smaller file, but I think it's essentially the same issue because some QString operations require even more space temporarily.
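(A quick back-of-the-envelope calculation of my own, assuming the ~2 GB figure above, i.e. a QString backed by a signed 32-bit size, shows why the conversion cannot fit:)

#include <cstdint>
#include <iostream>

int main() {
    // Rough numbers only: 1854668800 is the file size quoted above.
    const std::int64_t fileBytes    = 1854668800;              // bytes read from disk
    const std::int64_t utf16Bytes   = fileBytes * 2;           // one 2-byte QChar per input byte
    const std::int64_t qstringLimit = (std::int64_t{1} << 31); // ~2 GB of storage

    std::cout << utf16Bytes << " bytes needed vs. ~"
              << qstringLimit << " bytes available\n";         // ~3.7 GB vs ~2.1 GB
}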
So, back to nedit-ng. Fortunately, we don't actually use QString for file data that often, but we do currently use it for capturing the stdout and stderr of subprocesses. In this case, I read the results of the command you ran (in this case, gzip) into a byte array, and then, because it could be UTF-8, I decode it into a QString (this is where it blows up), and finally, if all goes well, I convert it to a character buffer as needed.
I'll have to refactor the code to use a different approach since QString has this limitation. I'll update when I have it worked out.
@marilmanen I believe that this PR should fix the issue; if it does, please let me know and I'll merge it into master. Thanks!
I tested with a couple of large files and saw no issues, so it looks like you have fixed the issue. Great!
I'm using the following command to see what's inside a .gz file.
If the uncompressed file is 608 Mbytes with 19M lines everything works fine, but with a file size of 725 Mbytes with 25M lines I get the std::bad_alloc crash.
There is no issue with the bigger file if it's first uncompressed to a file and I then open that file with nedit-ng. I have also tested the old NEdit editor and there are no issues with it.