luser-dr00g / xpost

A PostScript interpreter in C
Other
93 stars 12 forks source link

xpost_file_read(), and hence readstring, doesn't deal with end of file #50

Closed PertinentDetail closed 2 years ago

PertinentDetail commented 2 years ago

readstring effectively boils down to a call to xpost_file_read() and xpost_file_read() doesn't do anything special when it reaches end of file. The function is:

int xpost_file_read(char *buf, int size, int count, Xpost_File *fp)
{
    int i, j, k = 0;

    for (i = 0; i < count; ++i)
        for (j = 0; j < size; ++j)
            buf[k++] = xpost_file_getc(fp);

    return i;
}

This is guaranteed to return count.

There needs to be a test on whether xpost_file_getc() has returned EOF, and, if it has, the code needs to break out of the i loop. Note that this will need to test the return value of xpost_file_getc() before it's written into buf[], for the usual getc-returns-an-int reasons.

A simple test input is:

{currentfile 10 string readstring pstack} exec
abc

This should return substring (abc\n) (could be "\n", "\r\n" or nothing at the end depending exactly how the file terminates) and false. xpost returns a ten character string padded with character 255 and true

Here's are TIO links for the simple test: xpost (failing) and Ghostscript (working).

A stronger test is:

{currentfile 10 string readstring exch print not {exit} if} loop
This is several lines
of text, these should be
printed exactly as they
appear in the input and
should end just after the
end of this sentence, with
any newlines that were in
the input.

Currently this will give an infinite loop with xpost.

With this bug fixed, it's not clear what xpost_file_read() should do if the number of bytes remaining in the file is not a multiple of size (assuming it's also less than count × size). It will read all the remaining bytes and return the number of whole items it read. This will imply it read i × size bytes when it actually read up to size − 1 more bytes. This is probably correct behaviour, but at least should be documented in the code.

While I'm here, in xpost_op_file.c, the comments documenting the inputs and outputs of readhexstring, readstring and readline are wrong. They all imply that if the functions hit end of file then they return just false without a substring. In fact, if they always return a substring. The code is correct in all three cases. (Note that read has a similar comment, but in that case it's correct, as is the code).

luser-dr00g commented 2 years ago

Good catch. Very helpful analysis.

I have the fix written and done some interactive testing. I'm hesitating to push yet because I want to add a test to data/test.ps but I'm not seeing a nice way to test the EOF behavior just using currentfile so it'll need a few extra files. Not really a problem but that's where I stopped to think before doing anything further.

luser-dr00g commented 2 years ago

Fixed. Tests added to data/test.ps and data/readstring.ps.