danieljprice / giza

giza - a scientific plotting library for C/Fortran
https://danieljprice.github.io/giza/
GNU Lesser General Public License v3.0
32 stars 19 forks source link

_giza_parse_string corrupts stack #13

Closed olebole closed 5 years ago

olebole commented 6 years ago

In Debian, giza is used by the dpuser package as a pgplot replacement. Dpuser fails to build on arm64 platforms with a segmentation fault while generating the documentation with @graph.dpuser. This is reported as Debian#912167. The problem could be traced down to a bug in the latest giza 1.0.0. Quoting @egrimley from the bug:

The segfault happens when _giza_parse_string tries to return. The return address on the stack was corrupted by this call to cairo_get_current_point:

https://github.com/danieljprice/giza/blob/b6c1affdd20823f973d60f06d696e35b73a9942c/src/giza-scanner.l#L206-L208

If you add this assertion just before that line you should see the assertion fail:

assert(0 <= nGlyph && nGlyph < lenstr);
egrimley commented 6 years ago

It's a simple off-by-one error, isn't it? The "we can always do this" claim in the comment is only true if we declare positions[lenstr + 1] instead of positions[lenstr].

danieljprice commented 6 years ago

I think that’s the right answer here, I have pushed a fix. Please test to check this fixes the problem

On 31 Oct 2018, at 9:48 pm, Edmund Grimley Evans notifications@github.com wrote:

It's a simple off-by-one error, isn't it? The "we can always do this" claim in the comment is only true if we declare positions[lenstr + 1] instead of positions[lenstr].

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

haavee commented 6 years ago

Hi,

Well, that's odd ... because in theory (...) nGlyph should never be >= lenstr? Because that would imply that there are more glyphs being produced than characters in the input string.

Worst case: input string is only simple characters, thus each character would correspond to one glyph and thus nGlyph would end up == lenstr, but only after the last character had been processed; but by then the TOKEN_END should have already been seen, which would make the while( ... ) { } loop terminate, preventing the cairo_get_current_point(...) being executed with nGlyph out of bounds. (So far: theory)

With this I intend to say to, as far as I can tell, there is maybe something else happening than a "simple off-by-one", which I would like to get on top of.

Is there any input string that reproduces the segfault?

olebole commented 6 years ago

When running in the debugger (on arm64, since there I get the segfault), _giza_parse_string is called this way before:

Breakpoint 1, _giza_parse_string (text=text@entry=0xffffffffcce0 "0", width=width@entry=0xffffffffc920, 
    height=height@entry=0xffffffffc928, action=0xffffb77a0aa8 <_giza_action_get_size>)
    at giza-scanner.l:174

The loop is really run twice in this case; and nGlyph is increased here. token is set to GIZA_TOKEN_TEXT, so stop is still zero in the first pass. In the second pass, token will be set to GIZA_TOKEN_END and stop to one -- but only after the line mentioned above was executed.

haavee commented 6 years ago

OK so that's a string with only an ASCII "0" in by the looks of it?

olebole commented 6 years ago

Yes.

haavee commented 6 years ago

And that triggers the SIGSEGV ... interesting

olebole commented 6 years ago

Yes. By overwriting the stack (since positions is on the stack)

egrimley commented 6 years ago

Since the Debian maintainer is here: you might want to consider adding something like make maintainer-clean and ( cd src && flex giza-scanner.l ) to the Debian build script, so that you really build from source. Then, if you want to put this fix in Debian without waiting for a new upstream release, you could just add the +1 in giza-scanner.l.

olebole commented 6 years ago

I pragmatically fixed it in both places.

haavee commented 6 years ago

Right; it is an off-by-one! (phew)

The while loop counts tokens and worst case is: one token for each character and you always get an extra END token to mark, well, end-of-input. So the while loop does get executed lenstr+1 times (worst case).

FWIW: I would suggest the following changes:

danieljprice commented 6 years ago

Thanks Harro - I’ll put out a micro release with the bug fixed

On 1 Nov 2018, at 11:02 am, Harro Verkouter notifications@github.com wrote:

Right; it is an off-by-one! (phew)

The while loop counts tokens and worst case is: one token for each character and you always get an extra END token to mark, well, end-of-input. So the while loop does get executed lenstr+1 times (worst case).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

olebole commented 6 years ago

@danieljprice just as a wish: could you have a look on some of the other (pgplot compatibiliy) issues? This would help us to gain acceptance from the perl-pgplot people.

danieljprice commented 6 years ago

Yes, I have been talking with Karl Glazebrook (developer of perl-pgplot) directly, and asked him to put them on the issue tracker. I have successfully run the pgperl test suite and can reproduce most of the issues at least. Will try to knock them off one-by-one.

On 1 Nov 2018, at 7:06 pm, Ole Streicher notifications@github.com wrote:

@danieljprice just as a wish: could you have a look on some of the other (pgplot compatibiliy) issues? This would help us to gain acceptance from the perl-pgplot people.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

karlglazebrook commented 6 years ago

Fantastic! Will keep an eye on this

On 2 Nov 2018, at 9:15 am, Daniel Price notifications@github.com wrote:

Yes, I have been talking with Karl Glazebrook (developer of perl-pgplot) directly, and asked him to put them on the issue tracker. I have successfully run the pgperl test suite and can reproduce most of the issues at least. Will try to knock them off one-by-one.

On 1 Nov 2018, at 7:06 pm, Ole Streicher notifications@github.com wrote:

@danieljprice just as a wish: could you have a look on some of the other (pgplot compatibiliy) issues? This would help us to gain acceptance from the perl-pgplot people.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

haavee commented 6 years ago

Is that list of issues available somewhere?

olebole commented 6 years ago

@havee https://github.com/danieljprice/giza/issues

danieljprice commented 5 years ago

fixed in v1.1.0, closing this issue