michaelrsweet / htmldoc

HTML Conversion Software
https://www.msweet.org/htmldoc
GNU General Public License v2.0

AddressSanitizer: out-of-bounds memory write in parse_paragraph function in htmldoc/htmldoc/ps-pdf.cxx:5208 #528

Closed: WhereisDoujo closed this issue 3 months ago

WhereisDoujo commented 3 months ago

Hello, I found an out-of-bounds memory write in the parse_paragraph function in ps-pdf.cxx.

Reporter: WhereisDoujo from Ocean University of China

Test platform: htmldoc version: current; OS: Kali 6.6.9-1kali1 (2024-01-08); kernel: 6.6.9-amd64

Steps to reproduce (htmldoc built with the ASan option):

./htmldoc -f 1.pdf ./poc.html

Attachment: poc.zip

AddressSanitizer:DEADLYSIGNAL
=================================================================
==4357==ERROR: AddressSanitizer: SEGV on unknown address 0x55852f54cb40 (pc 0x55852f4bc7a6 bp 0x7ffca6baca10 sp 0x7ffca6bac7c0 T0)
==4357==The signal is caused by a WRITE memory access.
    #0 0x55852f4bc7a6  (/usr/local/bin/htmldoc+0x19d7a6) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #1 0x55852f488212  (/usr/local/bin/htmldoc+0x169212) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #2 0x55852f489376  (/usr/local/bin/htmldoc+0x16a376) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #3 0x55852f489376  (/usr/local/bin/htmldoc+0x16a376) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #4 0x55852f489376  (/usr/local/bin/htmldoc+0x16a376) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #5 0x55852f4863fe  (/usr/local/bin/htmldoc+0x1673fe) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #6 0x55852f4863fe  (/usr/local/bin/htmldoc+0x1673fe) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #7 0x55852f4863fe  (/usr/local/bin/htmldoc+0x1673fe) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #8 0x55852f47a958  (/usr/local/bin/htmldoc+0x15b958) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #9 0x55852f453b19  (/usr/local/bin/htmldoc+0x134b19) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
    #10 0x7f6b9382bd8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #11 0x7f6b9382be3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #12 0x55852f369944  (/usr/local/bin/htmldoc+0x4a944) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/local/bin/htmldoc+0x19d7a6) (BuildId: 9b38309ddd5aa38458dae5422caef06a503695df)
==4357==ABORTING

The bug is at htmldoc/htmldoc/ps-pdf.cxx:5208:

    for (dataptr = temp->data; *dataptr; dataptr ++)
      *dataptr = dataptr[1];
    *dataptr = '\0';

michaelrsweet commented 3 months ago

[master 2d5b2ab] Don't strip leading whitespace from whitespace-only node (Issue #528)
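
The commit message suggests the fix is to skip the leading-whitespace strip when a node holds nothing but whitespace. The sketch below is only an illustration of that idea, not the actual patch: `is_whitespace_only` and `strip_leading_space` are hypothetical helpers that do not exist in htmldoc, and the sketch assumes the buffer passed in is writable.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical helper (not part of htmldoc): true when the string is
// non-empty and every character in it is whitespace.
static bool is_whitespace_only(const char *s)
{
  assert(s != nullptr);

  if (!*s)
    return false;

  for (; *s; s ++)
    if (!std::isspace(static_cast<unsigned char>(*s)))
      return false;

  return true;
}

// Hypothetical helper (not part of htmldoc): removes one leading whitespace
// character in place, but leaves whitespace-only strings untouched.
// The caller must pass a writable buffer. Returns true if a character
// was removed.
static bool strip_leading_space(char *data)
{
  assert(data != nullptr);

  if (!std::isspace(static_cast<unsigned char>(data[0])) ||
      is_whitespace_only(data))
    return false;

  // Shift the rest of the string left by one; copying strlen(data) bytes
  // starting at data + 1 also moves the terminating NUL.
  std::memmove(data, data + 1, std::strlen(data));
  return true;
}
```

With these helpers, `" hello"` becomes `"hello"`, while `" "` is left as is. `memmove` stands in for the hand-written shift loop quoted in the report because it handles the overlapping copy and the terminator in one call. The report itself does not say why the write faults; the faulting address sits in the same address range as the htmldoc code frames, which would be consistent with the node's data pointing at non-writable storage, but that is an inference from the log rather than something the issue confirms.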