ocramz / xeno

Fast Haskell XML parser

Segmentation fault: runtime crash in xeno #43

Closed swamp-agr closed 3 years ago

swamp-agr commented 4 years ago

Here are some details:

I guess, some unsafe functions are producing this unexpected effect, e.g.

qrilka commented 4 years ago

@swamp-agr do you have an example XML document that triggers this? It doesn't look to me like a high request rate (RPS) could have anything to do with out-of-bounds access at the linked sites.

swamp-agr commented 4 years ago

@qrilka, let me reproduce the case with a dump of the entire bytestring enabled. I will come back once I am able to obtain such an example.

swamp-agr commented 4 years ago

They contain some sensitive data I cannot share.

swamp-agr commented 4 years ago

I will try to reproduce the issue on some sample/open datasets.

ocramz commented 3 years ago

Hi @swamp-agr, any updates on this?

mgajda commented 3 years ago

@swamp-agr Can I please have an XML document to reproduce the issue?

swamp-agr commented 3 years ago

@ocramz @mgajda thank you for reaching out.

There are two components required in order to reproduce the case:

Previously I was receiving different XML documents concurrently via http-client with an explicit ResponseTimeout. I can now emulate that setup and reproduce the segmentation fault by reading and parsing parts of the same file asynchronously.

I set up a repository that you can clone to reproduce it yourselves: https://github.com/swamp-agr/xeno-issue-43

Issue reproduced with both stack (GHC 8.8) and cabal (GHC 8.10).
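
For readers who want the general shape of such a harness without cloning the repository, here is a minimal sketch. It is not the code from xeno-issue-43: `countConcurrently` is a hypothetical helper, the parser is passed in as a parameter (for this issue it would be something like `Xeno.DOM.parse`), and how the file is split into parts is left to the caller. It uses only base, with one thread per input.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.Either (partitionEithers)

-- | Run a pure parser over every input on its own thread and return
-- (number of Left results, number of Right results).
-- Hypothetical helper for illustration; not taken from the repro repository.
countConcurrently :: (a -> Either e b) -> [a] -> IO (Int, Int)
countConcurrently parser inputs = do
  vars <- mapM spawn inputs
  results <- mapM takeMVar vars
  let (errs, goods) = partitionEithers results
  pure (length errs, length goods)
  where
    -- Force each result to weak head normal form inside the worker thread,
    -- so the parsing work really happens concurrently.
    -- Note: if the parser throws instead of returning Left, the MVar is
    -- never filled; a sturdier harness would catch exceptions here.
    spawn input = do
      var <- newEmptyMVar
      _ <- forkIO (evaluate (parser input) >>= putMVar var)
      pure var
```

The two counts correspond to the "errors" and "good" lines printed by the runner. A segfault inside the parser would still kill the whole process, which is exactly the symptom being reproduced.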

swamp-agr commented 3 years ago

I also noticed that minor tweaks to example2.xml (i.e. removing some nodes from it), without any changes to the code, can make the segfault go away.

qrilka commented 3 years ago

@swamp-agr what are those tweaks?

swamp-agr commented 3 years ago
  1. Remove the first occurrence of //Tracking@event="firstQuartile" (whole line).
  2. Remove the first occurrence of //Tracking@event="thirdQuartile" (whole line).

Output will become:

stack exec -- xeno-runner
errors: 10970
good: 422

swamp-agr commented 3 years ago

@ocramz @mgajda Have you had time to confirm the case with the information provided in the comment above? Could you please share techniques for debugging segmentation faults, to help with resolving this issue?

ocramz commented 3 years ago

Hi, currently I have no bandwidth to investigate this issue. sorry!

On Wed, 3 Mar 2021 at 23:02, Andrey Prokopenko notifications@github.com wrote:

@ocramz https://github.com/ocramz @mgajda https://github.com/mgajda Did you have a time to confirm the case with information provided in this comment above https://github.com/ocramz/xeno/issues/43#issuecomment-784567243?

ingarsjekabsons commented 3 years ago

Hi there!

By accident I've stumbled upon a strange segfault in xeno, and I seem to have a stable, static XML payload on which parse segfaults - https://gist.github.com/ingarsjekabsons/5061bd84e0abc30aa78de5b59ecf9259

It seems that any tiny change to this XML makes the segfault go away. Add some random characters to the comment section at the end - gone; remove some random element from the document - gone; etc.

I'm not sure I'm hitting exactly the same issue as initially reported here, though, since I'm using the DOM parser, not SAX.

Will try to spend some time digging into more details.

adamse commented 3 years ago

I changed all the unsafe* functions to their safe variants and found that we were calling unsafeGrow with a negative length, which caused the problem. https://github.com/ocramz/xeno/pull/48 should fix this, please give it a spin @ingarsjekabsons, @swamp-agr.
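
The debugging approach described in this comment is to swap unchecked calls for their checked counterparts, so that a bad offset or length raises a Haskell exception instead of corrupting memory. It can be illustrated with bytestring's `index`, the bounds-checked sibling of `unsafeIndex`. This is only a sketch of the technique, not code from xeno or from the pull request; `checkedIndex` is a hypothetical helper introduced for illustration.

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- | Bounds-checked byte lookup. BS.index raises an exception on a negative
-- or too-large offset, whereas Data.ByteString.Unsafe.unsafeIndex performs
-- no check and may read arbitrary memory or crash the process.
-- Hypothetical helper for illustration; not part of xeno.
checkedIndex :: BS.ByteString -> Int -> IO (Either SomeException Word8)
checkedIndex bytes i = try (evaluate (BS.index bytes i))
```

With a 4-byte document, `checkedIndex doc 10` and `checkedIndex doc (-1)` both return a `Left` carrying the exception, which points at the faulty call site rather than ending in SIGSEGV.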

swamp-agr commented 3 years ago

@adamse With the following cabal.project

packages: .

source-repository-package
    type: git
    location: https://github.com/adamse/xeno.git
    tag: f4a6f9f23415ed24aae8dd2529ea3788234044e1

I cannot reproduce the issue (segmentation fault):

cabal v2-exec -- xeno-runner
errors: 11514
good: 432

ingarsjekabsons commented 3 years ago

Can confirm, no more segfaults with known payload.

xeno@master:

*Xeno.SAX IO BS Xeno.DOM> d <- IO.openFile "bad.xml" ReadMode >>= BS.hGetContents
*Xeno.SAX IO BS Xeno.DOM> parse d
cabal: repl failed for xeno-0.4.2. The build process segfaulted (i.e.
SIGSEGV).

xeno@pull/48:

*Xeno.SAX IO BS Xeno.DOM> d <- IO.openFile "bad.xml" ReadMode >>= BS.hGetContents
*Xeno.SAX IO BS Xeno.DOM> 
*Xeno.SAX IO BS Xeno.DOM> parse d
Right (Node "Package" [("name","cs.bc.ebpp"),("version","")] [Text "\n    ",Element (Node "Dependency" [("name","cs.bc.ebpp_db"),("version","1.3.2")] []),Text "\n    ",Element (Node "Dependency" [("name","cs.bc.libutils"),("version","")] []),Text "\n    ",Element (Node "Dependency" [("name","cs.bc.libutilstux"),("version","")] []),Text "\n    ",Element (Node "Dependency" [("name","cs.bc.liborasql"),("version","")] []),Text "\n    ",Element (Node "Dependency" [("name","cs.ext.libxercesc"),("version","")] []),Text "\n    ",Eleme
...
...
...

ocramz commented 3 years ago

Thank you everyone, xeno-0.4.3 with the fix is on Hackage ^^