DanMcInerney / xsscrapy

XSS spider - 66/66 wavsep XSS detected
1.63k stars, 438 forks

ERROR: Error processing #36

Open atastycookie opened 7 years ago

atastycookie commented 7 years ago

Hey, I got this error on a Mac:

2017-03-26 10:08:40 [scrapy.core.scraper] ERROR: Error processing
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/Users/d34dr00t/pentest/xsscrapy/xsscrapy/pipelines.py", line 61, in process_item
    unclaimedURL = self.unclaimedURL_check(body)
  File "/Users/d34dr00t/pentest/xsscrapy/xsscrapy/pipelines.py", line 218, in unclaimedURL_check
    tree = fromstring(body)
  File "/Library/Python/2.7/site-packages/lxml/html/__init__.py", line 876, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/Library/Python/2.7/site-packages/lxml/html/__init__.py", line 762, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79010)
  File "src/lxml/parser.pxi", line 1848, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:118341)
  File "src/lxml/parser.pxi", line 1736, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:117021)
  File "src/lxml/parser.pxi", line 1102, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:111265)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105109)
  File "src/lxml/parser.pxi", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:106817)
  File "src/lxml/parser.pxi", line 644, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:105874)
XMLSyntaxError: line 444: ID  already defined (line 444)

decidedlygray commented 7 years ago

What options did you use to run it? Did you make any changes to the configuration file? Is this related to #37?

The error above looks like it's coming from the lxml library, maybe while parsing malformed XML?
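Both tracebacks in this thread end in an "ID ... already defined" message, which appears to point at an `id` attribute value that occurs more than once in the page (an empty value in the first report, `space` in the second). One way to check a saved response body for that is sketched below. It uses only the Python 3 standard library rather than lxml, and `IdCollector`, `duplicate_ids`, and the sample markup are hypothetical names made up for illustration; none of this is part of xsscrapy.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the value of every id attribute seen in a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id":
                # value is None for a bare attribute such as <div id>
                self.ids.append(value or "")


def duplicate_ids(body):
    """Return the sorted id values that appear more than once in body."""
    collector = IdCollector()
    collector.feed(body)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample page that reuses the id "space".
sample = '<div id="space"></div><p id="space"></p><span id="ok"></span>'
print(duplicate_ids(sample))
```

If this reports duplicates for the page being scanned, that would be consistent with the parse error coming from the page's markup rather than from the command-line options.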

hoodoer commented 6 years ago

I see this as well:

2017-12-18 12:07:27 [scrapy.core.scraper] ERROR: Error processing
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/root/xsscrapy/xsscrapy-master/xsscrapy/pipelines.py", line 61, in process_item
    unclaimedURL = self.unclaimedURL_check(body)
  File "/root/xsscrapy/xsscrapy-master/xsscrapy/pipelines.py", line 218, in unclaimedURL_check
    tree = fromstring(body)
  File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 876, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 762, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "src/lxml/etree.pyx", line 3230, in lxml.etree.fromstring (src/lxml/etree.c:81055)
  File "src/lxml/parser.pxi", line 1871, in lxml.etree._parseMemoryDocument (src/lxml/etree.c:121235)
  File "src/lxml/parser.pxi", line 1759, in lxml.etree._parseDoc (src/lxml/etree.c:119911)
  File "src/lxml/parser.pxi", line 1125, in lxml.etree._BaseParser._parseDoc (src/lxml/etree.c:114158)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107723)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109432)
  File "src/lxml/parser.pxi", line 647, in lxml.etree._raiseParseError (src/lxml/etree.c:108489)
XMLSyntaxError: line 66: ID space already defined (line 66)

I called: ./xsscrapy.py -u http://HOSTNAME:8080 --cookie="JSESSIONID=HMMMCOOKIES"

I haven't changed the configuration file.

decidedlygray commented 6 years ago

It probably won't make a difference, but just in case: does this work?

./xsscrapy.py -u http://HOSTNAME:8080 --cookie "JSESSIONID=HMMMCOOKIES"

(I just removed the "=" after --cookie. I think argparse handles either form, but it's worth a shot.)
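For reference, argparse does accept a long option's value either attached with "=" or as the next argument, and it splits only on the first "=", so a cookie value that itself contains "=" survives intact. The sketch below shows this with a stand-in parser; the option definitions are assumptions for illustration and are not copied from xsscrapy.py.

```python
import argparse

# Stand-in parser (hypothetical; not xsscrapy's actual argument definitions).
parser = argparse.ArgumentParser()
parser.add_argument("-u", "--url")
parser.add_argument("--cookie")

# Form 1: value attached with "=".
with_equals = parser.parse_args(
    ["-u", "http://HOSTNAME:8080", "--cookie=JSESSIONID=HMMMCOOKIES"]
)

# Form 2: value passed as a separate argument.
with_space = parser.parse_args(
    ["-u", "http://HOSTNAME:8080", "--cookie", "JSESSIONID=HMMMCOOKIES"]
)

# Both forms yield the same cookie string.
print(with_equals.cookie == with_space.cookie)
```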

hoodoer commented 6 years ago

That didn't make a difference, I'm afraid. I should have thought of trying that myself.

decidedlygray commented 6 years ago

Please try replacing ./xsscrapy/xsscrapy.py with: https://gist.github.com/decidedlygray/a865cd0acae071365e8965808ba6c89b

And replace ./xsscrapy/xsscrapy/pipelines.py with: https://gist.github.com/decidedlygray/f0727a63b7f68aae41155b0c90232d59

Then please post the output.

The above modules have some additional logging enabled that should help debug why the call to fromstring here https://github.com/DanMcInerney/xsscrapy/blob/master/xsscrapy/pipelines.py#L218 is failing.
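As a rough idea of what that kind of extra logging can look like, here is a minimal sketch that wraps a parse call, logs the exception and the start of the offending body, and returns None instead of raising. It is not the content of the gists above. `parse_or_log`, the logger name, and the stand-in parsers are hypothetical, and the real parser (for example `lxml.html.fromstring`) would be passed in as the `parse` argument.

```python
import logging

logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("pipeline_debug")  # hypothetical logger name


def parse_or_log(body, parse):
    """Call parse(body); on failure, log details and return None."""
    try:
        return parse(body)
    except Exception as exc:  # broad on purpose: this is a debugging aid
        logger.error("parse failed: %s: %s", type(exc).__name__, exc)
        logger.error("body starts with: %r", body[:200])
        return None


def failing_parser(body):
    """Stand-in for a parser that rejects the page (assumption for the demo)."""
    raise ValueError("ID space already defined")


print(parse_or_log("<html></html>", failing_parser))  # failure is logged
print(parse_or_log("<html></html>", str.upper))       # success passes through
```

Catching the failure at this point keeps one bad page from producing a traceback for the whole item, at the cost of skipping whatever check depended on the parsed tree.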