knilssen / Founding_Fathers

Provides a reliable political feed for readers to become knowledgeable and informed voters on the state political level.
https://knilssen.github.io/Founding_Fathers

XMLSyntaxError: line 500: Tag footer invalid #20

Closed knilssen closed 6 years ago

knilssen commented 6 years ago
Traceback (most recent call last):
  File "/Users/kristiannilssen/anaconda/lib/python2.7/site-packages/newspaper/parsers.py", line 54, in fromstring
    cls.doc = lxml.html.fromstring(html)
  File "/Users/kristiannilssen/anaconda/lib/python2.7/site-packages/lxml/html/__init__.py", line 867, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/Users/kristiannilssen/anaconda/lib/python2.7/site-packages/lxml/html/__init__.py", line 752, in document_fromstring
    value = etree.fromstring(html, parser, **kw)
  File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src/lxml/lxml.etree.c:77737)
  File "src/lxml/parser.pxi", line 1830, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:116674)
  File "src/lxml/parser.pxi", line 1711, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:115220)
  File "src/lxml/parser.pxi", line 1051, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:109345)
  File "src/lxml/parser.pxi", line 584, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:103584)
  File "src/lxml/parser.pxi", line 694, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:105238)
  File "src/lxml/parser.pxi", line 633, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:104323)
XMLSyntaxError: line 500: Tag footer invalid
Failed Parse
You must download and parse an article before parsing it!

knilssen commented 6 years ago

Fixed. The failure was most likely caused by sending too many requests to the source: when the scraper asked for the article's HTML, none was returned, so there was nothing valid to parse. This case is now accounted for and handled if it happens again.
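
For anyone who runs into the same thing, here is a minimal sketch of the kind of guard described above. It is not the code from this repository. `fetch_html_with_retry`, `scrape`, `fetch`, and `parse` are hypothetical names: `fetch` stands in for whatever downloads the article HTML and `parse` for whatever hands it to the parser. It assumes an over-requested source responds with empty HTML or an error, retries with a growing pause, and skips the article if no HTML ever arrives.

```python
import time


def fetch_html_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns non-empty HTML or retries run out.

    Returns the HTML string, or None if every attempt failed or came back empty.
    `fetch` is a hypothetical callable supplied by the caller.
    """
    for attempt in range(retries):
        try:
            html = fetch(url)
        except Exception:
            # Treat a failed request the same as an empty response.
            html = None
        if html and html.strip():
            return html
        # Back off before asking the source again: delay, 2*delay, 4*delay, ...
        if attempt < retries - 1:
            sleep(delay * (2 ** attempt))
    return None


def scrape(fetch, parse, url, **kwargs):
    """Parse the article only when HTML was actually received."""
    html = fetch_html_with_retry(fetch, url, **kwargs)
    if html is None:
        # Skip this article instead of handing empty input to the parser.
        return None
    return parse(html)
```

Passing `sleep` in as a parameter is only there so the pause can be swapped out in tests; in normal use the default `time.sleep` applies.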