Open dgw opened 8 years ago
xmltodict.parse() doesn't like HTML, and sadly XKCDB does not provide any official API. Scraping the HTML is the only way to get the content at the moment. There is an XML feed (RSS) of the latest quotes, but it allows neither selecting a random quote from the entire DB nor selecting a specific quote.
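A rough illustration of why a strict XML parser chokes on ordinary HTML. xmltodict is built on top of expat, so this sketch calls the standard library's expat binding directly rather than xmltodict itself. The markup is an invented sample, not real XKCDB output, and `is_well_formed_xml` is a hypothetical helper name.

```python
from xml.parsers import expat


def is_well_formed_xml(markup):
    """Return True if expat accepts the markup as a complete XML document."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(markup, True)  # True marks this as the final chunk
    except expat.ExpatError:
        return False
    return True


# Invented samples: typical HTML leaves <br> unclosed, which XML forbids.
html_snippet = "<p>first line<br>second line</p>"
xml_snippet = "<p>first line<br/>second line</p>"

print(is_well_formed_xml(html_snippet))
print(is_well_formed_xml(xml_snippet))
```

The unclosed `<br>` is enough to make the first snippet fail, which is why a forgiving HTML parser is needed for scraped pages.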
Unless I can talk the maintainer(s) of XKCDB into providing a proper API, this might have to be a CANTFIX. I'll at least poke through the sopel code to see how it handles HTML parsing, if that's used anywhere in the core code or module set, since lxml was completely dropped (sopel-irc/sopel@21bbd98e72eef4c5454211a7adb70c1c8e640845) from the install requirements.
From a quick search through the repo on GitHub, it doesn't appear that sopel parses HTML anywhere. A few modules used to reference HTMLParser, but they don't appear to use it any more.
That said, it should be no big deal to switch from lxml's HTML parser to HTMLParser. It's just a different refactor (plus importing sys to check the Python version, because the module moved from HTMLParser to html.parser in Python 3).
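A minimal sketch of what that refactor might look like, assuming the quote text lives in `<span class="quote">` elements. That markup is a guess for illustration, not XKCDB's confirmed page structure, and `QuoteExtractor` and `extract_quotes` are hypothetical names. The version check is the `sys` import mentioned above. The entity handling here relies on Python 3 behaviour; Python 2 would likely need extra `handle_entityref`/`handle_charref` handlers.

```python
import sys

# The module was renamed between Python 2 and Python 3.
if sys.version_info[0] >= 3:
    from html.parser import HTMLParser
else:
    from HTMLParser import HTMLParser


class QuoteExtractor(HTMLParser):
    """Collect the text of <span class="quote"> elements (assumed markup)."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.quotes = []
        self._depth = 0    # > 0 while inside a quote span (counts nested spans)
        self._buffer = []  # text pieces of the quote being collected

    def handle_starttag(self, tag, attrs):
        if self._depth:
            if tag == "span":
                self._depth += 1
            elif tag == "br":
                self._buffer.append("\n")  # keep line breaks between quote lines
        elif tag == "span" and ("class", "quote") in attrs:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth and tag == "span":
            self._depth -= 1
            if self._depth == 0:
                self.quotes.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_quotes(html):
    """Return the text of every quote span found in the given HTML string."""
    parser = QuoteExtractor()
    parser.feed(html)
    parser.close()
    return parser.quotes
```

Usage would be along the lines of `extract_quotes(page_source)`, where `page_source` is the HTML already fetched from the site. Unlike the strict XML route, this tolerates unclosed tags such as `<br>`.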
With the merging of sopel-irc/sopel#923, none of sopel's core modules require lxml any more. Migrate away, preferably to xmltodict like the core code has, so this module continues to have as few unique dependencies as possible.