cbanack / comic-vine-scraper

An add-on script for ComicRack that lets you copy details from Comic Vine into your comic books.

Dtd Error when parsing a lot of files #379

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
DESCRIBE THE PROBLEM:

As described here:

http://comicrack.cyolito.com/forum/8-help/38182-best-way-to-scrape-80k-of-comics#39242

The user is getting a DTD error after scraping a few thousand files:

======> scraping next comic book: 'Avengers 028 (2014) (Digital) (Zone-Empire).cbz'
trying to match this book automatically...
ERROR OCCURRED CONTACTING COMICVINE. RETRYING...
------------------- PYTHON ERROR ------------------------
Caught SystemError: DTD ist in diesem XML-Dokument aus Sicherheitsgründen 
unzulässig. Zum Aktivieren der DTD-Verarbeitung müssen Sie die 
'DtdProcessing'-Eigenschaft für 'XmlReaderSettings' auf 'Parse' festlegen und 
die Einstellungen an die 'XmlReader.Create'-Methode übergeben.
Traceback (most recent call last):
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\scrapeengine.py", line 142, in scrape
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\scrapeengine.py", line 241, in _ScrapeEngine__scrape
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\scrapeengine.py", line 410, in _ScrapeEngine__scrape_book
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\automatcher.py", line 44, in find_series_ref
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\db.py", line 194, in query_issue_ref
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\cvdb.py", line 367, in query_issue_ref
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\cvconnection.py", line 116, in _query_issue_id_dom
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\cvconnection.py", line 222, in __get_dom
  File "C:\Users\julia_000\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\cvconnection.py", line 198, in __get_dom

WHAT VERSION OF COMICVINESCRAPER ARE YOU USING?

1.0.77

PLEASE PROVIDE ANY ADDITIONAL INFORMATION THAT MAY BE OF USE

The error translates to English as:

"For security reasons DTD is prohibited in this XML document. To enable DTD 
processing set the DtdProcessing property on XmlReaderSettings to Parse and 
pass the settings into XmlReader.Create method."
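
The message suggests enabling DTD processing on the reader. A possible alternative, sketched below, is to remove the DOCTYPE declaration from the response text so the parser never encounters a DTD. This is only an illustration under assumptions: the scraper itself runs under IronPython and parses with .NET classes, whereas this sketch uses the Python standard library, and the helper names `strip_doctype` and `parse_without_dtd` and the sample document are hypothetical, not part of the scraper.

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration, with or without an internal subset in [...].
_DOCTYPE_RE = re.compile(
    r"<!DOCTYPE\s[^>\[]*(\[.*?\]\s*)?>",
    re.IGNORECASE | re.DOTALL,
)


def strip_doctype(xml_text):
    """Return xml_text with its first DOCTYPE declaration removed, if any."""
    return _DOCTYPE_RE.sub("", xml_text, count=1)


def parse_without_dtd(xml_text):
    """Parse xml_text after dropping its DOCTYPE (hypothetical helper)."""
    return ET.fromstring(strip_doctype(xml_text))


# Hypothetical sample response; the URL and element names are made up.
sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE response SYSTEM "http://example.invalid/response.dtd">\n'
    "<response><status_code>1</status_code></response>"
)
root = parse_without_dtd(sample)
```

Stripping the declaration avoids both the security exception and any attempt to download an external DTD, at the cost of losing entity definitions the DTD might have supplied. If the response is not the expected XML at all (for example an error page), parsing may still fail and the request would need to be retried.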

Original issue reported on code.google.com by cban...@gmail.com on 16 May 2014 at 4:23

GoogleCodeExporter commented 9 years ago
Relevant Stack Overflow link:

https://stackoverflow.com/questions/215854/prevent-dtd-download-when-parsing-xml

Original comment by cban...@gmail.com on 16 May 2014 at 4:23

GoogleCodeExporter commented 9 years ago

Original comment by cban...@gmail.com on 7 Jun 2014 at 8:11