Open annapowellsmith opened 12 years ago
When scraping one particular .aspx page, mechanize consistently raises "ParseError: unexpected '[' char in declaration" as soon as I access the forms. Code in full:
    import mechanize

    url = 'http://corporate.marksandspencer.com/aboutus/where/international_stores'
    browser = mechanize.Browser()
    browser.open(url)
    browser.select_form(nr=0)  # the ParseError is raised here
I have tried manually replacing the DTD, but it doesn't help:
    import mechanize

    url = 'http://corporate.marksandspencer.com/aboutus/where/international_stores'
    browser = mechanize.Browser()
    browser.open(url)
    # Drop the DOCTYPE and replace the namespaced <html> tag with a plain one
    html = (browser.response().get_data()
            .replace('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">', '')
            .replace('<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">', '<html>'))
    # Hand the edited HTML back to the browser as if it were the server response
    response = mechanize.make_response(html, [("Content-Type", "text/html")], url, 200, "OK")
    browser.set_response(response)
    browser.select_form(nr=0)  # still raises the same ParseError
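A more general version of the same idea, as a rough sketch only: rather than replacing one exact DOCTYPE string, strip every non-comment `<!...>` declaration with a regular expression. The names `DECLARATION_RE` and `strip_declarations` and the sample markup are made up for illustration. It is also only an assumption that the error comes from a declaration such as `<![if ...]>` somewhere further down the page; I have not confirmed that this is what the page contains.

```python
import re

# Matches markup declarations such as <!DOCTYPE ...> or <![if ...]>,
# but not ordinary comments (<!-- ... -->).
# Assumption: no declaration contains a literal '>' before its closing '>'.
DECLARATION_RE = re.compile(r"<!(?!--)[^>]*>")


def strip_declarations(html):
    """Return html with every non-comment <!...> declaration removed."""
    return DECLARATION_RE.sub("", html)


# Hypothetical sample markup, not taken from the real page.
sample = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
    "<html><!-- kept --><![if !IE]><form></form><![endif]></html>"
)
cleaned = strip_declarations(sample)
print(cleaned)
```

The cleaned string could then be passed to `mechanize.make_response` and `browser.set_response` in the same way as the hand-edited HTML above. This is a blunt instrument: it would also mangle CDATA sections or a DOCTYPE with an internal subset, so it is only suitable as a diagnostic step.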