Open StephenOTT opened 11 years ago
Disclaimer page:
Selecting "I have read and understood the above. Click here to enter the site." appends cookie=t to the URL.
Including the cookie=t parameter in the URL allows a browser to bypass the Disclaimer page.
This is needed when trying to crawl the application.
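A minimal sketch of adding the parameter from a crawler, in Python. The helper name `with_disclaimer_bypass` is hypothetical, and the example URL is the landing page listed in this issue; whether every page of the application honours cookie=t is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_disclaimer_bypass(url: str) -> str:
    """Return `url` with cookie=t set in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any earlier cookie value
    # so the parameter is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "cookie"
    ]
    query.append(("cookie", "t"))
    return urlunsplit(parts._replace(query=urlencode(query)))


landing = "http://app06.ottawa.ca/cgi-bin/search/inspections/q.pl?ss=home_en&qt=fsi_en"
print(with_disclaimer_bypass(landing))
```

Calling the helper on a URL that already carries cookie=t returns it unchanged, so it can be applied to every link a crawler follows.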
I have not found a way to view all results in a single view.
See #3 for more details
Selecting a link in the search results list, or the Inspections link on the right side of the table, opens the following page for the specific location/business.
A lot of information is found in the XML feed that is not shown in the rendered output.
XSLT is used to format the page.
In Chrome, open the Web Developer Tools and go to the Network tab. Then open the specific page, for example the Search Results page:
If you were already on the page when you opened the Network tab, reload the page.
Scroll to the top of the list of Network items and select the item that reads "q.pl?ss=....."
Select the Response tab that appears
Then select all of the text in the response box and paste it into an XML viewing tool like: http://xmlgrid.net/
Paste your XML and select Submit:
You end up with a Tree view. Expand the tree a few levels down using the arrows:
Using the above, you can repeat this process to explore the XML tree and understand the data structure.
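The same exploration can be sketched in Python once a response has been copied out of the Network tab. This is only an illustration: the `outline` helper is hypothetical, and the element names in `sample` are placeholders, not the real structure of the q.pl response.

```python
import xml.etree.ElementTree as ET


def outline(elem, depth=0, max_depth=3):
    """Return indented tag names, expanding the tree at most `max_depth` levels."""
    attrs = " ".join(f'{key}="{value}"' for key, value in elem.attrib.items())
    lines = ["  " * depth + elem.tag + (f" [{attrs}]" if attrs else "")]
    if depth < max_depth:
        for child in elem:
            lines.extend(outline(child, depth + 1, max_depth))
    return lines


# Placeholder XML; paste the text copied from the Response tab here instead.
sample = (
    '<response><result count="1">'
    "<doc><name>Example Cafe</name></doc>"
    "</result></response>"
)

for line in outline(ET.fromstring(sample)):
    print(line)
```

Raising `max_depth` expands the tree further, much like clicking the arrows in the tree view.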
There are two XML responses that are processed: the Search Results response and the Business/Facility response.
Search Results Response:
Business/Facility Response:
Landing Page
http://app06.ottawa.ca/cgi-bin/search/inspections/q.pl?ss=home_en&qt=fsi_en