Improve parser detection of unhandled content

The parser now records the ID of every tag it sees as it goes, then compares those IDs against a list of IDs extracted from the document using XPath. If the two lists differ, it throws an exception.
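The check described above can be sketched roughly as follows. This is a minimal illustration, not the parser's actual code: the function names, the `UnhandledContentError` class, the `handled_tags` parameter, and the assumption that each tag carries its ID in an `id` attribute are all hypothetical, and it uses the standard library's `xml.etree.ElementTree` so that it runs on its own.

```python
import xml.etree.ElementTree as ET


class UnhandledContentError(Exception):
    """Raised when the document contains tag IDs the parser never visited."""


def find_unhandled_ids(xml_text, handled_tags):
    """Return the sorted IDs of tags that exist in the document but were not seen.

    `handled_tags` is a hypothetical stand-in for "the tags the parser knows
    how to process"; the `id` attribute name is also an assumption.
    """
    root = ET.fromstring(xml_text)

    # Pass 1: simulate the parser walking the tree and recording each ID it handles.
    seen_ids = set()
    for element in root.iter():
        if element.tag in handled_tags and "id" in element.attrib:
            seen_ids.add(element.get("id"))

    # Pass 2: independently collect every ID below the root with an XPath query.
    expected_ids = {el.get("id") for el in root.findall(".//*[@id]")}

    # Anything expected but never seen is unhandled content.
    return sorted(expected_ids - seen_ids)


def check_all_handled(xml_text, handled_tags):
    """Raise if any tag ID in the document was never visited."""
    missing = find_unhandled_ids(xml_text, handled_tags)
    if missing:
        raise UnhandledContentError("Unhandled tag IDs: " + ", ".join(missing))
```

Comparing sets of IDs rather than tag names means a tag type that is handled in one context but silently skipped in another still shows up as a difference.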

There are also a number of parser improvements in here, found in the process of making sure that everything is parsed correctly:

- Fixes the parser failing to pick up all the text when there is more than one hs_Para element inside a Question tag
- Fixes broken table parsing code
- Fixes some content inside division tags being missed
- Correctly handles clause tags as part of the immediately following Amendment
- Makes hs_2cDebatedMotion a major heading
- Fixes some content inside new debate tags being missed
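As an illustration of the first fix in the list, collecting the text from every hs_Para inside a Question (rather than only the first one) might look like the sketch below. The `question_text` helper is hypothetical and is not taken from the parser; it assumes the hs_Para elements are not nested inside one another.

```python
import xml.etree.ElementTree as ET


def question_text(question):
    """Join the text of every hs_Para found inside a Question element."""
    paragraphs = []
    for para in question.iter("hs_Para"):
        # itertext() also picks up text inside inline child tags such as <i>.
        text = "".join(para.itertext()).strip()
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)
```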

It also adds a script to make re-parsing easier.

Fixes #54 Fixes #66
