rgarner / cma-tna-crawlers

Scraping old cases from TNA for CMA, no TLAs.

CA98 cases not getting body #28

Closed rgarner closed 9 years ago

rgarner commented 9 years ago

No body is being filled in for the CA98 case JSON.

We can see this because each CA98 JSON file has only the title, the original_url(s), and whatever was in the spreadsheet after augmentation, with no trace of markup.
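One way to confirm which cases are affected is to scan the generated records for a missing or blank body. This is only a minimal sketch: the key names `title`, `original_urls`, and `body` are assumptions based on the description above, not taken from the crawler's actual output, and the sample records are made up.

```python
def cases_missing_body(records):
    """Return the titles of records whose 'body' is absent or blank.

    The key names ('title', 'body') are assumed for illustration only.
    """
    missing = []
    for record in records:
        body = record.get("body")
        # Treat a missing key, a non-string value, or whitespace as "no body".
        if not isinstance(body, str) or not body.strip():
            missing.append(record.get("title", "<untitled>"))
    return missing


# Made-up sample records, not real crawler output.
sample = [
    {
        "title": "Case with markup",
        "original_urls": ["http://example.invalid/a"],
        "body": "<p>Summary</p>",
    },
    {
        "title": "Case without markup",
        "original_urls": ["http://example.invalid/b"],
    },
]

print(cases_missing_body(sample))
```

Running something like this over the ca98 output would show whether every case is affected or only some URL shapes.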

Perhaps CASE_DETAIL is not being matched in the crawler for these types of URLs? For example, only case list data is available for http://www.oft.gov.uk/OFTwork/competition-act-and-cartels/ca98/decisions/reckitt-benckiser
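To illustrate how that kind of mismatch could happen, here is a rough sketch that tests the URL above against two patterns. Both patterns are hypothetical stand-ins invented for this example; the crawler's real CASE_DETAIL is not shown in this issue and may look quite different.

```python
import re
from urllib.parse import urlparse

URL = (
    "http://www.oft.gov.uk/OFTwork/competition-act-and-cartels/"
    "ca98/decisions/reckitt-benckiser"
)

# Hypothetical pattern: allows exactly three path segments after /OFTwork/.
NARROW_CASE_DETAIL = re.compile(r"^/OFTwork/[^/]+/[^/]+/[^/]+/?$")

# Hypothetical pattern: allows any depth, as long as the path ends in
# decisions/<case-slug>.
BROAD_CASE_DETAIL = re.compile(r"^/OFTwork/(?:[^/]+/)+decisions/[^/]+/?$")


def matches_case_detail(url, pattern):
    """Return True if the URL's path matches the given compiled pattern."""
    return pattern.search(urlparse(url).path) is not None


# The ca98 URL has four segments after /OFTwork/, so the narrow pattern
# rejects it while the broad one accepts it.
print(matches_case_detail(URL, NARROW_CASE_DETAIL))
print(matches_case_detail(URL, BROAD_CASE_DETAIL))
```

If the real pattern turns out to be similarly strict about path depth, the extra `ca98/decisions/` segments would explain why these pages are never treated as case detail pages.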