scraperwiki / spreadsheet-download-tool

A ScraperWiki plugin for downloading data from a box as a CSV or Excel spreadsheet
BSD 2-Clause "Simplified" License

Spreadsheet Download should handle row-spans in source grids #53

Open zarino opened 10 years ago

zarino commented 10 years ago

Rowspans in source grids (e.g. http://www.globaldairytrade.info/Results/WholeMilkPowderWMP.aspx) currently raise an assertion error:

[Screenshot, 2013-11-08 at 15:01:09]

It looks like someone knew rowspans wouldn't work, but didn't have time to handle them. Now's the time ;-)

pwaller commented 10 years ago

cc @drj11

drj11 commented 10 years ago

We knew rowspans didn't work, but had forgotten that we'd changed messytables and pdftables so that it was possible to generate them with suitable HTML input.

zarino commented 10 years ago

Somebody just came across this again: https://www.intercom.io/apps/63b0c6d4bb5f0867b6e93b0be9b569fb3a7ab1e3/messages/1084631/message_threads/1384747

dragondave commented 10 years ago

https://github.com/scraperwiki/messytables/blob/master/messytables/html.py#L94 resolves a similar issue, in parsing HTML colspans to CSV. Might be stealable.
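For anyone picking this up, here is a rough sketch of the general idea: expand every spanned cell into each grid position it covers, so the output is a plain rectangle that can go straight to CSV. This is not the messytables code linked above. The names `expand_spans` and `to_csv` and the `(text, rowspan, colspan)` input shape are hypothetical, and repeating the spanned value (rather than leaving the covered cells blank) is just one assumed policy.

```python
import csv
import io


def expand_spans(rows):
    """Expand (text, rowspan, colspan) cells into a rectangular grid.

    Hypothetical helper: the spanned value is repeated in every
    position it covers (an assumed policy, not the tool's behaviour).
    """
    grid = {}  # (row, col) -> text
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already claimed by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    # Positions never covered by any cell become empty strings.
    return [[grid.get((r, c), "") for c in range(n_cols)]
            for r in range(n_rows)]


def to_csv(grid):
    """Serialise a rectangular grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Illustrative input only: a header cell spanning two rows and
# another spanning two columns.
source = [
    [("Product", 2, 1), ("Period 1", 1, 2)],
    [("Price", 1, 1), ("Change", 1, 1)],
    [("item", 1, 1), ("10", 1, 1), ("+1", 1, 1)],
]

expanded = expand_spans(source)
print(to_csv(expanded))
```

Keeping the grid as a dict keyed by `(row, col)` means a rowspan from an earlier row can claim cells in rows that have not been visited yet, which is the part a purely row-by-row colspan fix would miss.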

(Sent by email in reply to Zarino Zappia's comment of 14 January 2014, 11:50.)

zarino commented 10 years ago

Another request to fix this:

https://www.intercom.io/apps/63b0c6d4bb5f0867b6e93b0be9b569fb3a7ab1e3/messages/1230652/message_threads/1583715