DataKind-BLR / PrathamBooks-Sprint-2018

Code and documentation for the collaboration with PrathamBooks during Sprint 2018
MIT License

Script to extract content from html pages and then merge pages for each story #13

Closed: arnabbiswas1 closed this 6 years ago

arnabbiswas1 commented 6 years ago

In this script I have used pandas to extract the content from the HTML pages (issue #1) and then merge the pages of each story (#2), so that each row of the resulting CSV holds the complete content of one story.
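
For reviewers who want a quick picture of the approach, here is a minimal sketch of the two steps, not the script in this PR. It assumes the pages are already in a DataFrame; the column names `story_id`, `page_no` and `html`, the helper names, and the sample rows are all hypothetical. Tags are stripped with the standard-library `html.parser`, which may differ from what the actual script does.

```python
from html.parser import HTMLParser

import pandas as pd


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the text of `html` with tags removed and whitespace collapsed.

    Text nodes are joined with single spaces, so text inside inline tags
    ends up separated from its neighbours by a space.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def merge_story_pages(pages):
    """Return one row per story, with its page texts joined in page order.

    Assumes hypothetical columns: story_id, page_no, html.
    """
    pages = pages.copy()
    pages["text"] = pages["html"].apply(html_to_text)
    return (
        pages.sort_values(["story_id", "page_no"])
        .groupby("story_id")["text"]
        .agg(" ".join)
        .reset_index(name="content")
    )


# Hypothetical sample input: two pages for story 1 (out of order), one for story 2.
pages = pd.DataFrame(
    {
        "story_id": [1, 1, 2],
        "page_no": [2, 1, 1],
        "html": [
            "<html><body><p>The end.</p></body></html>",
            "<p>Once upon a time.</p>",
            "<div>A short story.</div>",
        ],
    }
)

stories = merge_story_pages(pages)
csv_text = stories.to_csv(index=False)  # pass a file path instead to write the CSV to disk
```

Sorting by page number before grouping is what keeps the merged text in reading order; without it the pages would be joined in whatever order the rows happen to appear.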

@githubssn @deepakshankar94 @dnithinraj @SahilKuchlous Please review.

arnabbiswas1 commented 6 years ago

Thanks @ramyaragupathy !