thegreatshasha / ailegal

The ai lawyer

Supreme Court Scraper Strategy #8

Open thegreatshasha opened 9 years ago

thegreatshasha commented 9 years ago

How do we crawl Supreme Court judgements?

thegreatshasha commented 9 years ago

Use one large hashmap to store the scraped data.
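A minimal sketch of what such a hashmap could look like, assuming records are keyed by judgement URL. The `Judgement` fields and the `JudgementStore` name are hypothetical, not something decided in this thread:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Judgement:
    # Hypothetical record layout; these field names are assumptions.
    url: str
    title: Optional[str] = None
    text: Optional[str] = None


class JudgementStore:
    """In-memory hashmap of judgements keyed by URL."""

    def __init__(self) -> None:
        self._by_url: Dict[str, Judgement] = {}

    def add(self, judgement: Judgement) -> bool:
        """Insert a record; return False if the URL was already stored."""
        if judgement.url in self._by_url:
            return False
        self._by_url[judgement.url] = judgement
        return True

    def get(self, url: str) -> Optional[Judgement]:
        """Look up a record by URL, or return None if it is missing."""
        return self._by_url.get(url)

    def __len__(self) -> int:
        return len(self._by_url)
```

Keying on the URL makes duplicate detection a single dict lookup, which matters if the same judgement shows up on more than one result page.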

Strategy 1:

Download from the Supreme Court website itself:

Go to http://judis.nic.in/supremecourt/DateQry.aspx

Enter the from date as 01/Jan/1950 and the to date as 31/Dec/2015.
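Judging only by the example values, the form seems to take dates as dd/Mon/yyyy. A small sketch for producing such strings, using the last valid calendar day of 2015 as the end of the range. The helper name `format_query_date` is hypothetical, and whether the form accepts exactly this format is an assumption:

```python
from datetime import date

# Spelled out explicitly so the result does not depend on the process locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_query_date(d: date) -> str:
    """Format a date as dd/Mon/yyyy, for example 01/Jan/1950."""
    return f"{d.day:02d}/{MONTHS[d.month - 1]}/{d.year}"


# date() rejects impossible days, so a bad range fails here and not on the site.
from_date = format_query_date(date(1950, 1, 1))
to_date = format_query_date(date(2015, 12, 31))
```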

Open the popup.

Once you get the pagination links, open each one that is not already in the link queue.

Skip the first link. Keep finding pagination links and adding them until you reach the last one (its text should be ...).

Then click that link and repeat this process. Also grab the links to all judgements.
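The pagination walk above could be sketched as a queue-based crawl. This is a simplification, not the exact procedure: it does not special-case the first link or the "..." link, and simply queues every pagination link it has not seen before, which should end up visiting the same pages. `fetch`, `is_pagination` and `is_judgement` are hypothetical callables supplied by the caller, and the sketch assumes pagination links are ordinary `href`s. If the site pages through form postbacks, as ASP.NET pages often do, `fetch` would have to submit the form instead.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List, Optional, Set, Tuple
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl(
    start_url: str,
    fetch: Callable[[str], str],
    is_pagination: Callable[[str, str], bool],
    is_judgement: Callable[[str, str], bool],
) -> Set[str]:
    """Visit every reachable result page once and return all judgement URLs."""
    queue = deque([start_url])
    seen_pages = {start_url}
    judgements: Set[str] = set()
    while queue:
        page_url = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(page_url))
        for href, text in parser.links:
            url = urljoin(page_url, href)
            if is_judgement(url, text):
                judgements.add(url)
            elif is_pagination(url, text) and url not in seen_pages:
                # Only queue pages that are not already in the link queue.
                seen_pages.add(url)
                queue.append(url)
    return judgements
```

Passing `fetch` in as a parameter keeps the traversal testable against canned HTML before pointing it at the real site.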

Once done, confirm that you have a list of around 3000 links.

Download these in whichever format you want, then store and index them.
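One possible shape for the download step, as a sketch only. It assumes "index" means a plain URL-to-file lookup kept in `index.json`; full-text indexing would be a separate step. `download_all` and `fetch_bytes` are hypothetical names, and the `.bin` suffix is a placeholder since the format is left open above.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, Iterable


def download_all(
    urls: Iterable[str],
    fetch_bytes: Callable[[str], bytes],
    out_dir: Path,
) -> Dict[str, str]:
    """Save each URL to out_dir and return a URL -> file name index."""
    out_dir.mkdir(parents=True, exist_ok=True)
    index_path = out_dir / "index.json"
    # Reuse an index from an earlier run so the job can be resumed.
    index: Dict[str, str] = (
        json.loads(index_path.read_text()) if index_path.exists() else {}
    )
    for url in urls:
        if url in index:
            continue  # already downloaded in an earlier run
        # Hash the URL to get a stable, filesystem-safe file name.
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".bin"
        (out_dir / name).write_bytes(fetch_bytes(url))
        index[url] = name
    index_path.write_text(json.dumps(index, indent=2, sort_keys=True))
    return index
```

After a full run, `len(index)` gives a quick way to check the count against the expected figure of around 3000.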