Scrape judgments with selenium

SriPrarabdha / LegalBrain-VectorSearch


Open SriPrarabdha opened 1 year ago

SriPrarabdha commented 1 year ago

Starter code is available here.

Run a loop over this code and download the PDFs of all judgments.
Keep all those PDFs in a zip file or folder on your local machine; do not push them to the repo.
Use PdfReader (from PyPDF2 import PdfReader) to read all those PDFs.
Put the data together in the JSON form prescribed earlier:
data = [{"id": 43242, "tagline": " ", "date": " ", "judgment": " "}]
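The reading and JSON steps above could look roughly like the sketch below. It is only an illustration under assumptions: the folder name judgments, the output file judgments.json, and the helper names extract_text, make_record and build_dataset are all hypothetical, the id is a placeholder running index rather than the real judgment id, and tagline and date are left empty because the thread does not say where they come from.

```python
import json
from pathlib import Path


def extract_text(pdf_path):
    """Return the text of every page in one PDF, joined by newlines."""
    # Imported lazily so the other helpers work even without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() may return nothing for scanned pages, so fall back to "".
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def make_record(record_id, judgment, tagline="", date=""):
    """Build one entry in the JSON shape given in the issue."""
    return {
        "id": record_id,
        "tagline": tagline,
        "date": date,
        "judgment": judgment.strip(),
    }


def build_dataset(pdf_dir, read_pdf=extract_text):
    """Read every *.pdf in pdf_dir (sorted by name) into a list of records."""
    records = []
    for index, pdf_path in enumerate(sorted(Path(pdf_dir).glob("*.pdf")), start=1):
        # Assumption: a running index stands in for the real judgment id.
        records.append(make_record(index, read_pdf(pdf_path)))
    return records


if __name__ == "__main__":
    # Hypothetical local folder; keep it out of the repo as requested above.
    data = build_dataset("judgments")
    Path("judgments.json").write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Passing the reader in as read_pdf keeps the JSON assembly separate from PyPDF2, so the record format can be checked without any real PDFs on disk.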

devdhruvper commented 1 year ago

The script for downloading is ready; the fetching part still needs to be implemented.