SriPrarabdha / LegalBrain-VectorSearch


Scrape judgments with selenium #13

Open SriPrarabdha opened 1 year ago

SriPrarabdha commented 1 year ago

Starter code is present here.

  1. Run a loop over this code and download the PDFs of all judgments.
  2. Keep all those PDFs in a zip file or folder on your local machine; do not push them to the repo.
  3. Use from PyPDF2 import PdfReader to read all those PDFs.
  4. Put together the data in the JSON form prescribed earlier:
  5. data = [{"id" : 43242 , "tagline" : " " , "date" : " " , "judgment" : " "}]
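
Steps 3 to 5 could be sketched roughly as below. This is only a minimal sketch, not the project's actual code: the function names (extract_pdf_text, build_records, save_records), the output file name, and the idea of taking the id from the PDF file name are all assumptions made for illustration. The issue does not say where tagline and date come from, so they are left blank here.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def extract_pdf_text(pdf_path: Path) -> str:
    """Return the text of every page of one PDF, joined by newlines."""
    # Imported lazily so the rest of the sketch can be used without PyPDF2.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() may yield nothing for scanned pages, hence the `or ""`.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def build_records(
    pdf_dir: Path,
    extract: Callable[[Path], str] = extract_pdf_text,
) -> List[Dict[str, object]]:
    """Build one record per PDF in pdf_dir, in the shape given in step 5."""
    records: List[Dict[str, object]] = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        stem = pdf_path.stem
        records.append({
            # Assumption: the file name (without .pdf) doubles as the id.
            "id": int(stem) if stem.isdigit() else stem,
            "tagline": " ",   # source of this field is not stated in the issue
            "date": " ",      # source of this field is not stated in the issue
            "judgment": extract(pdf_path),
        })
    return records


def save_records(records: List[Dict[str, object]], out_path: Path) -> None:
    """Write the records to a JSON file (keep it out of the repo, like the PDFs)."""
    out_path.write_text(
        json.dumps(records, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

Calling build_records(Path("judgments")) followed by save_records(records, Path("judgments.json")) would then produce the JSON file locally; both paths are placeholders.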

devdhruvper commented 1 year ago

The script for downloading is ready; now the fetching part needs to be implemented.