NTI-Gymnasiet-Nacka / projekt-1-tital00s-grupp-erik

projekt-1-tital00s-grupp-erik created by GitHub Classroom

PDF Metadata scraping #1

Closed Abishevs closed 7 months ago

Abishevs commented 8 months ago

Given a URL that points to a PDF, download the PDF with a tool like requests, then scrape its metadata to extract the data specified in #2.
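A minimal sketch of the download step, assuming the third-party `requests` package is installed. The function names `download_pdf` and `looks_like_pdf` and the 30-second timeout are illustrative choices, not something decided in this issue.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes start with the PDF magic header '%PDF-'."""
    return data.startswith(b"%PDF-")


def download_pdf(url: str, timeout: float = 30.0) -> bytes:
    """Download the resource at `url` and return its raw bytes.

    Raises requests.HTTPError on a 4xx/5xx response and ValueError
    if the body does not look like a PDF.
    """
    # Imported here so looks_like_pdf() can be used without requests installed.
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    data = response.content
    if not looks_like_pdf(data):
        raise ValueError(f"{url} did not return a PDF")
    return data
```

Checking the `%PDF-` header guards against links that return an HTML page instead of the file itself.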

Libraries to consider but not limited to:

Abishevs commented 7 months ago

Metadata scraping: PyPDF2
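A rough sketch of what reading the metadata with PyPDF2 could look like, assuming PyPDF2 2.x or later (which provides `PdfReader` and its `metadata` property). The helper names `normalize_metadata` and `read_pdf_metadata` are hypothetical, and which fields to keep depends on #2.

```python
import io


def normalize_metadata(raw) -> dict:
    """Turn PDF info-dictionary keys such as '/Title' into plain 'Title' keys.

    Entries whose value is None are dropped; all other values become strings.
    """
    if not raw:
        return {}
    return {
        str(key).lstrip("/"): str(value)
        for key, value in raw.items()
        if value is not None
    }


def read_pdf_metadata(pdf_bytes: bytes) -> dict:
    """Read the document information dictionary from in-memory PDF bytes."""
    # Imported here so normalize_metadata() works without PyPDF2 installed.
    from PyPDF2 import PdfReader  # third-party: pip install PyPDF2

    reader = PdfReader(io.BytesIO(pdf_bytes))
    # reader.metadata is dict-like, or None when the PDF has no info dictionary.
    return normalize_metadata(reader.metadata)
```

Many PDFs carry only some of the standard fields (Title, Author, Producer, CreationDate, and so on), so callers should treat every key as optional.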

EddieE84 commented 7 months ago

Some pages have embedded PDFs. In that case the HTML contains information about who created the document, possibly the title and similar details.
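One way the HTML side could be handled, sketched with only the standard library. It assumes the author and title appear in `<meta>` tags or the `<title>` element, which will not hold for every site; `MetaCollector` and `html_metadata` are made-up names for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta name/property=... content=...> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Plain HTML uses name=..., Open Graph tags use property=...
            key = attributes.get("name") or attributes.get("property")
            content = attributes.get("content")
            if key and content:
                self.meta[key.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def html_metadata(html: str) -> dict:
    """Return the meta tags of an HTML page, plus 'title' if not already set."""
    parser = MetaCollector()
    parser.feed(html)
    result = dict(parser.meta)
    title = parser.title.strip()
    if title:
        result.setdefault("title", title)
    return result
```

This could serve as a fallback when the PDF itself has an empty information dictionary.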