It would be better to fetch document info such as title and author by reading the first page, because PyPDF4 returns empty values for PDFs that don't have those fields in their metadata. Maybe I'm misunderstanding the purpose of PyPDF, but my impression was that this kind of info came from reading the actual PDF and extracting it from the content. Otherwise one could just use pdfinfo from poppler-utils to accomplish the same task. As I mentioned, the problem is that when the title and author are missing from the metadata, one has to read and extract them from the PDF itself.
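
To make the request concrete, here is a rough sketch of the fallback I have in mind. It is not something PyPDF4 does today. The helper names (`guess_title_from_text`, `title_with_fallback`, `read_title`) are made up for illustration, and "first non-empty line of page one" is only a crude heuristic I am assuming for the example. Extracted text often loses its line breaks, so a real implementation would need something smarter.

```python
def guess_title_from_text(first_page_text):
    """Return the first non-empty line of the page text, or None.

    Hypothetical heuristic: assumes the title is the first line on page one.
    """
    for line in first_page_text.splitlines():
        line = line.strip()
        if line:
            return line
    return None


def title_with_fallback(metadata_title, first_page_text):
    """Prefer the metadata title; fall back to the page-text heuristic."""
    if metadata_title and metadata_title.strip():
        return metadata_title.strip()
    return guess_title_from_text(first_page_text or "")


def read_title(path):
    """Read a title from a PDF file, using page text when metadata is empty."""
    # Imported here so the helpers above work without PyPDF4 installed.
    from PyPDF4 import PdfFileReader

    with open(path, "rb") as fh:
        reader = PdfFileReader(fh)
        info = reader.getDocumentInfo()  # can be None when there is no Info dictionary
        metadata_title = info.title if info is not None else None
        first_page_text = reader.getPage(0).extractText()
    return title_with_fallback(metadata_title, first_page_text)
```

Calling `read_title("paper.pdf")` (the path is a placeholder) would return the metadata title when it is set, and otherwise the first non-empty line extracted from page one. The same pattern could be applied to the author, although guessing an author from page text is even less reliable.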