MicheleCotrufo / pdf2doi

A python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a pdf file.

Request to dx.doi.org doesn't finish #35

Open c-langlet opened 1 month ago

c-langlet commented 1 month ago

There seems to be a bug when querying the dx.doi.org server through the requests API to check that a DOI exists.

I managed to work around the bug by changing the url definition in pdf2doi.finders.validate_doi_web (line 48) from url = "http://dx.doi.org/" + doi to url = "https://dx.doi.org/" + doi
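For anyone who wants to try the same idea outside the library, here is a minimal sketch. It is not the actual pdf2doi code: build_doi_url and doi_resolves are hypothetical names, the 10-second timeout is an arbitrary choice, and treating any non-error final status as "the DOI exists" is a simplifying assumption. The explicit timeout matters because requests does not apply one by default, so a stalled connection can otherwise wait indefinitely.

```python
def build_doi_url(doi, scheme="https"):
    """Return the resolver URL for a DOI (hypothetical helper, not part of pdf2doi)."""
    return f"{scheme}://dx.doi.org/{doi.strip()}"


def doi_resolves(doi, timeout=10.0):
    """Return True if the resolver answers with a non-error status within `timeout` seconds.

    Assumption: a final status below 400 is taken to mean the DOI exists.
    """
    # Imported lazily so build_doi_url can be used without requests installed.
    import requests

    try:
        # requests has no default timeout, so pass one explicitly.
        response = requests.get(build_doi_url(doi), timeout=timeout)
    except requests.exceptions.RequestException:
        # Covers timeouts, connection errors, and too many redirects.
        return False
    return response.ok
```

Calling doi_resolves("10.1000/182") would then return either True or False after at most roughly the timeout per connection phase, instead of blocking forever.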

My configuration:

python 3.10
pdf2doi 1.6
requests 2.25.1

Thanks for the useful module!

MicheleCotrufo commented 1 month ago

Thanks for letting us know. It might be some change on the doi.org website. Which error message do you get?

c-langlet commented 1 month ago

I don't get any error message; the request simply never finishes.

MicheleCotrufo commented 1 month ago

Thanks, I'll change the URL as you suggested in the next version.