sckott / habanero

client for Crossref search API
https://habanero.readthedocs.io
MIT License
205 stars, 30 forks

Replacement of forward slash in URL #99

Open tardigradus opened 2 years ago

tardigradus commented 2 years ago

With Python 3.6.8 and habanero 0.7.4, the / in the DOI part of the url field is replaced by %2F. For example, the BibTeX entry returned by www.doi2bib.org for the DOI 10.1021/acs.jpcc.0c05161 has the element

url = {https://doi.org/10.1021/acs.jpcc.0c05161}

However, when I run

bibtex_string = habanero.cn.content_negotiation(ids = doi)

with that DOI, the string I obtain contains

url = {https://doi.org/10.1021%2Facs.jpcc.0c05161}

Is this supposed to happen?

sckott commented 2 years ago

Thanks for the issue. Can you show a complete example, including what's returned?

If you do a curl request in bash etc., the url is already like that in the response, so it's not habanero:

curl -L -H "Accept: application/x-bibtex" "https://doi.org/10.1021/acs.jpcc.0c05161"
@article{Shao_2020,
    doi = {10.1021/acs.jpcc.0c05161},
    url = {https://doi.org/10.1021%2Facs.jpcc.0c05161},
    year = 2020,
    month = {oct},
    publisher = {American Chemical Society ({ACS})},
    volume = {124},
    number = {43},
    pages = {23479--23489},
    author = {Jingjing Shao and Vincent Pohl and Lukas Eugen Marsoner Steinkasserer and Beate Paulus and Jean Christophe Tremblay},
    title = {Electronic Current Mapping of Transport through Defective Zigzag Graphene Nanoribbons},
    journal = {The Journal of Physical Chemistry C}
}

tardigradus commented 2 years ago

Thanks for looking at this. An MWE is

import habanero
bibtex_string = habanero.cn.content_negotiation(ids='10.1021/acs.jpcc.0c05161')
print(bibtex_string)

which results in

@article{Shao_2020,
        doi = {10.1021/acs.jpcc.0c05161},
        url = {https://doi.org/10.1021%2Facs.jpcc.0c05161},
        year = 2020,
        month = {oct},
        publisher = {American Chemical Society ({ACS})},
        volume = {124},
        number = {43},
        pages = {23479--23489},
        author = {Jingjing Shao and Vincent Pohl and Lukas Eugen Marsoner Steinkasserer and Beate Paulus and Jean Christophe Tremblay},
        title = {Electronic Current Mapping of Transport through Defective Zigzag Graphene Nanoribbons},
        journal = {The Journal of Physical Chemistry C}
}

I was comparing this with the output on the page

https://www.doi2bib.org/bib/10.1021/acs.jpcc.0c05161

but this comparison is obviously invalid. Am I just stuck with what Crossref returns and have to tweak the string myself, or are there some knobs I can turn when I access the API?

sckott commented 2 years ago

Thanks for the example.

Looks like an easy fix is to do:

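import habanero
from urllib.parse import unquote

bibtex_string = habanero.cn.content_negotiation(ids='10.1021/acs.jpcc.0c05161')
# decode %2F back to / (note: this decodes every percent-escape in the string)
bibtex_string = unquote(bibtex_string)
print(bibtex_string)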

I may put something like this in habanero, but I'm waiting to hear back to see if it can be fixed on their side
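
Calling unquote on the whole entry also decodes any percent-escape that happens to appear in other fields, such as a title. A narrower sketch, assuming the entry uses the `url = {...}` layout shown above, decodes only the url field. The helper name `unquote_bibtex_url` and the regex are illustrative and not part of habanero; the sample string is copied from the output above so the snippet runs without a network call.

```python
import re
from urllib.parse import unquote

# Hypothetical helper, not part of habanero.
# Matches a BibTeX field of the form: url = {...}
_URL_FIELD = re.compile(r"(\burl\s*=\s*\{)([^}]*)(\})", re.IGNORECASE)


def unquote_bibtex_url(bibtex: str) -> str:
    """Percent-decode only the contents of url = {...} fields."""
    return _URL_FIELD.sub(
        lambda m: m.group(1) + unquote(m.group(2)) + m.group(3),
        bibtex,
    )


# Shortened copy of the entry returned for 10.1021/acs.jpcc.0c05161
sample = (
    "@article{Shao_2020,\n"
    "    doi = {10.1021/acs.jpcc.0c05161},\n"
    "    url = {https://doi.org/10.1021%2Facs.jpcc.0c05161},\n"
    "    pages = {23479--23489}\n"
    "}"
)

print(unquote_bibtex_url(sample))
```

With habanero, the same helper could be applied to the string returned by habanero.cn.content_negotiation(ids=...) in place of `sample`.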

tardigradus commented 2 years ago

Thanks, WFM!

sckott commented 2 years ago

Upstream issue: https://gitlab.com/crossref/issues/-/issues/1612

It's in their backlog for now; I'll keep this open until it's fixed.

sckott commented 1 year ago

Still in the backlog; seems like it's not getting fixed.