thchr / DOI2BibTeX.jl

Output shortened wrongly? #7

Closed kellertuer closed 10 months ago

kellertuer commented 10 months ago

Thanks for providing this nice package; I actually use it regularly. Today I stumbled upon a strange error with the following call:

julia> using DOI2BibTeX
julia> doi2bib("10.1016/j.jat.2007.03.002"; abbreviate=false)
 @article{popiel2007,

julia>

The output seems to be truncated somehow. I have not yet had the chance to dig into this package; maybe you can see more easily what is happening here.

thchr commented 10 months ago

I noticed this as well the other day but haven't gotten around to it yet.

I'm guessing this is due either to a change in what is "delivered" by the HTTPS request or to some change in a more recent version of Julia, though the latter seems less likely.

kellertuer commented 10 months ago

OK, I can try to find some time on the weekend to take a look at how you get the data :)

thchr commented 10 months ago

I think the issue is that the fields used to be introduced with a \t, but now they are introduced with a plain space, so my regex hacks fail. In the short term, we can do a new regex hack; in the longer term, the package should use a BibTeX parser.
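
To illustrate the failure mode described above, here is a minimal sketch. It is not the package's actual code: the two input strings, the patterns `tab_only` and `any_space`, and the helper `field_names` are all hypothetical, chosen only to show how a pattern that insists on a leading tab stops matching once the fields are separated by plain spaces.

```julia
# Hypothetical inputs: the same entry with tab-introduced vs. space-introduced fields.
old_style = "@article{Popiel_2007,\n\ttitle={A},\n\tvolume={148}\n}"
new_style = " @article{Popiel_2007, title={A}, volume={148} }"

# A pattern that requires a tab before each field name (assumed old behavior).
tab_only = r"\t(\w+)\s*=\s*\{"
# A pattern that accepts any single whitespace character before the field name.
any_space = r"\s(\w+)\s*=\s*\{"

# Collect the captured field names for a given pattern and input string.
field_names(pattern, s) = [m.captures[1] for m in eachmatch(pattern, s)]

field_names(tab_only, old_style)   # finds both fields
field_names(tab_only, new_style)   # finds nothing, so the output ends up truncated
field_names(any_space, new_style)  # finds both fields again
```

Loosening the pattern is still a regex hack: a field value that itself contains something like ` x={` would produce a false match, which is why a proper parser is the better long-term option.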

kellertuer commented 10 months ago

Yeah, there is https://github.com/Humans-of-Julia/BibParser.jl, but it has been a bit inactive recently, and I have not yet had the time to help there either.

thchr commented 10 months ago

I also have this unregistered parser: https://github.com/thchr/SimpleBibTeX.jl. Less sophisticated, but works well enough.

thchr commented 10 months ago

https://github.com/thchr/DOI2BibTeX.jl/commit/bc459c98e83b0aec55d23589b92122e223719456 should fix this, at least in the short term. Longer term, we do need to change to a BibTeX parser (now logged in #8).

Your example above is no longer truncated (the database gives an ugly title, but fixing that is outside the scope of this package):

julia> doi2bib("10.1016/j.jat.2007.03.002"; abbreviate=false)
WARNING: redefinition of constant PREPOSITIONS_CONJUNCTIONS_ARTICLES. This may fail, cause incorrect answers, or produce other errors.
@article{popiel2007bézier,
  title = {Bézier curves and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si1.gif" overflow="scroll"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math> interpolation in Riemannian manifolds},
  volume = {148},
  ISSN = {0021-9045},
  DOI = {10.1016/j.jat.2007.03.002},
  number = {2},
  journal = {Journal of Approximation Theory},
  author = {Popiel, Tomasz and Noakes, Lyle},
  year = {2007},
  pages = {111–127}
}

Compare with the raw return value of the GET request:

julia> println(DOI2BibTeX._doi2bib("10.1016/j.jat.2007.03.002"))
 @article{Popiel_2007, title={Bézier curves and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si1.gif" overflow="scroll"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math> interpolation in Riemannian manifolds}, volume={148}, ISSN={0021-9045}, url={http://dx.doi.org/10.1016/j.jat.2007.03.002}, DOI={10.1016/j.jat.2007.03.002}, number={2}, journal={Journal of Approximation Theory}, publisher={Elsevier BV}, author={Popiel, Tomasz and Noakes, Lyle}, year={2007}, month=oct, pages={111–127} }
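
As a rough idea of what a whitespace-independent approach could look like, the sketch below splits a single-line entry into its top-level fields by tracking brace depth instead of relying on tabs or spaces. The function `split_fields` is hypothetical and is not part of DOI2BibTeX.jl or of either parser mentioned above; it assumes brace-delimited values and handles neither quoted (`"..."`) values nor escaped braces. The input is a shortened version of the raw string shown above.

```julia
# Split one BibTeX entry into its citation key and top-level `key=value` chunks.
# Commas inside braces (e.g. in author lists) are not treated as separators.
function split_fields(entry::AbstractString)
    start = findfirst('{', entry)   # opening brace of the entry
    stop = findlast('}', entry)     # closing brace of the entry
    (start === nothing || stop === nothing) && return String[]
    body = entry[nextind(entry, start):prevind(entry, stop)]

    fields = String[]
    depth = 0                       # current brace nesting inside the body
    buf = IOBuffer()
    for c in body
        if c == '{'
            depth += 1
        elseif c == '}'
            depth -= 1
        end
        if c == ',' && depth == 0
            # A top-level comma ends the current chunk.
            push!(fields, strip(String(take!(buf))))
        else
            print(buf, c)
        end
    end
    push!(fields, strip(String(take!(buf))))
    return filter(!isempty, fields)
end

# Shortened, illustrative version of the raw entry.
raw = " @article{Popiel_2007, title={Bézier curves}, author={Popiel, Tomasz and Noakes, Lyle}, year={2007}, month=oct, pages={111–127} }"
fields = split_fields(raw)
```

The first element of `fields` is the citation key; the remaining elements are the individual fields, which can then be filtered and printed one per line regardless of how the server separated them.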

thchr commented 10 months ago

It will be ~15 minutes before the new version is registered, though.

kellertuer commented 10 months ago

Quick fix! Nice. Will check tomorrow :)