harrisonlabollita / arXiv.jl

A Julia wrapper for the arXiv API that generates a .bib file for LaTeX references.
MIT License

author edge cases #19

Closed harrisonlabollita closed 2 years ago

harrisonlabollita commented 3 years ago

Making a simple call, request("electron"), gives a couple of weird edge cases. For example,

@article{Prendergast
2001Impa,
title={Impact of Electron-Electron Cusp on Configuration Interaction Energies},
author={David Prendergast
      Department of Physics and M. Nolan
      NMRC, University College, Cork, Ireland and Claudia Filippi
      Department of Physics and Stephen Fahy
      Department of Physics and J. C. Greer
      NMRC, University College, Cork, Ireland},
year={2001},
journal={arXiv:cond-mat/0102536v1},
url={http://arxiv.org/abs/cond-mat/0102536v1}
}

We can probably fix this one now and deal with others as we catch them.

lanceXwq commented 2 years ago

Another case is

@article{Kivelson
2001Elec,
title={Electron Fractionalization},
author={S. A. Kivelson
      UCLA},
year={2001},
journal={arXiv:cond-mat/0106126v1},
url={http://arxiv.org/abs/cond-mat/0106126v1}
}

It seems this is triggered when the affiliations are written beside the list of authors (on the corresponding arXiv webpage). In that case the author string in our code becomes something like \n S. A. Kivelson\n UCLA\n, and after one strip we are still left with S. A. Kivelson\n UCLA as the first author.

Maybe having a second strip would solve it, but let me think a bit more.
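To make the behaviour concrete, here is a minimal sketch, assuming the author text arrives as a raw string shaped like the one described above. `clean_author` is a hypothetical helper written for illustration. It is not a function from arXiv.jl and not necessarily how the eventual fix works. It also assumes the name always sits on the first non-empty line, with any affiliation on the lines after it.

```julia
# Hypothetical helper (not part of arXiv.jl): keep only the first non-empty
# line of an author's text, assuming the name comes first and any
# affiliation follows on later lines.
function clean_author(raw::AbstractString)
    lines = [strip(l) for l in split(raw, '\n')]
    filter!(!isempty, lines)
    return isempty(lines) ? "" : String(first(lines))
end

# Raw text shaped like the example in this comment.
raw = "\n      S. A. Kivelson\n      UCLA\n    "

# A single strip only trims the ends; the inner newline and the
# affiliation are still there.
once = strip(raw)

# Taking the first non-empty line drops the affiliation.
name = clean_author(raw)
```

Since `strip` only removes leading and trailing characters, calling it a second time on `once` would leave the string unchanged. Splitting on the newline, as in the sketch, is one way to get at the name alone.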

There are also some other weird cases when a Chinese name is involved. I'm not sure whether they come from the same bug. If not, I'll open another issue.

lanceXwq commented 2 years ago

Solved by 9953652.

lanceXwq commented 2 years ago

In general, this issue is caused by not handling arXiv extension elements correctly, such as the affiliation element that the API can nest inside each author entry.
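For illustration, the arXiv API's Atom feed can place an `<arxiv:affiliation>` element inside `<author>`, next to `<name>`. Reading the whole text of `<author>` therefore picks up both. The sketch below assumes an entry fragment of that shape. The fragment is hand-written from the names in the first example, not a captured response, and `author_names` is a hypothetical helper. It uses a regex only to stay dependency-free, and a real implementation would more likely query the `<name>` child with an XML parser.

```julia
# Illustrative fragment, hand-written for this example (not a captured API response).
entry = """
<author>
  <name>David Prendergast</name>
  <arxiv:affiliation>Department of Physics</arxiv:affiliation>
</author>
<author>
  <name>M. Nolan</name>
  <arxiv:affiliation>NMRC, University College, Cork, Ireland</arxiv:affiliation>
</author>
"""

# Hypothetical helper (not part of arXiv.jl): collect only the <name> text of
# each author and ignore sibling elements such as <arxiv:affiliation>.
function author_names(entry::AbstractString)
    return [String(strip(m.captures[1])) for m in eachmatch(r"<name>(.*?)</name>"s, entry)]
end

names = author_names(entry)

# BibTeX separates multiple authors with " and ".
author_field = join(names, " and ")
```

With the affiliations left out, `author_field` holds only the names joined by " and ", which is the form the BibTeX `author` field expects.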