inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

Authorlist failure #2964

Closed kaplun closed 6 years ago

kaplun commented 6 years ago

Expected Behavior

Authors and affiliations are parsed and exported correctly.

Current Behavior

An error is presented to the user.

Steps to Reproduce (for bugs)

  1. Go to: https://qa.inspirehep.net/tools/authorlist
  2. Enter:
    
    A. J. Hawken1, B. R. Granett1, A. Iovino1, L. Guzzo1,2, J. A. Peacock12, S. de la Torre4, B. Garilli3, M. Bolzonella8, M. Scodeggio3, U. Abbas5, C. Adami4, D. Bottini3, A. Cappi8,17, O. Cucciati15,8, I. Davidzon4,8, A. Fritz3, P. Franzetti3, J. Krywult13, V. Le Brun4, O. Le Fèvre4, D. Maccagni3, K. Małek14,19, F. Marulli15,16,8, M. Polletta3, A. Pollo18,19, L. A. M. Tasca4, R. Tojeiro10, D. Vergani20,8, A. Zanichelli21, S. Arnouts6, J. Bel7, E. Branchini9,22,23, G. De Lucia11, O. Ilbert4, L. Moscardini15,16,8 and W. J. Percival10

1 INAF–Osservatorio Astronomico di Brera, via Brera 28, 20122 Milano, via E. Bianchi 46, 23807 Merate, Italy 2 Università degli Studi di Milano, via G. Celoria 16, 20130 Milano, Italy 3 INAF – Istituto di Astrofisica Spaziale e Fisica Cosmica Milano, via Bassini 15, 20133 Milano, Italy 4 Aix-Marseille Université, CNRS, LAM (Laboratoire d’Astrophysique de Marseille) UMR 7326, 13388 Marseille, France 5 INAF–Osservatorio Astronomico di Torino, 10025 Pino Torinese, Italy 6 Canada-France-Hawaii Telescope, 65–1238 Mamalahoa Highway, Kamuela, HI 96743, USA 7 Aix-Marseille Université, CNRS, CPT, UMR 7332, 13288 Marseille, France 8 INAF–Osservatorio Astronomico di Bologna, via Ranzani 1, 40127 Bologna, Italy 9 Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy 10 Institute of Cosmology and Gravitation, Dennis Sciama Building, University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX, UK 11 INAF–Osservatorio Astronomico di Trieste, via G. B. Tiepolo 11, 34143 Trieste, Italy 12 Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK 13 Institute of Physics, Jan Kochanowski University, ul. Swietokrzyska 15, 25-406 Kielce, Poland 14 Department of Particle and Astrophysical Science, Nagoya University, Furo-cho, Chikusa-ku, 464-8602 Nagoya, Japan 15 Dipartimento di Fisica e Astronomia, Alma Mater Studiorum Università di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy 16 INFN, Sezione di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy 17 Laboratoire Lagrange, UMR 7293, Université de Nice Sophia Antipolis, CNRS, Observatoire de la Côte d’Azur, 06300 Nice, France 18 Astronomical Observatory of the Jagiellonian University, Orla 171, 30-001 Cracow, Poland 19 National Centre for Nuclear Research, ul. 
Hoza 69, 00-681 Warszawa, Poland 20 INAF–Istituto di Astrofisica Spaziale e Fisica Cosmica Bologna, via Gobetti 101, 40129 Bologna, Italy 21 INAF–Istituto di Radioastronomia, via Gobetti 101, 40129 Bologna, Italy 22 INFN, Sezione di Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy 23 INAF–Osservatorio Astronomico di Roma, via Frascati 33, 00040 Monte Porzio Catone (RM), Italy


The system reports:
```python
KeyError('There might be multiple affiliations per line or affiliation IDs might not be separated with commas or the affiliation is missing. Problematic author and affiliations',
         u'L. Guzzo1,2',
         [u'1', u'2'],
         {u'1': u'INAF-Osservatorio Astronomico di Brera, via Brera 28, 20122 Milano, via E. Bianchi 46, 23807 Merate, Italy 2 Universit\xe0 degli Studi di Milano, via G. Celoria 16, 20130 Milano, Italy 3 INAF - Istituto di Astrofisica Spaziale e Fisica Cosmica Milano, via Bassini 15, 20133 Milano, Italy 4 Aix-Marseille Universit\xe9, CNRS, LAM (Laboratoire d\u2019Astrophysique de Marseille) UMR 7326, 13388 Marseille, France 5 INAF-Osservatorio Astronomico di Torino, 10025 Pino Torinese, Italy 6 Canada-France-Hawaii Telescope, 65-1238 Mamalahoa Highway, Kamuela, HI 96743, USA 7 Aix-Marseille Universit\xe9, CNRS, CPT, UMR 7332, 13288 Marseille, France 8 INAF-Osservatorio Astronomico di Bologna, via Ranzani 1, 40127 Bologna, Italy 9 Dipartimento di Matematica e Fisica, Universit\xe0 degli Studi Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy 10 Institute of Cosmology and Gravitation, Dennis Sciama Building, University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX, UK 11 INAF-Osservatorio Astronomico di Trieste, via G. B. Tiepolo 11, 34143 Trieste, Italy 12 Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK 13 Institute of Physics, Jan Kochanowski University, ul. Swietokrzyska 15, 25-406 Kielce, Poland 14 Department of Particle and Astrophysical Science, Nagoya University, Furo-cho, Chikusa-ku, 464-8602 Nagoya, Japan 15 Dipartimento di Fisica e Astronomia, Alma Mater Studiorum Universit\xe0 di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy 16 INFN, Sezione di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy 17 Laboratoire Lagrange, UMR 7293, Universit\xe9 de Nice Sophia Antipolis, CNRS, Observatoire de la C\xf4te d\u2019Azur, 06300 Nice, France',
          u'18': u'Astronomical Observatory of the Jagiellonian University, Orla 171, 30-001 Cracow, Poland 19 National Centre for Nuclear Research, ul. Hoza 69, 00-681 Warszawa, Poland 20 INAF-Istituto di Astrofisica Spaziale e Fisica Cosmica Bologna, via Gobetti 101, 40129 Bologna, Italy 21 INAF-Istituto di Radioastronomia, via Gobetti 101, 40129 Bologna, Italy 22 INFN, Sezione di Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy 23 INAF-Osservatorio Astronomico di Roma, via Frascati 33, 00040 Monte Porzio Catone (RM), Italy'})
```

(the error was retrieved via IPython)

@mathieugrives

michamos commented 6 years ago

@ksachs @fschwenn you were volunteering for maintaining the authorlist tool, IIRC?

michamos commented 6 years ago

Smaller test case that still exhibits the bug:

A. J. Hawken1, B. R. Granett1, A. Iovino1, L. Guzzo1,2

1 INAF–Osservatorio Astronomico di Brera, via Brera 28, 20122 Milano, via E. Bianchi 46, 23807 Merate, Italy 
2 Università degli Studi di Milano, via G. Celoria 16, 20130 Milano, Italy

michamos commented 6 years ago

Found it, it's failing on trailing whitespace in the affiliations.

BTW, it's also failing on trailing whitespace in the authors, but that's a different error:

J. Smith1 

1 University of somewhere

fails with ('Could not find affiliations',).
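A toy model may help show why a swallowed line break would surface as the KeyError in the original report. The sketch below is not the authorlist tool's code: `parse_affiliations` and `lookup` are hypothetical helpers, and the affiliation strings are shortened from the test case above. It assumes only that affiliations are keyed by the number at the start of each line.

```python
import re


def parse_affiliations(lines):
    """Toy parser (hypothetical): key each line by its leading affiliation ID."""
    affiliations = {}
    for line in lines:
        match = re.match(r"(\d+)\s+(.*)", line)
        if match:
            affiliations[match.group(1)] = match.group(2)
    return affiliations


def lookup(author_ids, affiliations):
    """Resolve an author's affiliation IDs; raises KeyError if one is missing."""
    return [affiliations[aff_id] for aff_id in author_ids]


# Two affiliations on separate lines, as the user intended.
separate = [
    "1 INAF-Osservatorio Astronomico di Brera, Italy",
    "2 Università degli Studi di Milano, Italy",
]

# The same text after the line break between them has been lost (assumed effect
# of the trailing space): affiliation 2 is now buried inside affiliation 1.
merged = [" ".join(separate)]

print(sorted(parse_affiliations(separate)))
print(sorted(parse_affiliations(merged)))

try:
    lookup(["1", "2"], parse_affiliations(merged))
except KeyError as error:
    print("KeyError for affiliation ID", error)
```

With the merged input only the ID `1` is registered, so looking up `L. Guzzo1,2` fails on ID `2`, which matches the shape of the dictionary shown in the report.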

ksachs commented 6 years ago

First I have to see how to install a test version without too much overhead. But yes, in principle it's on my desk.

ksachs commented 6 years ago

`wash_lines` deletes the line break if there is a trailing space, so the trailing whitespace has to be cleaned up before the text reaches it.

I don't know whether the pull-request / commit message is done properly. You need to update 1 line of code.
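As a rough illustration of the cleanup described above, one option is to strip trailing whitespace from every line before any further processing. This is a minimal sketch under assumptions, not the actual one-line change: `strip_trailing_whitespace` is a hypothetical helper, and where it would be called relative to `wash_lines` is not shown in this thread.

```python
def strip_trailing_whitespace(text):
    """Remove trailing whitespace from each line while keeping the line breaks.

    Hypothetical pre-processing step; the real tool may clean its input differently.
    """
    return "\n".join(line.rstrip() for line in text.splitlines())


# Shortened version of the reduced test case; note the space after "Italy"
# at the end of the first affiliation line.
raw = (
    "A. J. Hawken1, L. Guzzo1,2\n"
    "\n"
    "1 INAF-Osservatorio Astronomico di Brera, Italy \n"
    "2 Università degli Studi di Milano, Italy"
)

cleaned = strip_trailing_whitespace(raw)
print(repr(cleaned))
```

The same helper also covers the second case reported above, where the trailing space follows the author name rather than an affiliation.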

jacquerie commented 6 years ago

> I don't know whether the pull-request / commit message is done properly.

Feel free to skip that etiquette if you're only planning to commit once in a while; we can fix those for you. The most important thing I did was to add the tests that @michamos provided above, so that we're sure that your PR fixes this issue.