inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

There's some crazy escaping when running the migrator task for the arXiv categories #2941

david-caro closed 6 years ago

david-caro commented 6 years ago

When running the migrator task, it seems that at least sometimes the categories strings get escaped, cast to string (with repr), and escaped again a few more times, leading to something like:

{'arxiv_eprints': [{'categories': ['\'\\\'\\\\\\\'\\\\\\\\\\\\\\\'u"u\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'nucl-th\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'"\\\\\\\\\\\\\\\'\\\\\\\'\\\'\'', '\'\\\'\\\\\\\'\\\\\\\\\\\\\\\'u"u\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'hep-lat\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'"\\\\\\\\\\\\\\\'\\\\\\\'\\\'\'', '\'\\\'\\\\\\\'\\\\\\\\\\\\\\\'u"u\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'hep-ph\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'"\\\\\\\\\\\\\\\'\\\\\\\'\\\'\''], 'value': '1711.02797'}], 'document_type': ['article']}
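The pattern above looks like what you get when a string is passed through repr several times in a row. A minimal sketch of that effect, assuming plain Python 3 (so there is no u prefix) and a hypothetical helper `over_escape` that is not part of the migrator code:

```python
def over_escape(value, rounds):
    """Apply repr() `rounds` times, mimicking a value that gets
    stringified again on every pass (hypothetical helper)."""
    for _ in range(rounds):
        value = repr(value)
    return value


once = over_escape("nucl-th", 1)    # adds one layer of quotes
twice = over_escape("nucl-th", 2)   # switches to the other quote style
thrice = over_escape("nucl-th", 3)  # first backslashes appear
four = over_escape("nucl-th", 4)    # backslashes start to multiply

for label, text in [("1", once), ("2", twice), ("3", thrice), ("4", four)]:
    print(label, len(text), text)
```

Each extra round has to escape the quotes and backslashes added by the previous one, which is why the number of backslashes grows so quickly in the record above.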

The recid being migrated was 1634982; you can see more in the Sentry issue: https://sentry.inspirehep.net/inspire-sentry/prod/issues/62/

Expected Behavior

The strings should not be cast to strings again (see the u prefixes and quotes embedded in them), and even if they were, there is some fancy extra escaping going on too.
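One way to spot affected records might be to flag any category that contains quote or backslash characters, since a clean category such as nucl-th has none. This is only a sketch: `find_over_escaped` is a hypothetical helper, and the record layout is assumed from the example in this issue rather than taken from the actual schema.

```python
# Characters that should never appear in a clean arXiv category (assumption).
SUSPICIOUS = ("'", '"', "\\")


def find_over_escaped(record):
    """Return (eprint value, category) pairs whose category looks
    like it went through repr/escaping (hypothetical helper)."""
    bad = []
    for eprint in record.get("arxiv_eprints", []):
        for category in eprint.get("categories", []):
            if any(ch in category for ch in SUSPICIOUS):
                bad.append((eprint.get("value"), category))
    return bad


clean = {"arxiv_eprints": [{"categories": ["nucl-th", "hep-lat"],
                            "value": "1711.02797"}]}
broken = {"arxiv_eprints": [{"categories": ["u'nucl-th'", "hep-lat"],
                             "value": "1711.02797"}]}

print(find_over_escaped(clean))
print(find_over_escaped(broken))
```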

Steps to Reproduce (for bugs)

  1. Try to migrate that record (inspirehep migrate --one 1634982 should work).

Context

Bug ninjing on labs

iulianav commented 6 years ago

I think the correct command might be inspirehep migrator one --recid 1634982

jacquerie commented 6 years ago

Sorry, can't reproduce! Neither locally (with the command @iulianav posted above), nor in production: https://labs.inspirehep.net/api/literature/1634982/db.

iulianav commented 6 years ago

Me neither! I tried with @ammirate many times!