pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Classification group - invalid literal #258

Closed raffaem closed 2 years ago

raffaem commented 2 years ago
from pybliometrics.scopus import AuthorRetrieval
scopus_id = "12792377500"
au = AuthorRetrieval(scopus_id, refresh=True)
print(au.classificationgroup)
$ python3 main.py 
Traceback (most recent call last):
  File "/mnt/dataint/data/progetti_miei/python/pybliometrics_bug/main.py", line 9, in <module>
    print(au.classificationgroup)
  File "/home/raffaele/.local/lib/python3.10/site-packages/pybliometrics/scopus/author_retrieval.py", line 66, in classificationgroup
    out = [(int(item['$']), int(item['@frequency'])) for item in
  File "/home/raffaele/.local/lib/python3.10/site-packages/pybliometrics/scopus/author_retrieval.py", line 66, in <listcomp>
    out = [(int(item['$']), int(item['@frequency'])) for item in
ValueError: invalid literal for int() with base 10: '2800;
raffaem commented 2 years ago

Both chained_get(self._profile, path, []) and listify(chained_get(self._profile, path, [])) are lists of dicts, so I wonder what listify() does here.

What do you think is the best way to fix this?

Deleting the ; seems impractical and too slow.

We could have a try ... except that does something more complicated in case of failure
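
A minimal sketch of that try ... except idea, written as a standalone function rather than the actual property in author_retrieval.py. The name parse_classifications is hypothetical, and the fallback assumes that a malformed value such as '2800;' holds one or more codes separated by ';' that all share the entry's frequency. Neither point is confirmed anywhere in this thread.

```python
def parse_classifications(items):
    """Hypothetical helper: turn raw entries into (code, frequency) tuples."""
    out = []
    for item in items:
        freq = int(item['@frequency'])
        try:
            # Fast path: the value is a plain integer string
            out.append((int(item['$']), freq))
        except ValueError:
            # Assumption: malformed values are ';'-separated codes
            for code in item['$'].split(';'):
                if code.strip():
                    out.append((int(code), freq))
    return out


# Made-up sample data, shaped like the dicts in the traceback above
raw = [{'$': '2800;', '@frequency': '3'}, {'$': '1312', '@frequency': '7'}]
print(parse_classifications(raw))
```

The string handling only runs for entries that fail the plain int() conversion, so well-formed profiles pay no extra cost.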

Michael-E-Rose commented 2 years ago

listify() ensures we always receive a list here. If chained_get(self._profile, path, []) returns just a single dictionary, as is the case for authors who work in just one subject area, listify() safely turns this into a list. I've also seen cases where two dictionaries were just bound together.
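
To illustrate, a helper with this behavior could look roughly like the sketch below. The name listify_sketch is made up for this example, and the real listify() in pybliometrics may be implemented differently.

```python
def listify_sketch(element):
    """Return element unchanged if it is a list, otherwise wrap it in one."""
    if isinstance(element, list):
        return element
    return [element]


# Made-up sample: one subject area arrives as a bare dict, several as a list
single = {'$': '2800', '@frequency': '3'}
print(listify_sketch(single))    # bare dict gets wrapped in a list
print(listify_sketch([single]))  # existing list is passed through
```

Either way, the list comprehension that follows can iterate over the result without checking its type first.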

You caught a really weird mistake in the Scopus database. It's of the sort "how can we make the life of users really complicated?"

Your solution is good but too complex. I'll respond there.

PS: It's not necessary to first create an issue and then immediately a PR. You might as well create the PR and explain the reason for it right there. That might save you time, and it's less complex.

raffaem commented 2 years ago

how do I post the code sample and the error message in the PR?

in a comment?

Michael-E-Rose commented 2 years ago

Yes. It behaves like an issue