mdaeron / bibcites

CLI to insert number of citations into BibTeX entries, using OpenCitations

Error on first run #1

Closed japhir closed 2 years ago

japhir commented 2 years ago

Hi Mathieu,

I wanted to try the package out on my break on my entire bib database. I installed it via pip install (for the user) and ran it, but it didn't generate a result.

Hope it helps improve the package :).

Here's the trace:

> bibcites references.bib
Traceback (most recent call last):
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 107, in expand_string
    self.strings[name])
KeyError: 'aug'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/japhir/.local/bin/bibcites", line 8, in <module>
    sys.exit(cli())
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibcites/__init__.py", line 40, in cli
    db = bibtexparser.load(bibtex_file)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/__init__.py", line 69, in load
    return parser.parse_file(bibtex_file)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bparser.py", line 169, in parse_file
    return self.parse(file.read(), partial=partial)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bparser.py", line 147, in parse
    self._expr.parseFile(bibtex_file_obj)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibtexexpression.py", line 278, in parseFile
    return self.main_expression.parseFile(file_obj, parseAll=True)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 1859, in parse_file
    return self.parse_string(file_contents, parseAll)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 1097, in parse_string
    loc, tokens = self._parse(instring, 0)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 787, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 4695, in parseImpl
    return super().parseImpl(instring, loc, doActions)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 4603, in parseImpl
    loc, tokens = self_expr_parse(instring, loc, doActions)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 787, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 4003, in parseImpl
    return e._parse(
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 824, in _parseNoCache
    tokens = fn(instring, tokensStart, retTokens)
  File "/usr/lib/python3.10/site-packages/pyparsing/core.py", line 282, in wrapper
    ret = func(*args[limit:])
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bparser.py", line 187, in <lambda>
    lambda s, l, t: self._add_entry(
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bparser.py", line 277, in _add_entry
    d[self._clean_field_key(key)] = self._clean_val(fields[key])
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bparser.py", line 228, in _clean_val
    return as_text(val)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 270, in as_text
    return text_string_or_expression.get_value()
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 231, in get_value
    return ''.join([BibDataString.expand_string(s) for s in self.expr])
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 231, in <listcomp>
    return ''.join([BibDataString.expand_string(s) for s in self.expr])
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 197, in expand_string
    return string_or_bibdatastring.get_value()
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 178, in get_value
    return self._bibdatabase.expand_string(self.name)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibtexparser/bibdatabase.py", line 109, in expand_string
    raise(UndefinedString(name))
bibtexparser.bibdatabase.UndefinedString: 'aug'

mdaeron commented 2 years ago

I tried adding aug in the Month field of one of my bibitems, but could not reproduce your error. Could you please look for the offending entry (one containing the string 'aug', I assume) and copy it here?

japhir commented 2 years ago

Here's one:

@article{Meckler2014,
  title = {Long-Term Performance of the {{Kiel}} Carbonate Device with a New Correction Scheme for Clumped Isotope Measurements: {{Performance}} and Correction of {{Kiel}} Clumped Isotope Measurements},
  shorttitle = {Long-Term Performance of the {{Kiel}} Carbonate Device with a New Correction Scheme for Clumped Isotope Measurements},
  author = {Meckler, A. Nele and Ziegler, Martin and Mill{\'a}n, M. Isabel and Breitenbach, Sebastian F. M. and Bernasconi, Stefano M.},
  year = {2014},
  month = aug,
  journal = {Rapid Communications in Mass Spectrometry},
  volume = {28},
  number = {15},
  pages = {1705--1715},
  issn = {09514198},
  doi = {10.1002/rcm.6949},
  url = {http://doi.wiley.com/10.1002/rcm.6949},
  urldate = {2017-03-27},
  langid = {english},
  file = {/home/japhir/SurfDrive/bibliography/Meckler et al_2014_Long-term performance of the Kiel carbonate device with a new correction scheme.pdf}
}

If I remove the month = aug, line it does run correctly and adds a cites = {113}, line in this case.

japhir commented 2 years ago

Also, if I put aug in curly braces, it also works as expected. (month = {aug},)
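
The brace workaround above can be automated as a preprocessing step. What follows is a minimal sketch, not the fix that was later pushed to the dev branch: `brace_bare_months` is a hypothetical helper, and the regular expression assumes one field per line, as in the Better BibTeX export shown above. As far as I know, bibtexparser 1.x also offers a `common_strings` parser option that defines the standard month macros, which may be the cleaner route.

```python
import re

# Standard three-letter BibTeX month macros.
MONTH_MACROS = ("jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec")

# Matches e.g. `month = aug,` but not `month = {aug},` or `month = 8,`.
# Assumption: each field sits on its own line.
_BARE_MONTH = re.compile(
    r"^([ \t]*month[ \t]*=[ \t]*)(" + "|".join(MONTH_MACROS) + r")([ \t]*,?[ \t]*)$",
    re.IGNORECASE | re.MULTILINE,
)


def brace_bare_months(bibtex_text: str) -> str:
    """Wrap unbraced month macros in braces (hypothetical helper)."""
    return _BARE_MONTH.sub(r"\1{\2}\3", bibtex_text)


# Shortened version of the entry from this thread.
entry = """@article{Meckler2014,
  year = {2014},
  month = aug,
  journal = {Rapid Communications in Mass Spectrometry},
}"""

print(brace_bare_months(entry))
```

Already-braced and numeric month values are left untouched, so running the helper twice is harmless.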

japhir commented 2 years ago

This is the default output of better-bibtex export of Zotero, by the way.

mdaeron commented 2 years ago

It is related to this issue. I pushed a fix to the dev branch. Can you please test if that solves your issue?

japhir commented 2 years ago

I'd like to try it out but haven't figured out how to test development branches in Python. I've downloaded your __init__.py file and made it executable, but calling it from the command line just gives me a bunch of warnings, so I'm sure I'm doing something wrong here. If it works on your end with the example entry, it'll probably be fixed ;-).

Cheers,

Ilja

The messages I get when running the downloaded script:

```
./bibcites.py
./bibcites.py: 3: CLI to insert number of citations into BibTeX entries, using OpenCitations : not found
./bibcites.py: 5: __author__: not found
./bibcites.py: 6: __contact__: not found
./bibcites.py: 7: __copyright__: not found
./bibcites.py: 8: __license__: not found
./bibcites.py: 9: __date__: not found
./bibcites.py: 10: __version__: not found
^C
```
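
The "not found" messages look like a shell, rather than Python, interpreting the file line by line. That is what usually happens when a script without a `#!` line is executed directly. For testing a development branch, installing it with pip tends to be simpler than downloading a single file. The sketch below builds such a pip command. The repository URL and the `dev` branch name are assumptions based on the project name and the comment above, the helper names are hypothetical, and the install itself needs git on the PATH.

```python
import subprocess
import sys

# Assumed repository URL and branch name; adjust if the project lives elsewhere.
REPO_URL = "https://github.com/mdaeron/bibcites.git"
BRANCH = "dev"


def dev_install_command(repo_url: str = REPO_URL, branch: str = BRANCH) -> list:
    """Build a pip command that installs a package from a git branch."""
    # `git+<url>@<branch>` is pip's syntax for installing from version control.
    return [sys.executable, "-m", "pip", "install", "--user", "--upgrade",
            f"git+{repo_url}@{branch}"]


def install_dev_branch() -> None:
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(dev_install_command(), check=True)


# Only print the command here; call install_dev_branch() to actually run it.
print(" ".join(dev_install_command()))
```

After a successful install, the `bibcites` command should pick up the development version the same way it picked up the released one.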

mdaeron commented 2 years ago

OK, I'll publish a new release to PyPI (v1.1.1) and close the issue. Don't hesitate to reopen it if the problem is not solved. Thanks.

japhir commented 2 years ago

Ok that definitely fixed that issue, but another one popped up immediately. I guess that's what happens when you apply it to a large bib file with 407 entries.

This one looks like it's just requesting the lookup ~too frequently~ of a long URL and will probably work on smaller bib files. Could work around it with a timer/timeout thing perhaps?

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/japhir/.local/bin/bibcites", line 8, in <module>
    sys.exit(cli())
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/japhir/.local/lib/python3.10/site-packages/bibcites/__init__.py", line 65, in cli
    metadata = opencitingpy.client.Client().get_metadata([doi for doi in dbe])
  File "/home/japhir/.local/lib/python3.10/site-packages/opencitingpy/client.py", line 119, in get_metadata
    data = self.__get_data(operation, dois)
  File "/home/japhir/.local/lib/python3.10/site-packages/opencitingpy/client.py", line 30, in __get_data
    data = self.__make_request(uri)
  File "/home/japhir/.local/lib/python3.10/site-packages/opencitingpy/client.py", line 25, in __make_request
    data = request.json()
  File "/usr/lib/python3.10/site-packages/requests/models.py", line 917, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: [Errno Expecting value] <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>414 Request-URI Too Long</title>
</head><body>
<h1>Request-URI Too Long</h1>
<p>The requested URL's length exceeds the capacity
limit for this server.<br />
</p>
<hr>
<address>Apache/2.4.29 (Ubuntu) Server at w3id.org Port 443</address>
</body></html>
: 0

mdaeron commented 2 years ago

Pushed v1.2.0 to solve this issue (new -n option). Seems to work on my end.
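
The 414 response above comes from packing every DOI into a single request URL. Judging from the verbose output later in this thread (29 DOIs with -n10 giving 3 queries), the -n option appears to cap the number of DOIs per query. A minimal sketch of that batching idea, not the actual bibcites code: `batches` is a hypothetical helper and the DOIs are made-up placeholders.

```python
from typing import Iterator, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder DOIs, not real records.
dois = [f"10.1234/example.{i}" for i in range(29)]

for k, batch in enumerate(batches(dois, 10), start=1):
    # In a real client, each batch would be sent as its own request here.
    print(f"Query {k}: {len(batch)} DOIs")
```

Smaller batches keep each URL short at the cost of more round trips, so the batch size is a trade-off between URL length and total run time.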

japhir commented 2 years ago

I don't get any errors anymore, but it just hangs until I Control-C out of it. The same happens if I set it to e.g. -n 10.

mdaeron commented 2 years ago

Using bibcites foo.bib -v -n 10, you can observe the time needed for each query to get a response. On my side, with -n 10, each query takes 13 seconds, so 407 entries should take ~9 minutes. Could you run it with the -v option to check whether this is what's happening? It should print something like the output below.

% bibcites mdaeron.bib -v -n10

Read 40 entries from mdaeron.bib.
Found 29 entries with a DOI.
Querying OpenCitations (1/3)...
Querying OpenCitations (2/3)...
Querying OpenCitations (3/3)...
Read 29 records from OpenCitations.
Found 58 citations for 10.1130/g21352.1.
Found 81 citations for 10.1016/j.epsl.2004.07.014.
...
Found 13 citations for 10.1029/2020GC009588.
Wrote 40 entries to mdaeron_withcites.bib.
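
The ~9 minute figure quoted above follows from simple arithmetic, sketched here with a hypothetical helper. It assumes every one of the 407 entries has a DOI (so it is an upper bound) and that each query takes about 13 seconds, as reported for -n 10.

```python
import math


def estimated_runtime_seconds(n_dois: int, batch_size: int,
                              seconds_per_query: float) -> float:
    """Estimate total query time: number of batches times time per query."""
    n_queries = math.ceil(n_dois / batch_size)
    return n_queries * seconds_per_query


# 407 DOIs in batches of 10 means 41 queries at roughly 13 s each.
total = estimated_runtime_seconds(407, 10, 13)
print(f"{total:.0f} s, about {total / 60:.1f} min")  # prints "533 s, about 8.9 min"
```

A run that stays silent for several minutes without -v is therefore consistent with normal operation rather than a hang.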

japhir commented 2 years ago

Ah ok! Forgot to add the -v option yesterday. I did let it run for quite a while and checked if the process was still active (with htop) but it didn't finish. I just tried it again with the verbose option, and it seems to have worked now! :)