Scholarly did not seem to have a way to limit the rate of page requests. Tools such as Publish or Perish limit this rate to prevent Google Scholar (GS) lockout, so this seemed a reasonable option to add. I added the ability to set the delay either as a fixed number of seconds or as a custom function that returns the number of seconds to wait. The two can be combined, for example to get a constant delay plus a little random padding from a function.
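To illustrate the idea, here is a minimal sketch of how a fixed delay and a delay function could be combined. It is not the code in this PR: the names `compute_delay`, `wait_before_request`, `base_seconds`, and `extra` are hypothetical and chosen only for this example.

```python
import random
import time
from typing import Callable, Optional


def compute_delay(base_seconds: float = 0.0,
                  extra: Optional[Callable[[], float]] = None) -> float:
    """Return the number of seconds to wait before the next request.

    base_seconds: constant part of the delay (hypothetical parameter name).
    extra: optional zero-argument function returning additional seconds,
           e.g. random padding (hypothetical parameter name).
    """
    delay = float(base_seconds)
    if extra is not None:
        delay += float(extra())
    # Never return a negative delay, even if the function misbehaves.
    return max(delay, 0.0)


def wait_before_request(base_seconds: float = 0.0,
                        extra: Optional[Callable[[], float]] = None,
                        sleep: Callable[[float], None] = time.sleep) -> float:
    """Sleep for the computed delay and return it (sleep is injectable for tests)."""
    delay = compute_delay(base_seconds, extra)
    sleep(delay)
    return delay


if __name__ == "__main__":
    # A constant 2 seconds plus up to 1 second of random padding.
    waited = wait_before_request(2.0, lambda: random.uniform(0.0, 1.0))
    print(f"waited {waited:.2f} s")
```

Passing only `base_seconds` gives a constant delay, passing only `extra` gives a fully custom one, and passing both gives the constant-plus-padding behaviour described above.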
In addition, bibtex generation seemed to mistakenly produce a field called pub_year, which needs to be year to be valid bibtex (e.g., Zotero does not pick up the year from a pub_year field, but it does from year). There was no obvious benefit to naming this field pub_year, and there was evidence that a previous contributor meant for it to be year, so I changed it to year in all cases.
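As a rough illustration of the field rename, the sketch below maps a pub_year key to year when rendering an entry. It is not scholarly's actual bibtex code: `to_bibtex_fields` and `format_bibtex` are hypothetical helpers, and the sample entry is made up for the example.

```python
def to_bibtex_fields(bib: dict) -> dict:
    """Return a copy of bib with pub_year renamed to the standard year field."""
    fields = dict(bib)
    if "pub_year" in fields:
        year = fields.pop("pub_year")
        # Keep an existing year value if one is already present.
        fields.setdefault("year", year)
    return fields


def format_bibtex(entry_type: str, cite_key: str, bib: dict) -> str:
    """Render a simple BibTeX entry (hypothetical helper, no escaping)."""
    fields = to_bibtex_fields(bib)
    body = ",\n".join(f"  {key} = {{{value}}}" for key, value in fields.items())
    return f"@{entry_type}{{{cite_key},\n{body}\n}}"


if __name__ == "__main__":
    # Made-up entry, for illustration only.
    print(format_bibtex("article", "doe2020", {"title": "An Example", "pub_year": 2020}))
```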
Checklist
[x] Check that the base branch is set to develop and not main.
[ ] Ensure that the documentation will be consistent with the code upon merging. (Should we document the delay feature?)
[ ] Add a line or a few lines that check the new features added. (I'm not sure how to do this.)
[x] Ensure that unit tests pass. (python -m unittest test_module.TestScraperAPI passes)
If you don't have a premium proxy, some of the tests will be skipped.
The tests that are run should pass without raising MaxTriesExceededException or other exceptions.
Fixes: https://github.com/scholarly-python-package/scholarly/issues/431