scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
https://scholarly.readthedocs.io/
The Unlicense

Unclosed file causes ResourceWarning while running unit tests #526

Open dlebedinsky opened 7 months ago

dlebedinsky commented 7 months ago

Describe the bug I fixed the test_bibtex unit test in my last PR, #525; my only change was to the name of the input file. When I first ran the unit test it passed, but when I reran it today the test produced a ResourceWarning and an error, apparently because a file is left open. A similar ResourceWarning followed test_citedby_1k_citations, but that test did not end in an error.

To Reproduce Run python -m unittest -v test_module.py

Expected behavior I expected to see the following:
test_bibtex (test_module.TestScholarlyWithProxy.test_bibtex)
Test that we get the BiBTeX entry correctly ... ok
test_citedby_1k_citations (test_module.TestScholarlyWithProxy.test_citedby_1k_citations)
Test that scholarly can fetch 1000+ citations from an author ... ok

or at the very least, just: "Test that we get the BiBTeX entry correctly ... FAIL"
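For context on why the run reports ERROR rather than FAIL: unittest reports FAIL only when an assertion fails, and ERROR when the test raises any other exception. The sketch below is a standalone illustration, not code from scholarly; the class name, the test names, and the RuntimeError (a stand-in for a network exception) are all hypothetical.

```python
import io
import unittest


def run_demo():
    """Run two tiny tests and return (result, verbose report text)."""

    # Defined locally so test collectors do not pick this class up on import.
    class OutcomeDemo(unittest.TestCase):
        def test_fail(self):
            # A failed assertion is reported as FAIL.
            self.assertEqual(1, 2)

        def test_error(self):
            # Any non-assertion exception is reported as ERROR.
            # Hypothetical stand-in for a fetch that gives up.
            raise RuntimeError("cannot fetch")

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OutcomeDemo)
    stream = io.StringIO()
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = run_demo()
    print(report)
    print(len(result.failures), "failure(s),", len(result.errors), "error(s)")
```

Under this reading, a test that stops on an exception raised while fetching would show up as ERROR even if its assertions were never reached.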

Stack trace
test_bibtex (test_module.TestScholarlyWithProxy.test_bibtex)
Test that we get the BiBTeX entry correctly ... /usr/lib64/python3.11/threading.py:101: ResourceWarning: unclosed file <_io.FileIO name='/dev/null' mode='wb' closefd=True>
  return _CRLock(*args, **kwargs)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib/python3.11/site-packages/urllib3/connection.py:242: ResourceWarning: unclosed file <_io.FileIO name='/dev/null' mode='wb' closefd=True>
  if "user-agent" not in (six.ensure_str(k.lower()) for k in headers):
ResourceWarning: Enable tracemalloc to get the object allocation traceback
ERROR

test_citedby_1k_citations (test_module.TestScholarlyWithProxy.test_citedby_1k_citations)
Test that scholarly can fetch 1000+ citations from an author ... :947: ResourceWarning: unclosed file <_io.FileIO name='/dev/null' mode='wb' closefd=True>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
ok

======================================================================
ERROR: test_bibtex (test_module.TestScholarlyWithProxy.test_bibtex)
Test that we get the BiBTeX entry correctly

Traceback (most recent call last):
  File "/home/daniel/repos/scholarly-dev/test_module.py", line 696, in test_bibtex
    pub = scholarly.search_single_pub("A distribution-based clustering algorithm for mining in large "
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/daniel/repos/scholarly-dev/scholarly/_scholarly.py", line 182, in search_single_pub
    return self.__nav.search_publication(url, filled)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/daniel/repos/scholarly-dev/scholarly/_navigator.py", line 281, in search_publication
    soup = self._get_soup(url)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/daniel/repos/scholarly-dev/scholarly/_navigator.py", line 239, in _get_soup
    html = self._get_page('https://scholar.google.com{0}'.format(url))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/daniel/repos/scholarly-dev/scholarly/_navigator.py", line 190, in _get_page
    raise MaxTriesExceededException("Cannot Fetch from Google Scholar.")
scholarly._proxy_generator.MaxTriesExceededException: Cannot Fetch from Google Scholar.
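The warning text suggests enabling tracemalloc to find where the unclosed file was opened. A minimal sketch of that idea follows, assuming the leak comes from an unbuffered binary handle on the null device (opening with "wb" and buffering=0 returns a raw _io.FileIO, which matches the repr in the warning). The helper names are hypothetical, and where the handle is actually opened in this test run is not known from the output above.

```python
import gc
import os
import tracemalloc
import warnings


def leak_devnull_handle():
    # Hypothetical reproduction: an unbuffered binary handle that is never closed.
    handle = open(os.devnull, "wb", buffering=0)
    handle.write(b"discarded")
    # Deliberately no handle.close(): the handle is dropped when the function returns.


def capture_resource_warnings(func):
    """Call func and return a list of (message, allocation traceback lines)."""
    tracemalloc.start(10)  # keep up to 10 frames per allocation
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always", ResourceWarning)
            func()
            gc.collect()  # also finalize objects caught in reference cycles
        found = []
        for entry in caught:
            if not issubclass(entry.category, ResourceWarning):
                continue
            traceback_lines = []
            if entry.source is not None:
                allocation = tracemalloc.get_object_traceback(entry.source)
                if allocation is not None:
                    traceback_lines = allocation.format()
            found.append((str(entry.message), traceback_lines))
        return found
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    for message, traceback_lines in capture_resource_warnings(leak_devnull_handle):
        print(message)
        print("\n".join(traceback_lines))
```

For the real test run, a similar effect should be available without code changes by starting the interpreter with `-X tracemalloc=10`, so that each ResourceWarning is followed by the allocation traceback of the leaked object.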

Additional context I noticed that test_module.py automatically generated a CSV file in its own directory. It is attached; I am not sure whether this is intended behavior or whether it is relevant to this error. bsvrngkuqu.csv
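If the generated CSV turns out to be unintended, one common way for a test to avoid leaving files next to itself is to write into a temporary directory that is removed on exit. The following is a minimal sketch of that pattern only; the function name and the file name output.csv are hypothetical, and it makes no claim about how test_module.py actually produces its CSV.

```python
import csv
import os
import tempfile


def write_rows_to_scratch_csv(rows):
    """Write rows to a CSV in a temporary directory, read them back, clean up."""
    with tempfile.TemporaryDirectory() as scratch_dir:
        csv_path = os.path.join(scratch_dir, "output.csv")  # hypothetical name
        with open(csv_path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        with open(csv_path, newline="") as handle:
            read_back = list(csv.reader(handle))
    # The directory and the file inside it are deleted once the block exits.
    return read_back, os.path.exists(csv_path)


if __name__ == "__main__":
    rows, still_exists = write_rows_to_scratch_csv([["name", "value"], ["a", "1"]])
    print(rows, still_exists)
```

Both file handles are opened with a with statement, so they are closed even if an assertion in between fails.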