scivision / linkchecker-markdown

Python asyncio + aiohttp Markdown *.md URL link checker: 10,000 files/second
MIT License

Unclosed socket warning #10

Open diegorondini opened 3 years ago

diegorondini commented 3 years ago

Sometimes when running linkchecker against a folder with markdown files I get a warning at the end of the execution:

sys:1: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('<ip-addr-of-local-pc>', 60350), raddr=('<ip-addr-of-remote-url>', 443)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib64/python3.8/asyncio/selector_events.py:696: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback

This happens with linkcheckmd 1.3.0 on Fedora 32.
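
The hint in the log, "Enable tracemalloc to get the object allocation traceback", can be followed to find where a leaked socket was created. Below is a minimal stdlib-only sketch, assuming CPython. It deliberately leaks a throwaway socket instead of running linkcheckmd, and leak_socket is a hypothetical helper name, not part of the package:

```python
import gc
import socket
import tracemalloc
import warnings


def leak_socket() -> list[str]:
    """Drop a socket without close() and report the resulting ResourceWarning(s)."""
    tracemalloc.start(5)  # remember up to 5 frames per allocation
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always", ResourceWarning)
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            del sock      # drop the only reference without calling close()
            gc.collect()  # force finalization on interpreters without refcounting

        report = []
        for w in caught:
            if not issubclass(w.category, ResourceWarning):
                continue
            lines = [str(w.message)]
            # w.source is the leaked object; tracemalloc may know where it was allocated
            tb = None
            if w.source is not None:
                tb = tracemalloc.get_object_traceback(w.source)
            if tb is not None:
                lines.extend(tb.format())
            report.append("\n".join(lines))
        return report
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    for entry in leak_socket():
        print(entry)
```

Setting the environment variable PYTHONTRACEMALLOC=5 before running the reproduction should have a similar effect for the whole run, so the warnings would carry the allocation traceback without any code changes.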

scivision commented 3 years ago

Which version of Python are you using (e.g. 3.7.5)? I can try with the same version. For some Python versions this may be a bug in Python itself.

scivision commented 3 years ago

Oops, this is noted in the aiohttp manual. I added a sleep just before the loop closes (not inside the loop) in bd50af6147937c9980651a95c8d8075c4eb0e546
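
For context, the pattern in the aiohttp client documentation ("Graceful Shutdown") is to give the event loop a brief moment after the session is closed, so that the underlying transports (SSL ones in particular) can finish closing before the loop itself goes away. Below is a minimal sketch of that idea, not the code from the commit above. close_gracefully and its grace parameter are hypothetical names, and session is assumed to be an aiohttp.ClientSession or any other object with an awaitable close():

```python
import asyncio


async def close_gracefully(session, grace: float = 0.25) -> None:
    """Close a client session, then yield to the event loop for a short grace period.

    `session` is assumed to expose an awaitable close(), as aiohttp.ClientSession does.
    The sleep lets pending transport-close callbacks run before the loop is closed.
    """
    await session.close()
    await asyncio.sleep(grace)
```

With aiohttp this would be awaited at the end of the top-level coroutine, just before it returns to asyncio.run(). The 0.25 second default mirrors the value the aiohttp docs suggest for SSL connections; it is a grace period rather than a guarantee that every transport has closed.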

diegorondini commented 3 years ago

The Python version was 3.8; it was hidden in the log. By the way, you can probably revert the change once you move to aiohttp 4.0.0: https://github.com/aio-libs/aiohttp/issues/1925#issuecomment-715977247

Thank you for the quick fix, by the way!

scivision commented 3 years ago

Yes, it looks like 4.0 could be a while; thanks for your research. Asyncio itself is undergoing enhancements in future Python releases that may bring other changes to my asyncio-using packages as well.

diegorondini commented 3 years ago

Hi Michael,

This is probably low priority, as the issue is in the underlying library or libraries, but I can still reproduce it on 1.3.1 with:

$ git clone https://gitlab.cern.ch/sft/lcgdocs.git
$ cd lcgdocs/
$ linkcheckMarkdown -r docs/

After several broken links you will get something along the lines of:

...
15.6 seconds to check links
sys:1: ResourceWarning: unclosed <socket.socket fd=15, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('<removed-ip-addr>', 38326), raddr=('188.184.9.234', 443)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib64/python3.8/asyncio/selector_events.py:696: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=15>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
sys:1: ResourceWarning: unclosed <socket.socket fd=13, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('<removed-ip-addr>', 57598), raddr=('188.184.20.224', 443)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib64/python3.8/asyncio/selector_events.py:696: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=13>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib64/python3.8/asyncio/selector_events.py:696: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=19>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/usr/lib64/python3.8/asyncio/selector_events.py:696: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback

If you're not getting these warnings on your system, you can probably reproduce them using a Fedora 32 container.

Anyway, thanks a lot for your effort!

scivision commented 3 years ago

I wonder whether cloning this repo and then increasing the asyncio.sleep() to 1 second or so would help: https://github.com/scivision/linkchecker-markdown/blob/259c8c8237a3d16cb30b23ab800aee007e73ab64/src/linkcheckmd/coro.py#L45

git clone https://github.com/scivision/linkchecker-markdown/
pip install -e linkchecker-markdown

The "-e" flag installs an editable (live) copy, so your edits are reflected on the next fresh import.

diegorondini commented 3 years ago

Hi @scivision

I tried with 1 and 2 seconds, but I haven't seen any change; the problem persists.

Thank you