MatthieuBizien / roam-to-git

Automatic RoamResearch backup to Git
MIT License
549 stars 132 forks source link

Timeout on signin step #11

Closed jrk closed 3 years ago

jrk commented 4 years ago

When running for the first time (locally, with a pipx install on macOS 10.15.4), I consistently get pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded. seemingly in the first "Opening signin page" step (logs below).

This same setup, with the --debug flag, shows the Chromium GUI as expected and works with no long lags, right up through the end (where I believe it fails to actually unpack the zip files with a warning that this doesn't work in --debug mode).

2020-04-20 17:53:44.195 | INFO     | roam_to_git.__main__:main:43 - Loading secrets from /tmp/roamtest/.env
2020-04-20 17:53:44.198 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-04-20 17:53:44.329 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-04-20 17:53:44.899 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:95 - Configure downloads to /var/folders/_y/wm_9h7f555x60zdv6k4f74gr0000gn/T/tmp8nfgjem2
2020-04-20 17:53:44.911 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:95 - Configure downloads to /var/folders/_y/wm_9h7f555x60zdv6k4f74gr0000gn/T/tmpvev98ff8
2020-04-20 17:53:44.936 | DEBUG    | roam_to_git.scrapping:signin:184 - Opening signin page
2020-04-20 17:53:44.948 | DEBUG    | roam_to_git.scrapping:signin:184 - Opening signin page
2020-04-20 17:54:14.940 | DEBUG    | roam_to_git.scrapping:download_rr_archive:77 - Closing browser markdown
[E:pyppeteer.connection] connection unexpectedly closed
Task exception was never retrieved
future: <Task finished name='Task-60' coro=<Connection._async_send() done, defined at /Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py:69> exception=InvalidStateError('invalid state')>
Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/framing.py", line 105, in read
    data = await reader(2)
  File "/usr/local/Cellar/python@3.8/3.8.2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/streams.py", line 721, in readexactly
    raise exceptions.IncompleteReadError(incomplete, n)
asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of 2 expected bytes

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 73, in _async_send
    await self.connection.send(msg)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 555, in send
    await self.ensure_open()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 803, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: code = 1006 (connection closed abnormally [internal]), no reason

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 79, in _async_send
    await self.dispose()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 170, in dispose
    await self._on_close()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 151, in _on_close
    cb.set_exception(_rewriteError(
asyncio.exceptions.InvalidStateError: invalid state
2020-04-20 17:54:14.953 | DEBUG    | roam_to_git.scrapping:download_rr_archive:79 - Closed browser markdown
Traceback (most recent call last):
  File "/Users/jrk/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/__main__.py", line 69, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 253, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/usr/local/Cellar/python@3.8/3.8.2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config,
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 100, in _download_rr_archive
    await signin(document, config, sleep_duration=sleep_duration)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 185, in signin
    await document.goto('https://roamresearch.com/#/signin')
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/page.py", line 862, in goto
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.
2020-04-20 17:54:14.962 | DEBUG    | roam_to_git.scrapping:_kill_child_process:216 - Terminate child process [psutil.Process(pid=91694, name='Chromium', started='17:53:44'), psutil.Process(pid=91696, name='Chromium', started='17:53:44')]
MatthieuBizien commented 4 years ago

Very weird. Can you increase the timeout like in https://github.com/puppeteer/puppeteer/issues/782#issuecomment-329429325, so we can test if is only due to the time it takes to load Roam?

jrk commented 4 years ago

I added just {'timeout': 3000000} on the relevant call (the first document.goto in signin)—the other parameters on the page you link report as deprecated and unsupported. The result is it just locking up forever (or presumably for 3000 seconds—I have only waited about 10 minutes so far).

Notably, the Roam load time is not at all an issue when I enable debug mode and visually watch the progress (even just a few seconds before or after this always-repeatable failure in non-debug mode).

Is there any way to inspect the status in more detail, short of enabling the non-headless debug mode which seems to just magically work?

jrk commented 4 years ago

Followup: it works as expected on a relatively conventional Ubuntu 18.04 LTS machine. It fails in the same way on a different Mac on a different network (the same network as the Ubuntu machine).

Has this been tested on macOS?

jrk commented 4 years ago

For now I've worked around this by running it in Docker. (And since that container has no access to my ssh keys, I've also submitted a PR #13 to optionally skip just the final push step.)

MatthieuBizien commented 4 years ago

I develop it on my Mac ;) It's less reliable than on Linux, but it does work. Debugging is not easy, between heisenbug, async, not being able to see the gui, and it's hard to do tests.

jrk commented 4 years ago

I also found Puppeteer flakey and unreliable on macOS in the Node-based roam-backup, too, for what it’s worth. It was less consistently unable to get anywhere than this, but still obviously problematic.

MatthieuBizien commented 3 years ago

The backup errors should be fixed by #92 🍾.

@jrk Puppeteer is unreliable even on Linux... But Selenium looks much more robust!

I'm closing that issue, but feel free to re-open it if you get new errors.