CaliDog / certstream-python

Python library for connecting to CertStream
MIT License
427 stars 73 forks source link

Error connecting to CertStream - Connection is already closed #35

Open b00lpy opened 4 years ago

b00lpy commented 4 years ago

I implemented this tool, following the given examples. I found out that sometimes ( usally every 2 minutes, more or less) the connection drops and it takes almost a minute to reconnect to the stream.

I don't know why this happens and this could probably lead to missing some hits i'm looking for.

Here's an example of log ( integrated with python logging module )


[ERROR:certstream] 2020-09-02 16:20:31,002 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2020-09-02 16:20:36,097 - Connection established to CertStream! Listening for events...
[...]
[ERROR:certstream] 2020-09-02 16:23:10,796 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2020-09-02 16:23:15,908 - Connection established to CertStream! Listening for events...
[...]
[ERROR:certstream] 2020-09-02 16:25:11,015 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2020-09-02 16:25:16,182 - Connection established to CertStream! Listening for events...
[...]
[ERROR:certstream] 2020-09-02 16:27:17,051 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2020-09-02 16:27:22,194 - Connection established to CertStream! Listening for events...
[...]
[ERROR:certstream] 2020-09-02 16:29:01,343 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2020-09-02 16:29:06,522 - Connection established to CertStream! Listening for events...

Any suggestion about how to fix this?

rudraimmunefi commented 4 years ago

Facing the same issue again!

d-Rickyy-b commented 3 years ago

Hi. Might be related to https://github.com/CaliDog/certstream-server/issues/59

@Fitblip have you found the time to check this issue already? As far as I can see, your library is sending pings/keepalives over the websocket to indicate that it's still being used.

I haven't been able to find the reason behind the random disconnects yet.

Fitblip commented 3 years ago

Hmm interesting, are you folks using the certstream.calidog.io server, or is this happening with private servers as well? I wonder if you happen to be using an older version of the library (pinned in your requirements.txt) or something? As long as it's sending something to the server though it should be fine.

d-Rickyy-b commented 3 years ago

I for my part am using your certstream.calidog.io server - at least for development.

$ pip freeze
[...]
certstream==1.11
websocket-client==0.57.0
[...]

The error message is always "Connection is already closed.". There are basically two places where that exception is raised:

websocket/_app.py

    def send(self, data, opcode=ABNF.OPCODE_TEXT):
        [...]

        if not self.sock or self.sock.send(data, opcode) == 0:
            raise WebSocketConnectionClosedException(
                "Connection is already closed.")

websocket/_socket.py

def recv(sock, bufsize):
    if not sock:
        raise WebSocketConnectionClosedException("socket is already closed.")

    def _recv():
        [...]

    try:
        if sock.gettimeout() == 0:
            bytes_ = sock.recv(bufsize)
        else:
            bytes_ = _recv()
    except socket.timeout as e:
        [...]
    except SSLError as e:
        [...]

    if not bytes_:
        raise WebSocketConnectionClosedException(
            "Connection is already closed.")

For me the disconnects only appear within the latter (recv) call.

It reads 0 bytes grafik

This is the full stack trace: grafik

christophercutajar commented 3 years ago

I'm facing the same issue when running certstream --json --full >> certstream.json. I've installed certstream using pip installed certstream

[INFO:certstream] 2021-01-27 13:34:15,660 - Connection established to CertStream! Listening for events...
[ERROR:certstream] 2021-01-27 13:36:58,324 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/websocket/_app.py", line 306, in run_forever
    dispatcher.read(self.sock.sock, read, check)
  File "/usr/local/lib/python3.9/site-packages/websocket/_app.py", line 66, in read
    if not read_callback():
  File "/usr/local/lib/python3.9/site-packages/websocket/_app.py", line 272, in read
    op_code, frame = self.sock.recv_data_frame(True)
  File "/usr/local/lib/python3.9/site-packages/websocket/_core.py", line 344, in recv_data_frame
    frame = self.recv_frame()
  File "/usr/local/lib/python3.9/site-packages/websocket/_core.py", line 378, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/usr/local/lib/python3.9/site-packages/websocket/_abnf.py", line 375, in recv_frame
    payload = self.recv_strict(length)
  File "/usr/local/lib/python3.9/site-packages/websocket/_abnf.py", line 396, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/usr/local/lib/python3.9/site-packages/websocket/_core.py", line 453, in _recv
    return recv(self.sock, bufsize)
  File "/usr/local/lib/python3.9/site-packages/websocket/_socket.py", line 114, in recv
    raise WebSocketConnectionClosedException(
websocket._exceptions.WebSocketConnectionClosedException: Connection is already closed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/certstream", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/certstream/cli.py", line 64, in main
    certstream.listen_for_events(_handle_messages, args.url, skip_heartbeats=True)
  File "/usr/local/lib/python3.9/site-packages/certstream/core.py", line 54, in listen_for_events
    c.run_forever(ping_interval=15, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/websocket/_app.py", line 312, in run_forever
    teardown()
  File "/usr/local/lib/python3.9/site-packages/websocket/_app.py", line 231, in teardown
    if thread and thread.isAlive():
AttributeError: 'Thread' object has no attribute 'isAlive'
june5079 commented 3 years ago

I for my part am using your certstream.calidog.io server - at least for development.

$ pip freeze
[...]
certstream==1.11
websocket-client==0.57.0
[...]

The error message is always "Connection is already closed.". There are basically two places where that exception is raised:

websocket/_app.py

    def send(self, data, opcode=ABNF.OPCODE_TEXT):
        [...]

        if not self.sock or self.sock.send(data, opcode) == 0:
            raise WebSocketConnectionClosedException(
                "Connection is already closed.")

websocket/_socket.py

def recv(sock, bufsize):
    if not sock:
        raise WebSocketConnectionClosedException("socket is already closed.")

    def _recv():
        [...]

    try:
        if sock.gettimeout() == 0:
            bytes_ = sock.recv(bufsize)
        else:
            bytes_ = _recv()
    except socket.timeout as e:
        [...]
    except SSLError as e:
        [...]

    if not bytes_:
        raise WebSocketConnectionClosedException(
            "Connection is already closed.")

For me the disconnects only appear within the latter (recv) call.

It reads 0 bytes grafik

This is the full stack trace: grafik

I exactly have same issue with using calidog server.

certificate_update: 192647cert [13:19, 261.07cert/s]Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
certificate_update: 212013cert [14:58, 21.30cert/s]Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...

Error occurred every 1~2 minutes so that connection is not stable.

Hope this will be solved.

Chaz6 commented 3 years ago

I am having the same problem with 1.11 from pypi.

[ERROR:certstream] 2021-02-06 15:33:51,563 - Error connecting to CertStream - Connection is already closed. - Sleeping for a few seconds and trying again...
Traceback (most recent call last):
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_app.py", line 306, in run_forever
    dispatcher.read(self.sock.sock, read, check)
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_app.py", line 66, in read
    if not read_callback():
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_app.py", line 272, in read
    op_code, frame = self.sock.recv_data_frame(True)
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_core.py", line 344, in recv_data_frame
    frame = self.recv_frame()
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_core.py", line 378, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_abnf.py", line 375, in recv_frame
    payload = self.recv_strict(length)
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_abnf.py", line 396, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_core.py", line 453, in _recv
    return recv(self.sock, bufsize)
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_socket.py", line 114, in recv
    raise WebSocketConnectionClosedException(
websocket._exceptions.WebSocketConnectionClosedException: Connection is already closed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chaz/.local/bin/certstream", line 8, in <module>
    sys.exit(main())
  File "/home/chaz/.local/lib/python3.9/site-packages/certstream/cli.py", line 64, in main
    certstream.listen_for_events(_handle_messages, args.url, skip_heartbeats=True)
  File "/home/chaz/.local/lib/python3.9/site-packages/certstream/core.py", line 54, in listen_for_events
    c.run_forever(ping_interval=15, **kwargs)
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_app.py", line 312, in run_forever
    teardown()
  File "/home/chaz/.local/lib/python3.9/site-packages/websocket/_app.py", line 231, in teardown
    if thread and thread.isAlive():
AttributeError: 'Thread' object has no attribute 'isAlive'
Chaz6 commented 3 years ago

This is fixed upstream https://github.com/websocket-client/websocket-client/pull/594

d-Rickyy-b commented 3 years ago

@Chaz6 the linked PR does not fix the issue for me. I applied the patch locally but am still getting the same issue every two minutes or so.

Chaz6 commented 3 years ago

@Chaz6 the linked PR does not fix the issue for me. I applied the patch locally but am still getting the same issue every two minutes or so.

What version of Python do you have?

d-Rickyy-b commented 3 years ago

@Chaz6 the linked PR does not fix the issue for me. I applied the patch locally but am still getting the same issue every two minutes or so.

What version of Python do you have?

I am currently developing and testing on Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32

Chaz6 commented 3 years ago

I am currently developing and testing on Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32

OK, I am using python 3.9 so it is possible there is a different fix required.

kres20 commented 3 years ago

@Chaz6 @d-Rickyy-b I'm facing the same problem. I'm running certstream in the docker container on AWS ECS and python version is 3.9.5. I updated some packages, but still cannot resolve the issue.

root@8a07445128d9:/opt/certstream-documentdb# pip freeze certstream==1.12 pymongo==3.11.4 python-dateutil==2.8.1 PyYAML==5.4.1 schedule==1.1.0 six==1.16.0 termcolor==1.1.0 websocket-client==1.1.0

d-Rickyy-b commented 3 years ago

@Fitblip mind looking into this? This is "easy to reproduce" because it literally happens (reliably) every time I use certstream. Your help is much appreciated! Thanks in advance.

Chaz6 commented 3 years ago

Just as a point of reference, today I updated to certstream 1.12 (python 3.9.6 on Linux 64bit) and it works fine.

$ pip3 install -U --user certstream
Requirement already satisfied: certstream in /home/chaz/.local/lib/python3.9/site-packages (1.11)
Collecting certstream
  Downloading certstream-1.12.tar.gz (9.6 kB)
Collecting websocket-client>=0.58.0
  Using cached websocket_client-1.2.0-py2.py3-none-any.whl (52 kB)
Requirement already satisfied: termcolor in /home/chaz/.local/lib/python3.9/site-packages (from certstream) (1.1.0)
Building wheels for collected packages: certstream
  Building wheel for certstream (setup.py) ... done
  Created wheel for certstream: filename=certstream-1.12-py3-none-any.whl size=4731 sha256=6c8a70c8ba6c8510511ba16ff26432014fc95fc546223645cf302f31a89c7266
  Stored in directory: /home/chaz/.cache/pip/wheels/b9/d5/4b/157f4ada54a1b76c9509a7c00191014478d1039d390eda57c3
Successfully built certstream
Installing collected packages: websocket-client, certstream
  Attempting uninstall: websocket-client
    Found existing installation: websocket-client 0.57.0
    Uninstalling websocket-client-0.57.0:
      Successfully uninstalled websocket-client-0.57.0
  Attempting uninstall: certstream
    Found existing installation: certstream 1.11
    Uninstalling certstream-1.11:
      Successfully uninstalled certstream-1.11
Successfully installed certstream-1.12 websocket-client-1.2.0

$ certstream
[INFO:certstream] 2021-08-12 09:52:33,821 - Connection established to CertStream! Listening for events...
[2021-08-12T09:52:33.875015] ct.googleapis.com/testtube/ - www01.abn.ing.rabo.abn.accounts.betaling.berlingskemedia.net
[2021-08-12T09:52:33.876959] ct.googleapis.com/testtube/ - docker-test.aggeler.eu
[2021-08-12T09:52:33.878495] ct.googleapis.com/testtube/ - load.12-s.de
[2021-08-12T09:52:33.880872] ct.googleapis.com/testtube/ - og.blog.blog.magento.rainbow.idealnastrona.pl
[2021-08-12T09:52:33.882442] ct.googleapis.com/testtube/ - blog.blog.dev.sitemap.owlstown.net
[2021-08-12T09:52:33.883426] ct.googleapis.com/testtube/ - www.cheybeauty.org
[2021-08-12T09:52:33.884452] ct.googleapis.com/testtube/ - www.emilyasheyphoto.com
Fitblip commented 3 years ago

Hi folks, apologies for the delay in looking into this. I think there's a few things in this one issue to the point where it's over-loaded.

Install/packaging/isAlive issues

I'm able to use certstream as expected with docker and the lastest python, which leads me to believe this is not an issue with certstream, but a local issue:

$ docker pull python
Using default tag: latest
latest: Pulling from library/python
d272b0d8f7c4: Pull complete
81fabd82f026: Pull complete
0193c0cdceae: Pull complete
967a8f138596: Pull complete
bc1dbd38cda7: Pull complete
80a46c137b0c: Pull complete
f41525227e2c: Pull complete
1354066e326c: Pull complete
ffb5ac5ec588: Pull complete
Digest: sha256:44e7cbba204f9a81ec5bfa1b5de1de176b2590c01d55a02bd299551593517c7d
Status: Downloaded newer image for python:latest
docker.io/library/python:latest

$ docker run --rm -it python bash

root@1477dfc03834:/ $ pip install certstream
Collecting certstream
  Downloading certstream-1.12.tar.gz (9.6 kB)
Collecting websocket-client>=0.58.0
  Using cached websocket_client-1.2.0-py2.py3-none-any.whl (52 kB)
Collecting termcolor
  Using cached termcolor-1.1.0-py3-none-any.whl
Building wheels for collected packages: certstream
  Building wheel for certstream (setup.py) ... done
  Created wheel for certstream: filename=certstream-1.12-py3-none-any.whl size=4746 sha256=bf05687d2cfdbd224c5873cf0d6d229d6cd4176fa7cd5cd1ace19638ac45ee8c
  Stored in directory: /root/.cache/pip/wheels/b9/d5/4b/157f4ada54a1b76c9509a7c00191014478d1039d390eda57c3
Successfully built certstream
Installing collected packages: websocket-client, termcolor, certstream
Successfully installed certstream-1.12 termcolor-1.1.0 websocket-client-1.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 21.2.3; however, version 21.2.4 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

root@1477dfc03834:/ $ certstream
[INFO:certstream] 2021-08-12 20:17:57,734 - Connection established to CertStream! Listening for events...
[2021-08-12T20:17:57.799850] ct.googleapis.com/logs/argon2021/ - imap.sangse.com
[2021-08-12T20:17:57.801696] ct.googleapis.com/logs/argon2021/ - universe.manawac.com
[2021-08-12T20:17:57.803194] ct.googleapis.com/logs/argon2021/ - inspire.devmantra.uk
[2021-08-12T20:17:57.804575] ct.googleapis.com/logs/argon2021/ - mail.airlab.id
[2021-08-12T20:17:57.805963] ct.googleapis.com/logs/argon2021/ - staging.191.com.au

Disconnections

The certstream server is written in Elixir, using the excellent Cowboy http server. I believe when it doesn't receive any data from the client after a specific timeout (60s according to the docs), it assumes the client is dead and reaps the connection.

The fix for this is to have the client spawn a thread to send some data up to the server every 30s or so (a simple ws.send('.') will work), outlined in the websocket-client docs.

I'm not sure I have time to implement this but if someone wants to take a stab at it I'd happily look at a PR! :)

d-Rickyy-b commented 3 years ago

Thank you @Fitblip for your reply! I followed your steps:

root@server:~# docker run --rm -it python bash

root@755dee80470a:/# pip install certstream
Collecting certstream
  Downloading certstream-1.12.tar.gz (9.6 kB)
Collecting websocket-client>=0.58.0
  Using cached websocket_client-1.2.0-py2.py3-none-any.whl (52 kB)
Collecting termcolor
  Using cached termcolor-1.1.0-py3-none-any.whl
Building wheels for collected packages: certstream
  Building wheel for certstream (setup.py) ... - \ done
  Created wheel for certstream: filename=certstream-1.12-py3-none-any.whl size=4746 sha256=44a26af2b4f43e01bb24b32fdc267c5057832012c75988d239b7fb5c48d7aa30
  Stored in directory: /root/.cache/pip/wheels/b9/d5/4b/157f4ada54a1b76c9509a7c00191014478d1039d390eda57c3
Successfully built certstream
Installing collected packages: websocket-client, termcolor, certstream
Successfully installed certstream-1.12 termcolor-1.1.0 websocket-client-1.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

root@755dee80470a:/# certstream
[INFO:certstream] 2021-08-12 23:58:00,412 - Connection established to CertStream! Listening for events...
[2021-08-12T23:58:03.549530] ct.googleapis.com/skydiver/ - sbza7hkhrd6mmtk.westus2.atlas-test.cloudapp.azure.com 
[2021-08-12T23:58:03.551558] ct.googleapis.com/skydiver/ - wus2-fd497270b7.wus2.backend-instance-dev.azgrafana-test.io 
[2021-08-12T23:58:04.663397] ct.googleapis.com/logs/xenon2022/ - c8577f8edcf9.database.azure.com 
[...]
[2021-08-13T00:00:22.213534] nessie2022.ct.digicert.com/log/ - sni.cloudflaressl.com 
[2021-08-13T00:00:22.215210] nessie2022.ct.digicert.com/log/ - sni.cloudflaressl.com 
[ERROR:certstream] 2021-08-13 00:01:01,141 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-08-13 00:01:06,226 - Connection established to CertStream! Listening for events...
[2021-08-13T00:01:08.407135] ct.googleapis.com/logs/argon2022/ - sni.cloudflaressl.com 
[2021-08-13T00:01:08.408785] ct.googleapis.com/logs/argon2022/ - sni.cloudflaressl.com 

As you can see I still get the error Error connecting to CertStream - Connection to remote host was lost, which is another error to the previously mentioned Connection is already closed.

Also in my custom software where I use certstream as a library:

2021-08-13 02:03:04,394 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:03:05,534 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:03:50,008 - certleak.core.certstreamwrapper - INFO - Processed 1486 updates (2881 unique domains) during the last 60 seconds. 48.0 domains/s, 24.8 certs/s
2021-08-13 02:03:50,008 - certleak.core.certstreamwrapper - INFO - Queue length: 1
2021-08-13 02:04:11,178 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:04:12,323 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:04:50,032 - certleak.core.certstreamwrapper - INFO - Processed 1854 updates (2946 unique domains) during the last 60 seconds. 49.1 domains/s, 30.9 certs/s
2021-08-13 02:04:50,032 - certleak.core.certstreamwrapper - INFO - Queue length: 1
2021-08-13 02:05:05,796 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:05:06,899 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:05:50,025 - certleak.core.certstreamwrapper - INFO - Processed 1857 updates (2790 unique domains) during the last 60 seconds. 46.5 domains/s, 30.9 certs/s
2021-08-13 02:05:50,025 - certleak.core.certstreamwrapper - INFO - Queue length: 1
2021-08-13 02:05:54,115 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:05:55,213 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:06:38,937 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:06:40,167 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:06:50,013 - certleak.core.certstreamwrapper - INFO - Processed 1721 updates (2789 unique domains) during the last 60 seconds. 46.5 domains/s, 28.7 certs/s
2021-08-13 02:06:50,013 - certleak.core.certstreamwrapper - INFO - Queue length: 1
2021-08-13 02:07:30,673 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:07:31,862 - certstream - INFO - Connection established to CertStream! Listening for events...
2021-08-13 02:07:50,027 - certleak.core.certstreamwrapper - INFO - Processed 1818 updates (2731 unique domains) during the last 60 seconds. 45.5 domains/s, 30.3 certs/s
2021-08-13 02:07:50,027 - certleak.core.certstreamwrapper - INFO - Queue length: 1
2021-08-13 02:08:22,420 - certstream - ERROR - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
2021-08-13 02:08:23,620 - certstream - INFO - Connection established to CertStream! Listening for events...

Is this the second part you addressed in your comment? So this could only be fixed with a regular ws.send('.')? If so, I'll try to implement this as soon as I find some time.

kres20 commented 3 years ago

Thank you @Fitblip @d-Rickyy-b. Even though I updated to certstream 1.12, I still get the same error. It is much appreciated if someone can implement above (ws.send('.')).

kres20 commented 3 years ago

@Fitblip @d-Rickyy-b I implemented ws.send in cli.py as following.

def _on_open(self, ws):
    certstream_logger.info("Connection established to CertStream! Listening for events...!")
    if self.on_open_handler:
        self.on_open_handler()
    t_ws_send = threading.Thread(target=self._ws_send, args=(ws,))
    t_ws_send.start()

def _ws_send(self, ws):
    while ws:
        try:
            ws.send('.')
            certstream_logger.info("Connection continue to CertStream! data sending...")
            time.sleep(10)
        except Exception as e:
            certstream_logger.error(e)
            break

But, still get the same error.

[INFO:certstream] 2021-08-23 08:26:48,303 - Connection established to CertStream! Listening for events...!
[INFO:certstream] 2021-08-23 08:26:48,304 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:26:58,314 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:27:08,320 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:27:18,330 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:27:28,341 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:27:28,663 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-08-23 08:27:34,248 - Connection established to CertStream! Listening for events...!
[INFO:certstream] 2021-08-23 08:27:34,248 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:27:38,351 - Connection is already closed.
[INFO:certstream] 2021-08-23 08:27:44,259 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:27:54,269 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:04,284 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:14,296 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:24,304 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:34,315 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:44,325 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:28:54,336 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:29:04,346 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:29:14,357 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:29:24,367 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:29:34,374 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:29:44,385 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:29:44,948 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-08-23 08:29:50,540 - Connection established to CertStream! Listening for events...!
[INFO:certstream] 2021-08-23 08:29:50,540 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:29:54,395 - Connection is already closed.
[INFO:certstream] 2021-08-23 08:30:00,551 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:30:10,561 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:30:20,572 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:30:30,580 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:30:40,591 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:30:50,601 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:00,612 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:10,617 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:20,627 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:30,638 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:40,648 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:31:50,658 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:32:00,669 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:32:10,679 - Connection continue to CertStream! data sending...
[INFO:certstream] 2021-08-23 08:32:20,690 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:32:20,829 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-08-23 08:32:26,459 - Connection established to CertStream! Listening for events...!
[INFO:certstream] 2021-08-23 08:32:26,459 - Connection continue to CertStream! data sending...
[ERROR:certstream] 2021-08-23 08:32:30,700 - Connection is already closed.
[INFO:certstream] 2021-08-23 08:32:36,460 - Connection continue to CertStream! data sending...

I would like to prevent data loss caused by this error, do you have any ideas?

kres20 commented 3 years ago

@Fitblip @d-Rickyy-b I would like to know whether this disconnection happens only in my environment or not. Do you have any success cases (environments) that don't make this data loss error? or do you have any ideas to collect certificates data without data loss using other tools?

katesaikishore commented 3 years ago

I have changed the python version from 3.9.7 to 3.8.12 It working fine without the error

kres20 commented 3 years ago

@katesaikishore Thank you for your advice. I have changed the python to python:3.8.12-buster, but it still has the error.

ubuntu@ip-10-0-0-241:~/work/certstream_documentdb$ docker run 8ed9a1d5f1f8
[INFO:certstream] 2021-09-06 05:48:17,937 - Connection established to CertStream! Listening for events...
[ERROR:certstream] 2021-09-06 05:49:47,094 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-09-06 05:49:52,666 - Connection established to CertStream! Listening for events...
[ERROR:certstream] 2021-09-06 05:52:38,244 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-09-06 05:52:43,799 - Connection established to CertStream! Listening for events...
[ERROR:certstream] 2021-09-06 05:58:10,727 - Error connecting to CertStream - Connection to remote host was lost. - Sleeping for a few seconds and trying again...
[INFO:certstream] 2021-09-06 05:58:16,304 - Connection established to CertStream! Listening for events...

Could you show the other libraries versions? My environment is as following

certstream==1.12
pymongo==3.12.0
python-dateutil==2.8.2
PyYAML==5.4.1
schedule==1.1.0
six==1.16.0
termcolor==1.1.0
websocket-client==1.2.1
katesaikishore commented 3 years ago

certstram==1.11 websocket-client==0.57.0

And use python 3.8.12 Try to deploy it in heruko cloud. @kres20

kres20 commented 3 years ago

@katesaikishore Than you for your reply. We used the same versions of the libraries except for heroku (we have to use AWS), but it could not work well. I'll try it in local environment, not cloud. I'll also check the another certstream like Go, JAVA.

katesaikishore commented 3 years ago

@kres20 Oh, I have deployed it in heruko. Even though it will disconnect and reconnect itself. Something is really fishy.

kres20 commented 3 years ago

@katesaikishore I've just created the environment in locally (in VM) without docker, but it still has the error. I even used Go version, but still get the same error. @Fitblip Do you think that this problem is caused by server side (wss://certstream.calidog.io/)?

kres20 commented 3 years ago

@katesaikishore @Fitblip I set up the certstream server too (https://github.com/CaliDog/certstream-server), and connect there like certstream.listen_for_events(print_callback, url='ws://localhost:4000/'), and it works fine without any errors. Therefore this above error is caused by certstream.calidog.io.

mikenor commented 3 years ago

Argh, well this is disappointing. :-( I am hitting this problem too and can only reach the same conclusion as @kres20 that the certstream.calidog.io service is just basically unusable.

I noticed certstream.calidog.io is actually Cloudflare. In Wireshark packet capture on my client, I can see the Cloudflare frontend server is just suddenly closing the TCP connection for some reason. No shutting down of the Websocket session or the TLS session, but a (mostly) normal 4-way close handshake of the outer TCP connection. (Abnormally, the Cloudflare server waits for exactly 10 seconds before sending its very last ACK, despite my client dutifully attempting several retransmissions of its last FIN that the server is waiting for.)

When I add sslopt={'suppress_ragged_eofs': False} to the certstream.listen_for_events parameters, the issue causes an exception that matches what I saw in packet capture: ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2633).

Traceback (most recent call last):
  File "C:\Users\m\dev\watchcertstream\watchcertstream.py", line 72, in <module>
    main()
  File "C:\Users\m\dev\watchcertstream\watchcertstream.py", line 68, in main
    certstream.listen_for_events(certstream_message_callback, url='wss://certstream.calidog.io/', on_error=certstream_error_callback, sslopt={"suppress_ragged_eofs": False})
  File "C:\Program Files\Python39\lib\site-packages\certstream\core.py", line 54, in listen_for_events
    c.run_forever(ping_interval=15, **kwargs)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 371, in run_forever
    self._callback(self.on_error, e)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 412, in _callback
    self.on_error(self, e)
  File "C:\Program Files\Python39\lib\site-packages\certstream\core.py", line 47, in _on_error
    self.on_error_handler(ex)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 407, in _callback
    callback(self, *args)
  File "C:\Program Files\Python39\lib\site-packages\certstream\core.py", line 47, in _on_error
    self.on_error_handler(ex)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 369, in run_forever
    dispatcher.read(self.sock.sock, read, check)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 70, in read
    if not read_callback():
  File "C:\Program Files\Python39\lib\site-packages\websocket\_app.py", line 335, in read
    op_code, frame = self.sock.recv_data_frame(True)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_core.py", line 396, in recv_data_frame
    frame = self.recv_frame()
  File "C:\Program Files\Python39\lib\site-packages\websocket\_core.py", line 435, in recv_frame
    return self.frame_buffer.recv_frame()
  File "C:\Program Files\Python39\lib\site-packages\websocket\_abnf.py", line 351, in recv_frame
    payload = self.recv_strict(length)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_abnf.py", line 372, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "C:\Program Files\Python39\lib\site-packages\websocket\_core.py", line 519, in _recv
    return recv(self.sock, bufsize)
  File "C:\Program Files\Python39\lib\site-packages\websocket\_socket.py", line 113, in recv
    bytes_ = _recv()
  File "C:\Program Files\Python39\lib\site-packages\websocket\_socket.py", line 90, in _recv
    return sock.recv(bufsize)
  File "C:\Program Files\Python39\lib\ssl.py", line 1226, in recv
    return self.read(buflen)
  File "C:\Program Files\Python39\lib\ssl.py", line 1101, in read
    return self._sslobj.read(len)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2633)

When I use my preferred public DNS resolver (149.112.121.10), certstream.calidog.io resolves to 172.64.80.1 for me. This is a Cloudflare IP. Traceroute from my client and from a couple other places suggests that it's an anycast IP that exists at/near multiple different Internet exchanges.

When I use other public DNS resolvers, certstream.calidog.io resolves to different Cloudflare IPs for me - 172.67.151.243 and 104.21.40.132. When I force my client to connect to these IPs, the problem still happens, but anecdotally seems to occur far less frequently than with 172.64.80.1.

I'm skeptical that IP anycast is a suitable CDN strategy for services requiring permanent TCP connections, such as the Calidog Certstream service. IMO this problem is likely a completely normal side-effect of the chosen CDN strategy that the Calidog service is using. :-(

Fitblip commented 3 years ago

@mikenor Yep things are currently hosted behind Cloudflare, and I agree it's not ideal given the CDN nature of cloudflare and the persistent connections for the operation of certstream. Ultimately https://certstream.calidog.io is a service to try out certstream but if you need production grade quality (for analytics/etc) then you should absolutely run your own server. I chose Cloudflare because it was a way for me to spin up a SSL fronted service without much headache and to cut down on potential botspam.

For the public instance we also currently fully saturate a gigabit link (https://twitter.com/fitblip/status/1203058921583398914) and pump about a petabyte of traffic into the internet every 3 months, so inevitably that means that certs are dropped and you shouldn't rely on the service to see 100% of everything.

Ultimately I'm not interested in spending more time or money per month to host this in a way where it's a rock solid production quality service (certstream.calidog.io, the server software should be solid), but if someone wanted to donate some servers and/or run the service and run it as a cluster with multi-gigabit links then I'm absolutely game to point the DNS records at it and help with the effort!