d-Rickyy-b / certstream-server-go

This project aims to be a drop-in replacement for the certstream server by Calidog. This tool aggregates, parses, and streams certificate data from multiple certificate transparency logs via websocket connections to the clients.
MIT License

Empty JSON data after server error #26

Closed Brandl closed 8 months ago

Brandl commented 10 months ago

Hello,

I migrated from Calidog Certstream today, and so far your tool works pretty much seamlessly as a replacement. Still, there is a bug I want to report, since my websocket client hung up after around 15 minutes due to this:

certstream-server-1  | 2023/12/08 13:57:00 ct-watcher.go:340: Processed 110000 entries | Queue length: 0
certstream-server-1  | E1208 13:57:02.711256       1 fetcher.go:292] https://sabre2024h1.ct.sectigo.com: GetRawEntries() failed: Get "https://sabre2024h1.ct.sectigo.com/ct/v1/get-entries?end=6778843&start=6778744": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
certstream-server-1  | 2023/12/08 13:57:06 ct-watcher.go:340: Processed 111000 entries | Queue length: 8

This happened shortly before my client quit due to:

certstream-server-1  | 2023/12/08 13:57:12 server.go:162: Stopping websocket for '172.20.0.8:53078' - /domains-only
certstream-server-1  | 2023/12/08 13:57:12 client.go:85: Disconnecting client 172.20.0.8:53078!
watchdog-1           | Traceback (most recent call last):
watchdog-1           |   File "/app/watch.py", line 41, in <module>
watchdog-1           |     asyncio.get_event_loop().run_until_complete(main(args))
watchdog-1           |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
watchdog-1           |     return future.result()
watchdog-1           |            ^^^^^^^^^^^^^^^
watchdog-1           |   File "/app/watch.py", line 29, in main
watchdog-1           |     for domain in data["data"]:
watchdog-1           | TypeError: 'NoneType' object is not iterable

Not sure if there is a causal relationship between those two. Either way, I'm now checking whether the received data is None.

d-Rickyy-b commented 10 months ago

Hi, thanks for bringing this to my attention. This issue is weird. For now I am only able to see the correlation but not the causality. The first error indicates that one of the CT logs is no longer available. This happens every now and then. New CT logs appear, old ones get shut down, sometimes there's maintenance.

Even if the server read incomplete data from a CT log, it parses the data to an "entry":

https://github.com/d-Rickyy-b/certstream-server-go/blob/db684ca63313c352def2f0fabb399fcff43b70cc/internal/certificatetransparency/ct-watcher.go#L302-L308

This entry is then converted to JSON and afterwards sent to the clients. The resulting JSON should always contain the keys. The only case where this could in theory go wrong is here, if subType is not within the given cases (which is not possible with the current codebase):

https://github.com/d-Rickyy-b/certstream-server-go/blob/1b14625be6a5200fb8867dbd758fad9fd4cdad16/internal/web/broadcastmanager.go#L80-L94

I'll add some error handling for this case, even though I don't think it's causing your issue. Can you please monitor if that happens regularly for you?

d-Rickyy-b commented 9 months ago

Hi there, did this issue happen to you again? I still can't make any sense of it. From the code I can't see any way that the server sends out empty messages. Maybe I can find it with more details.

Brandl commented 9 months ago

Since I needed to do more refactoring after all, I set the project aside for now, but I will look into it over Christmas. Is there some debug switch, so I can assist with pinpointing the issue?

d-Rickyy-b commented 9 months ago

I appreciate your help. You could try logging the output from the websocket on the Python side. I don't know your Python code, so it's hard for me to tell where to add that. But if some variable is None, it would be cool to know what exactly arrived at the websocket at that time.

Apart from that there are no debug switches for the server.

cipisek9 commented 8 months ago

Hi, I noticed the same problem. If it helps, I am attaching data. There is missing "all_domains".

{'data': {'cert_index': 421882148, 'cert_link': 'https://oak.ct.letsencrypt.org/2024h1/ct/v1/get-entries?start=421882148&end=421882148', 'leaf_cert': {'extensions': {'authorityInfoAccess': 'URI:http://pki.goog/repo/certs/gts1p5.der, URI:http://ocsp.pki.goog/s/gts1p5/ArOLpvMvk8o', 'authorityKeyIdentifier': 'keyid:d5:fc:9e:0d:df:1e:ca:dd:08:97:97:6e:2b:c5:5f:c5:2b:f5:ec:b8', 'basicConstraints': 'CA:FALSE', 'keyUsage': 'Digital Signature', 'subjectAltName': 'IP Address:34.117.169.92', 'subjectKeyIdentifier': 'keyid:ee:6f:d9:cd:4e:93:68:e6:30:32:72:c8:14:5d:cd:c0:c6:3d:0c:42'}, 'fingerprint': '05:69:F3:0F:9B:31:D9:50:28:4B:43:1E:F3:57:B0:24:D3:2F:0F:5F', 'sha1': '05:69:F3:0F:9B:31:D9:50:28:4B:43:1E:F3:57:B0:24:D3:2F:0F:5F', 'sha256': 'F1:98:CF:6F:F7:14:D9:A9:D9:DC:B4:42:67:E0:B4:27:55:B4:87:01:66:A3:28:D4:98:B8:65:5D:75:D1:96:73', 'not_after': 1705205724, 'not_before': 1705119325, 'serial_number': '3A167C4F54C74FCB11ED86B82282E7DB', 'signature_algorithm': 'sha256, rsa', 'subject': {'C': None, 'CN': '', 'L': None, 'O': None, 'OU': None, 'ST': None, 'aggregated': '/CN=', 'email_address': None}, 'issuer': {'C': 'US', 'CN': 'GTS CA 1P5', 'L': None, 'O': 'Google Trust Services LLC', 'OU': None, 'ST': None, 'aggregated': '/C=US/CN=GTS CA 1P5/O=Google Trust Services LLC', 'email_address': None}, 'is_ca': False}, 'seen': 1705395770.917, 'source': {'name': "Let's Encrypt 'Oak2024H1' log", 'url': 'https://oak.ct.letsencrypt.org/2024h1'}, 'update_type': 'PrecertLogEntry'}, 'message_type': 'certificate_update'}

d-Rickyy-b commented 8 months ago

If it helps, I am attaching data. There is missing "all_domains".

Yes! This indeed helps.

https://github.com/d-Rickyy-b/certstream-server-go/blob/9f52357ace7f2bc0b2aa40fddca0d3e05ce4809c/internal/certstream/models.go#L107

I'm using omitempty which simply skips this field if it is empty. That could lead to some error, because the field will not be included in the json. The question is: why is the field empty in the first place?
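The effect of omitempty can be seen in isolation with a minimal sketch. The leafCert type and the encode helper below are hypothetical, trimmed-down stand-ins and not the server's actual model; only the struct tag mirrors what is described above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// leafCert is a hypothetical stand-in for the server's certificate model.
// Only the all_domains tag mirrors the behavior discussed in this thread.
type leafCert struct {
	AllDomains []string `json:"all_domains,omitempty"`
	Serial     string   `json:"serial_number"`
}

// encode marshals a leafCert and returns the JSON as a string.
func encode(c leafCert) string {
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Nil slice: the all_domains key is dropped entirely.
	fmt.Println(encode(leafCert{Serial: "3A16"}))
	// Empty but non-nil slice: omitempty drops the key as well.
	fmt.Println(encode(leafCert{AllDomains: []string{}, Serial: "3A16"}))
	// Non-empty slice: the key is present.
	fmt.Println(encode(leafCert{AllDomains: []string{"example.com"}, Serial: "3A16"}))
}
```

With omitempty, a client cannot tell "no domains" apart from "field not sent", so any code that indexes all_domains directly has to handle a missing key.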

This needs some more testing and verification, but the all_domains is taken from the certificate-transparency library's "DNSNames" field:

https://github.com/d-Rickyy-b/certstream-server-go/blob/9f52357ace7f2bc0b2aa40fddca0d3e05ce4809c/internal/certificatetransparency/ct-parser.go#L111

In your provided example I can only see

'subjectAltName': 'IP Address:34.117.169.92'

which is obviously not a DNS name but an IP address, which is why it's not included in the cert.DNSNames and hence not in the all_domains field. I'll try to come up with a fix.
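The distinction between DNS and IP SANs can be reproduced with a small sketch. Note the assumptions: it uses Go's standard crypto/x509 package rather than the certificate-transparency library's x509 package that the server relies on, and parseIPOnlyCert is a hypothetical helper. It creates a throwaway self-signed certificate whose only SAN is the IP address from the example and parses it back.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// parseIPOnlyCert (hypothetical helper) builds a self-signed certificate
// with a single IP address SAN and no DNS SANs, then parses the DER bytes.
func parseIPOnlyCert() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{Organization: []string{"example"}},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
		IPAddresses:  []net.IP{net.ParseIP("34.117.169.92")},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := parseIPOnlyCert()
	if err != nil {
		panic(err)
	}
	// The IP SAN ends up in IPAddresses; DNSNames stays empty.
	fmt.Println("DNS names:", len(cert.DNSNames))
	fmt.Println("IP addresses:", cert.IPAddresses)
}
```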

EDIT: OP used the /domains-only endpoint though, which uses another data type that should always include all_domains, so I am not sure if that's the whole issue:

https://github.com/d-Rickyy-b/certstream-server-go/blob/9f52357ace7f2bc0b2aa40fddca0d3e05ce4809c/internal/certstream/models.go#L146-L149

d-Rickyy-b commented 8 months ago

For debugging - the certificate was also logged on crt.sh:

https://crt.sh/?id=11728740156

cipisek9 commented 8 months ago

Perhaps a silly question, but what will happen if I am using the /domains-only endpoint and I get data without a domain name from the transparency log - just like the data I sent before? I assume the message will be empty, so it will not be iterable... or am I wrong?

d-Rickyy-b commented 8 months ago

Perhaps a silly question

Nah, it's never silly when you don't know how a project's internals work.

what will happen if I am using the /domains-only endpoint and I get data without a domain name from the transparency log - just like the data I sent before? I assume the message will be empty, so it will not be iterable... or am I wrong?

What should happen is this:

{
    "data": [],
    "message_type": "dns_entries"
}

Up until this point I thought that the zero value of a slice is just an empty slice. But that's not true. The zero value of a slice is in fact nil, and encoding/json marshals a nil slice as null instead of []. See https://go.dev/tour/moretypes/12.

To make 100% sure that this was the issue, I threw some code into the go playground to reproduce the issue: https://go.dev/play/p/wHgpHW-L7sT. And voilà, the result is:

{
    "data": null, 
    "message_type": "dns_entries"
}

So it only depends on how a slice was declared/initialized. https://go.dev/play/p/9iuAxCWhfuA

var s1 []string   // variable is only declared but not initialized, s1 is nil
s2 := []string{}  // variable is initialized to an empty string slice
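The difference shows up as soon as such a slice is marshaled. The following is a minimal sketch of the behavior and of one possible way to guard against it; dnsEntries, newDNSEntries and mustJSON are hypothetical names chosen for illustration, not the server's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dnsEntries is a hypothetical stand-in for the /domains-only message.
type dnsEntries struct {
	Data        []string `json:"data"`
	MessageType string   `json:"message_type"`
}

// newDNSEntries replaces a nil slice with an empty one, so the
// marshaled "data" field is always a JSON array and never null.
func newDNSEntries(domains []string) dnsEntries {
	if domains == nil {
		domains = []string{}
	}
	return dnsEntries{Data: domains, MessageType: "dns_entries"}
}

// mustJSON marshals v and panics on error (acceptable for a demo).
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var names []string // declared only, so names is nil

	// Nil slice: prints {"data":null,"message_type":"dns_entries"}
	fmt.Println(mustJSON(dnsEntries{Data: names, MessageType: "dns_entries"}))

	// Normalized slice: prints {"data":[],"message_type":"dns_entries"}
	fmt.Println(mustJSON(newDNSEntries(names)))
}
```

Normalizing at construction time keeps the check in one place. A client that iterates over data then receives an empty list instead of null.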

Looking at this now, it seems so obvious. I'll implement a fix for this. Thank you @cipisek9!

cipisek9 commented 8 months ago

I'm glad that I could help :)