prbinu / tls-scan

An Internet scale, blazing fast SSL/TLS scanner ( non-blocking, event-driven )
https://prbinu.github.io/tls-scan

Issues in parsing the results #13

Closed. ealashwali closed this issue 5 years ago

ealashwali commented 6 years ago

Hi. I ran a scan of 1M domains and saved the results in a .json file. I wrapped all the objects in an array and separated them with commas. Unfortunately, conventional Python parsers could not load the array, seemingly because of memory limits. I tried to install jq-linux64, but Ubuntu can't find it:

sudo apt-get install jq-linux64 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package jq-linux64

Can you please advise me on how to parse a 2.4 GB JSON file? I also tried jq, but it kept reporting errors in the file and failed.

I probably should have done the filtering during the scan, but I can't find the jq-linux64 tool. If I want to filter during the scan using jq, can you give a simple example of the jq syntax, for example to get host, ip, and port?

jq can only parse an array with objects separated by commas. How do I parse tls-scan output from the command line using jq?

prbinu commented 6 years ago

You can download jq from: https://stedolan.github.io/jq/

Can you share a few lines of your JSON output, along with the command-line flags you used?

ealashwali commented 6 years ago

The jq you pointed to is not the same as the jq-linux64 that you refer to in the tutorial. Here is a sample of the output. Can you please provide a way to extract, say, the "host" and "ip" values in comma-separated format, one record per line? For example:

facebook.com,157.240.1.35
reddit.com,151.101.65.140

Here are the first few records. I hope I do not have to re-scan. Is there a way to parse a 2.4 GB JSON file? I have tried several methods and techniques. Can you please help me parse the output?

{
  "host": "facebook.com",
  "ip": "157.240.1.35",
  "port": 443,
  "cipher": "ECDHE-ECDSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESGCM(128) Mac=AEAD",
  "tempPublicKeyAlg": "ECDH prime256v1",
  "tempPublicKeySize": 256,
  "secureRenego": true,
  "compression": "NONE",
  "expansion": "NONE",
  "sessionLifetimeHint": 172800,
  "x509ChainDepth": 2,
  "verifyCertResult": true,
  "verifyHostResult": true,
  "ocspStapled": false,
  "certificateChain": [
  {
    "version": 3,
    "subject": "CN=*.facebook.com; O=Facebook, Inc.; L=Menlo Park; ST=California; C=US",
    "issuer": "CN=DigiCert SHA2 High Assurance Server CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "subjectCN": "*.facebook.com",
    "subjectAltName": "DNS:*.facebook.com, DNS:*.xx.fbcdn.net, DNS:*.fbsbx.com, DNS:*.xz.fbcdn.net, DNS:*.facebook.net, DNS:*.xy.fbcdn.net, DNS:*.messenger.com, DNS:fb.com, DNS:*.fbcdn.net, DNS:*.fb.com, DNS:*.m.facebook.com, DNS:messenger.com, DNS:facebook.com",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Dec 15 00:00:00 2017 GMT",
    "notAfter": "Mar 22 12:00:00 2019 GMT",
    "expired": false,
    "serialNo": "0B:3C:3B:60:1A:18:F5:9E:E2:B6:BB:05:60:5E:F2:C0",
    "keyUsage": "Digital Signature critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "ECC prime256v1",
    "publicKeySize": 256,
    "basicConstraints": "CA:FALSE critical",
    "subjectKeyIdentifier": "C0:FD:74:F5:7D:CB:C6:27:F1:03:D3:62:A2:45:D7:84:1C:15:21:08",
    "sha1Fingerprint": "BD:25:8C:1F:62:A4:A6:D9:CF:7D:98:12:D2:2E:2F:F5:7E:84:FB:36"
  },  {
    "version": 3,
    "subject": "CN=DigiCert SHA2 High Assurance Server CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "issuer": "CN=DigiCert High Assurance EV Root CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "subjectCN": "DigiCert SHA2 High Assurance Server CA",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Oct 22 12:00:00 2013 GMT",
    "notAfter": "Oct 22 12:00:00 2028 GMT",
    "expired": false,
    "serialNo": "04:E1:E7:A4:DC:5C:F2:F3:6D:C0:2B:42:B8:5D:15:9F",
    "keyUsage": "Digital Signature, Certificate Sign, CRL Sign critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:TRUE, pathlen:0 critical",
    "subjectKeyIdentifier": "51:68:FF:90:AF:02:07:75:3C:CC:D9:65:64:62:A2:12:B8:59:72:3B",
    "sha1Fingerprint": "A0:31:C4:67:82:E6:E6:C6:62:C2:C8:7C:76:DA:9A:A6:2C:CA:BD:8E"
  } ]
}
{
  "host": "reddit.com",
  "ip": "151.101.65.140",
  "port": 443,
  "cipher": "ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AESGCM(128) Mac=AEAD",
  "tempPublicKeyAlg": "X25519",
  "tempPublicKeySize": 253,
  "secureRenego": true,
  "compression": "NONE",
  "expansion": "NONE",
  "sessionLifetimeHint": 7200,
  "x509ChainDepth": 2,
  "verifyCertResult": true,
  "verifyHostResult": true,
  "ocspStapled": true,
  "verifyOcspResult": true,
  "certificateChain": [
  {
    "version": 3,
    "subject": "CN=*.reddit.com; O=Reddit Inc.; L=San Francisco; ST=California; C=US",
    "issuer": "CN=DigiCert SHA2 Secure Server CA; O=DigiCert Inc; C=US",
    "subjectCN": "*.reddit.com",
    "subjectAltName": "DNS:*.reddit.com, DNS:reddit.com, DNS:*.redditmedia.com, DNS:redditmedia.com, DNS:*.redd.it, DNS:redd.it, DNS:www.redditstatic.com, DNS:i.reddituploads.com, DNS:*.thumbs.redditmedia.com, DNS:www.redditinc.com, DNS:redditinc.com",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Aug 17 00:00:00 2018 GMT",
    "notAfter": "Sep  2 12:00:00 2020 GMT",
    "expired": false,
    "serialNo": "07:5B:02:DF:9D:A4:16:51:2F:64:CE:70:71:FC:8C:07",
    "keyUsage": "Digital Signature, Key Encipherment critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:FALSE critical",
    "subjectKeyIdentifier": "71:E0:50:D1:E7:80:52:FB:23:14:65:9D:43:A7:8D:31:AA:56:69:26",
    "sha1Fingerprint": "E3:C0:F1:CF:CB:A4:61:09:02:1A:74:06:71:83:CD:A8:59:28:B4:0D"
  },  {
    "version": 3,
    "subject": "CN=DigiCert SHA2 Secure Server CA; O=DigiCert Inc; C=US",
    "issuer": "CN=DigiCert Global Root CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "subjectCN": "DigiCert SHA2 Secure Server CA",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Mar  8 12:00:00 2013 GMT",
    "notAfter": "Mar  8 12:00:00 2023 GMT",
    "expired": false,
    "serialNo": "01:FD:A3:EB:6E:CA:75:C8:88:43:8B:72:4B:CF:BC:91",
    "keyUsage": "Digital Signature, Certificate Sign, CRL Sign critical",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:TRUE, pathlen:0 critical",
    "subjectKeyIdentifier": "0F:80:61:1C:82:31:61:D5:2F:28:E7:8D:46:38:B4:2C:E1:C6:D9:E2",
    "sha1Fingerprint": "1F:B8:6B:11:68:EC:74:31:54:06:2E:8C:9C:C5:B1:71:A4:B7:CC:B4"
  } ]
}
{
  "host": "google.com",
  "ip": "216.58.206.46",
  "port": 443,
  "cipher": "ECDHE-ECDSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=ECDSA Enc=AESGCM(128) Mac=AEAD",
  "tempPublicKeyAlg": "X25519",
  "tempPublicKeySize": 253,
  "secureRenego": true,
  "compression": "NONE",
  "expansion": "NONE",
  "sessionLifetimeHint": 100800,
  "x509ChainDepth": 2,
  "verifyCertResult": true,
  "verifyHostResult": true,
  "ocspStapled": false,
  "certificateChain": [
  {
    "version": 3,
    "subject": "CN=*.google.com; O=Google LLC; L=Mountain View; ST=California; C=US",
    "issuer": "CN=Google Internet Authority G3; O=Google Trust Services; C=US",
    "subjectCN": "*.google.com",
    "subjectAltName": "DNS:*.google.com, DNS:*.android.com, DNS:*.appengine.google.com, DNS:*.cloud.google.com, DNS:*.db833953.google.cn, DNS:*.g.co, DNS:*.gcp.gvt2.com, DNS:*.google-analytics.com, DNS:*.google.ca, DNS:*.google.cl, DNS:*.google.co.in, DNS:*.google.co.jp, DNS:*.google.co.uk, DNS:*.google.com.ar, DNS:*.google.com.au, DNS:*.google.com.br, DNS:*.google.com.co, DNS:*.google.com.mx, DNS:*.google.com.tr, DNS:*.google.com.vn, DNS:*.google.de, DNS:*.google.es, DNS:*.google.fr, DNS:*.google.hu, DNS:*.google.it, DNS:*.google.nl, DNS:*.google.pl, DNS:*.google.pt, DNS:*.googleadapis.com, DNS:*.googleapis.cn, DNS:*.googlecommerce.com, DNS:*.googlevideo.com, DNS:*.gstatic.cn, DNS:*.gstatic.com, DNS:*.gstaticcnapps.cn, DNS:*.gvt1.com, DNS:*.gvt2.com, DNS:*.metric.gstatic.com, DNS:*.urchin.com, DNS:*.url.google.com, DNS:*.youtube-nocookie.com, DNS:*.youtube.com, DNS:*.youtubeeducation.com, DNS:*.yt.be, DNS:*.ytimg.com, DNS:android.clients.google.com, DNS:android.com, DNS:developer.android.google.cn, DNS:developers.android.google.cn, DNS:g.co, DNS:goo.gl, DNS:google-analytics.com, DNS:google.com, DNS:googlecommerce.com, DNS:source.android.google.cn, DNS:urchin.com, DNS:www.goo.gl, DNS:youtu.be, DNS:youtube.com, DNS:youtubeeducation.com, DNS:yt.be",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Aug  7 18:31:57 2018 GMT",
    "notAfter": "Oct 16 18:28:00 2018 GMT",
    "expired": false,
    "serialNo": "2E:D1:A7:71:10:1B:4C:E8",
    "keyUsage": "Digital Signature critical",
    "extKeyUsage": "TLS Web Server Authentication",
    "publicKeyAlg": "ECC prime256v1",
    "publicKeySize": 256,
    "basicConstraints": "CA:FALSE critical",
    "subjectKeyIdentifier": "8E:12:3E:B2:05:91:A7:C1:D7:EC:D8:86:60:46:1C:63:27:6F:91:91",
    "sha1Fingerprint": "76:FB:50:5F:7C:81:7D:89:6B:42:14:24:43:DE:86:E7:3C:D9:85:5F"
  },  {
    "version": 3,
    "subject": "CN=Google Internet Authority G3; O=Google Trust Services; C=US",
    "issuer": "CN=GlobalSign; O=GlobalSign; OU=GlobalSign Root CA - R2",
    "subjectCN": "Google Internet Authority G3",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Jun 15 00:00:42 2017 GMT",
    "notAfter": "Dec 15 00:00:42 2021 GMT",
    "expired": false,
    "serialNo": "01:E3:A9:30:1C:FC:72:06:38:3F:9A:53:1D",
    "keyUsage": "Digital Signature, Certificate Sign, CRL Sign critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:TRUE, pathlen:0 critical",
    "subjectKeyIdentifier": "77:C2:B8:50:9A:67:76:76:B1:2D:C2:86:D0:83:A0:7E:A6:7E:BA:4B",
    "sha1Fingerprint": "EE:AC:BD:0C:B4:52:81:95:77:91:1E:1E:62:03:DB:26:2F:84:A3:18"
  } ]
}
{
  "host": "google.com.br",
  "ip": "216.58.206.131",
  "port": 443,
  "cipher": "ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AESGCM(128) Mac=AEAD",
  "tempPublicKeyAlg": "X25519",
  "tempPublicKeySize": 253,
  "secureRenego": true,
  "compression": "NONE",
  "expansion": "NONE",
  "sessionLifetimeHint": 100800,
  "x509ChainDepth": 2,
  "verifyCertResult": true,
  "verifyHostResult": true,
  "ocspStapled": false,
  "certificateChain": [
  {
    "version": 3,
    "subject": "CN=*.google.com.br; O=Google LLC; L=Mountain View; ST=California; C=US",
    "issuer": "CN=Google Internet Authority G3; O=Google Trust Services; C=US",
    "subjectCN": "*.google.com.br",
    "subjectAltName": "DNS:*.google.com.br, DNS:google.com.br",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Aug  7 18:34:03 2018 GMT",
    "notAfter": "Oct 16 18:28:00 2018 GMT",
    "expired": false,
    "serialNo": "6C:06:79:1B:AA:6D:C2:2C",
    "extKeyUsage": "TLS Web Server Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:FALSE critical",
    "subjectKeyIdentifier": "51:8F:79:25:5E:29:C7:2A:6C:D8:B6:10:9B:8C:A6:D6:76:01:C8:AB",
    "sha1Fingerprint": "31:E3:E4:77:75:81:2C:F1:53:0C:4A:85:75:DC:6D:85:0A:04:FB:25"
  },  {
    "version": 3,
    "subject": "CN=Google Internet Authority G3; O=Google Trust Services; C=US",
    "issuer": "CN=GlobalSign; O=GlobalSign; OU=GlobalSign Root CA - R2",
    "subjectCN": "Google Internet Authority G3",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Jun 15 00:00:42 2017 GMT",
    "notAfter": "Dec 15 00:00:42 2021 GMT",
    "expired": false,
    "serialNo": "01:E3:A9:30:1C:FC:72:06:38:3F:9A:53:1D",
    "keyUsage": "Digital Signature, Certificate Sign, CRL Sign critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:TRUE, pathlen:0 critical",
    "subjectKeyIdentifier": "77:C2:B8:50:9A:67:76:76:B1:2D:C2:86:D0:83:A0:7E:A6:7E:BA:4B",
    "sha1Fingerprint": "EE:AC:BD:0C:B4:52:81:95:77:91:1E:1E:62:03:DB:26:2F:84:A3:18"
  } ]
}
{
  "host": "twitter.com",
  "ip": "104.244.42.193",
  "port": 443,
  "cipher": "ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 Kx=ECDH     Au=RSA  Enc=AESGCM(128) Mac=AEAD",
  "tempPublicKeyAlg": "ECDH prime256v1",
  "tempPublicKeySize": 256,
  "secureRenego": true,
  "compression": "NONE",
  "expansion": "NONE",
  "sessionLifetimeHint": 129600,
  "x509ChainDepth": 2,
  "verifyCertResult": true,
  "verifyHostResult": true,
  "ocspStapled": false,
  "certificateChain": [
  {
    "version": 3,
    "subject": "CN=twitter.com; OU=tsa_f Point of Presence; O=Twitter, Inc.; L=San Francisco; ST=California; C=US; postalCode=94103; street=1355 Market St; street=Suite 900; serialNumber=4337446; jurisdictionST=Delaware; jurisdictionC=US; businessCategory=Private Organization",
    "issuer": "CN=DigiCert SHA2 Extended Validation Server CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "subjectCN": "twitter.com",
    "subjectAltName": "DNS:twitter.com, DNS:www.twitter.com",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Jan 12 00:00:00 2017 GMT",
    "notAfter": "Jan 17 12:00:00 2019 GMT",
    "expired": false,
    "serialNo": "0F:C7:47:E9:78:EA:D0:8F:D9:B9:E5:7F:5F:5B:A9:1E",
    "keyUsage": "Digital Signature, Key Encipherment critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:FALSE critical",
    "subjectKeyIdentifier": "53:4B:48:D5:47:3D:D5:A0:97:C8:6B:48:C2:02:9F:94:C4:D5:E9:02",
    "sha1Fingerprint": "D1:D1:93:3E:21:98:81:20:2F:69:FA:FC:A8:98:BC:EB:3C:61:20:39"
  },  {
    "version": 3,
    "subject": "CN=DigiCert SHA2 Extended Validation Server CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "issuer": "CN=DigiCert High Assurance EV Root CA; OU=www.digicert.com; O=DigiCert Inc; C=US",
    "subjectCN": "DigiCert SHA2 Extended Validation Server CA",
    "signatureAlg": "sha256WithRSAEncryption",
    "notBefore": "Oct 22 12:00:00 2013 GMT",
    "notAfter": "Oct 22 12:00:00 2028 GMT",
    "expired": false,
    "serialNo": "0C:79:A9:44:B0:8C:11:95:20:92:61:5F:E2:6B:1D:83",
    "keyUsage": "Digital Signature, Certificate Sign, CRL Sign critical",
    "extKeyUsage": "TLS Web Server Authentication, TLS Web Client Authentication",
    "publicKeyAlg": "RSA",
    "publicKeySize": 2048,
    "basicConstraints": "CA:TRUE, pathlen:0 critical",
    "subjectKeyIdentifier": "3D:D3:50:A5:D6:A0:AD:EE:F3:4A:60:0A:65:D3:21:D4:F8:F8:D6:0F",
    "sha1Fingerprint": "7E:2F:3A:4F:8F:E8:FA:8A:57:30:AE:CA:02:96:96:63:7E:98:6F:3F"
  } ]
}

ealashwali commented 6 years ago

Is there a safe, automatic way to remove the "certificateChain" element and its array from the JSON file? I think some certificates are causing parsers to break, and I do not need this element.

prbinu commented 6 years ago

Here is one sample:

echo -e "yahoo.com\ngoogle.com\nfacebook.com" | ./tls-scan --cacert ../etc/tls-scan/ca-bundle.crt 2>/dev/null | jq -r '[.host, .ip] | @tsv'
yahoo.com   98.138.219.231
google.com  172.217.0.46
facebook.com    157.240.22.35

BTW, when you scan a large number of servers, avoid the --pretty flag.

prbinu commented 6 years ago

If you don't want the certificateChain output, you need to comment out the code between the following lines (inclusive) and rebuild: https://github.com/prbinu/tls-scan/blob/master/cert-parser.c#L681 https://github.com/prbinu/tls-scan/blob/master/cert-parser.c#L798
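For output that has already been written, the field can also be dropped after the fact instead of rebuilding the scanner. The following is a minimal sketch, not part of tls-scan: it assumes the scan was run without --pretty, so that each line holds exactly one JSON object, and the helper name `strip_field` is made up for illustration.

```python
import json
import sys
from typing import Iterable, Iterator


def strip_field(lines: Iterable[str], field: str = "certificateChain") -> Iterator[str]:
    """Yield each JSON record re-serialized without `field`.

    Assumes one complete JSON object per input line (line-based output).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        record.pop(field, None)  # no error if the field is absent
        yield json.dumps(record)


if __name__ == "__main__":
    # Hypothetical usage: python strip_field.py < out.json > out-nochain.json
    for stripped in strip_field(sys.stdin):
        print(stripped)
```

Because records are handled one line at a time, memory use stays roughly constant regardless of the file size.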

ealashwali commented 6 years ago

Thanks. I appreciate your responses. For the output,

yahoo.com   98.138.219.231
google.com  172.217.0.46
facebook.com    157.240.22.35

How can I make it comma-separated, e.g. yahoo.com,98.138.219.231, without any spaces? Also, what about the scan I already have (the JSON file)? What do you suggest? Will cat out.json | jq -r '[.host, .ip] | @tsv' do the job? I do not think so, though, because the output file contains many objects.

prbinu commented 6 years ago

$ echo -e "yahoo.com\ngoogle.com\nfacebook.com" | ./tls-scan --cacert ../etc/tls-scan/ca-bundle.crt 2>/dev/null | jq -r '[.host, .ip] | @csv'
"yahoo.com","98.137.246.7"
"google.com","172.217.0.46"
"facebook.com","157.240.22.35"

If you need more customization, check out the jq manual. You can even use a simple tool such as grep to extract what you want. It seems you scanned with the --pretty option enabled. The --pretty flag is for human consumption of the output; if you want the output to be processed by another program, you should use the line-based output instead.
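As an illustration of processing the line-based output with another program, here is a small Python sketch that produces the unquoted host,ip form asked for earlier in the thread. It assumes one JSON object per line; the function name `write_host_ip` is hypothetical. Within jq itself, using `join(",")` in place of `@csv` should give the same unquoted form.

```python
import csv
import json
import sys
from typing import IO, Iterable


def write_host_ip(lines: Iterable[str], out: IO[str]) -> None:
    """Write one `host,ip` row per JSON record.

    Assumes line-based output: one complete JSON object per line.
    """
    # The default QUOTE_MINIMAL only quotes values that contain a comma,
    # a quote character or a newline, so plain hostnames stay unquoted.
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        writer.writerow([record.get("host", ""), record.get("ip", "")])


if __name__ == "__main__":
    # Hypothetical usage: python host_ip.py < out.json > host_ip.csv
    write_host_ip(sys.stdin, sys.stdout)
```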

ealashwali commented 6 years ago

True. I faced parsing problems due to weird certificate fields. I also discovered that I had added a comma after each } to form an array of objects, but I should not have added one after the last }, which I think is what caused jq to fail.
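For an existing pretty-printed file, wrapping the objects in an array is not strictly necessary. The sketch below is an illustration only, with a made-up function name `iter_json_objects`: it reads the unmodified scan output (objects simply following one another, with no commas added) in chunks and decodes one object at a time with the standard library's `json.JSONDecoder.raw_decode`, so the whole 2.4 GB file never has to sit in memory. It assumes every top-level value is a JSON object, as in the sample above.

```python
import io
import json
from typing import IO, Iterator


def iter_json_objects(stream: IO[str], chunk_size: int = 1 << 20) -> Iterator[dict]:
    """Yield top-level JSON objects from a stream of concatenated objects."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        pos = 0
        while True:
            # raw_decode does not skip leading whitespace, so do it here.
            while pos < len(buffer) and buffer[pos].isspace():
                pos += 1
            if pos >= len(buffer):
                break
            try:
                obj, pos = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if not chunk:
                    raise  # end of input reached: the data is really malformed
                break  # object is probably incomplete; read another chunk
            yield obj
        buffer = buffer[pos:]  # keep only the undecoded tail
        if not chunk:
            break


if __name__ == "__main__":
    sample = io.StringIO('{\n  "host": "a.example",\n  "ip": "192.0.2.1"\n}\n'
                         '{\n  "host": "b.example",\n  "ip": "192.0.2.2"\n}\n')
    for rec in iter_json_objects(sample, chunk_size=16):
        print(f'{rec["host"]},{rec["ip"]}')
```

A record that fails to decode at the end of input raises `json.JSONDecodeError`, whose position can help locate a problematic certificate field.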

prbinu commented 6 years ago

It would be helpful if you could share those weird certificate fields so that I can fix those bugs.

prbinu commented 5 years ago

@ealashwali Do you need any help with this issue? If not, please close it.