gchq / CyberChef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
https://gchq.github.io/CyberChef
Apache License 2.0
29.3k stars 3.28k forks

Bug report: Extract domain function ignores domains containing underscores #1889

Open nwCDDO opened 2 months ago

nwCDDO commented 2 months ago

Describe the bug
When extracting domains from text, the Extract domains function ignores domain names containing underscores. Underscores are valid characters in domain names and are used quite often (for example, in DMARC records).

To Reproduce
Steps to reproduce the behaviour:

  1. Add the Extract domains function to the recipe
  2. Paste data containing domain names with underscores in the Input box
  3. Click Bake!
  4. Domain names containing underscores are excluded from the Output

Expected behaviour
Domain names containing underscores should be included in the Output, whether the underscore is at the start or in the middle of the FQDN.

Example Input
  urn:h:domain:sipdir.online.lync.com rrType SRV category DNS revision 13 rrDomain lewes-tc.gov.uk causeDomain sipdir.online.lync.com danglingType nxdomain causeDomainOther sipdir.online.lync.com rrEffectiveDomain _sip._tls.lewes-tc.gov.uk

Expected Output
  sipdir.online.lync.com
  lewes-tc.gov.uk
  sipdir.online.lync.com
  sipdir.online.lync.com
  _sip._tls.lewes-tc.gov.uk  <- this does not appear in the Output
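
For anyone triaging this, the difference can be sketched outside CyberChef. The Python snippet below is only an illustration under stated assumptions: both patterns and the names `STRICT`, `LENIENT`, `build` and `extract` are hypothetical and are not CyberChef's actual regular expression or code. It contrasts a pattern whose labels allow only letters, digits and hyphens with one that also allows underscores, using the Example Input above.

```python
import re

# Hypothetical label patterns for illustration; NOT CyberChef's actual regex.
LDH_LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"            # letters, digits, hyphens
UNDERSCORE_LABEL = r"[a-z0-9_](?:[a-z0-9_-]*[a-z0-9_])?"  # additionally allows "_"


def build(label: str) -> re.Pattern:
    """Build a domain matcher: one or more dotted labels followed by an alphabetic TLD."""
    return re.compile(
        rf"(?<![a-z0-9_.-])(?:{label}\.)+[a-z]{{2,}}(?![a-z0-9_-])",
        re.IGNORECASE,
    )


STRICT = build(LDH_LABEL)          # rejects names that contain underscores
LENIENT = build(UNDERSCORE_LABEL)  # accepts underscores at the start or middle of a label


def extract(pattern: re.Pattern, text: str) -> list[str]:
    """Return every match of the pattern in the text, in order of appearance."""
    return pattern.findall(text)


sample = (
    "urn:h:domain:sipdir.online.lync.com rrType SRV category DNS revision 13 "
    "rrDomain lewes-tc.gov.uk causeDomain sipdir.online.lync.com "
    "danglingType nxdomain causeDomainOther sipdir.online.lync.com "
    "rrEffectiveDomain _sip._tls.lewes-tc.gov.uk"
)

if __name__ == "__main__":
    print(extract(STRICT, sample))
    print(extract(LENIENT, sample))
```

With these assumed patterns, the strict matcher returns the four names without underscores and skips `_sip._tls.lewes-tc.gov.uk`, while the lenient matcher returns all five names listed under Expected Output. Whether underscores should be accepted by default or behind an option is a design choice for the maintainers.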
