18F / domain-scan

A lightweight pipeline, run locally or in Lambda, for scanning things like HTTPS, third-party service use, and web accessibility.

Adopt some of the clean-up from dhs-ncats/gatherer #288

Open jsf9k opened 5 years ago

jsf9k commented 5 years ago

In dhs-ncats/gatherer we do some cleaning up of the domains returned by this project's gather pipeline. (For example, we remove \r characters that appear in the middle of host names and we condense repeated . characters into a single one.) Does it make sense to do some of those things here instead?
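As a rough illustration of the two clean-up steps mentioned above, here is a minimal sketch. The function name `clean_hostname` is hypothetical and is not taken from either domain-scan or gatherer; the actual gatherer code may handle these cases differently.

```python
import re


def clean_hostname(raw):
    """Hypothetical helper: not the actual gatherer or domain-scan code."""
    # Drop carriage returns that appear anywhere in the host name.
    cleaned = raw.replace("\r", "")
    # Condense any run of two or more dots into a single dot.
    cleaned = re.sub(r"\.{2,}", ".", cleaned)
    return cleaned


print(clean_hostname("exam\rple..gov"))
```

Both steps are idempotent, so running the helper on an already clean host name leaves it unchanged.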

konklone commented 5 years ago

Yes, I think it would totally make sense to do that here, since these all represent generic errors in hostname representation.