rapid7 / recog

Pattern recognition for hosts, services, and content

Update standard identifiers #303

Closed tsellers-r7 closed 3 years ago

tsellers-r7 commented 3 years ago

Description

The goal of this PR is to reset the standard identifiers after the change in PR #302. Quite a few of the changes here stem from updates that were made while bin/recog_standardize wasn't working as expected. There is also cleanup of records that appear to be from the original PR.

NOTE: Most consumers of Recog will only care about the tweaks to the fingerprint files that standardized device types, vendor names, etc., which are toward the end of the diff.

The process to perform the update was to:

  1. Remove the contents of all .txt files in /identifiers
  2. Run the following to work through each database file sequentially: for db in xml/*.xml; do ruby bin/recog_standardize $db -w ; done
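
For anyone who wants to repeat the reset, the two steps above can be sketched in Ruby. This is only an illustration, not code from the PR: the helper names reset_identifiers and standardize_commands are hypothetical, and the directory arguments assume you are in the root of a Recog checkout. The sketch builds the command lines but does not run them.

```ruby
# Hypothetical helper (not part of Recog): empty every .txt file in the
# given identifiers directory, keeping the files themselves (step 1).
def reset_identifiers(dir)
  Dir.glob(File.join(dir, "*.txt")).sort.each do |path|
    File.truncate(path, 0)
  end
end

# Hypothetical helper (not part of Recog): build one bin/recog_standardize
# invocation per XML database, in sorted order (step 2). Returns argv
# arrays only; nothing is executed here.
def standardize_commands(xml_dir)
  Dir.glob(File.join(xml_dir, "*.xml")).sort.map do |db|
    ["ruby", "bin/recog_standardize", db, "-w"]
  end
end

# Assumed usage from the repository root:
#   reset_identifiers("identifiers")
#   standardize_commands("xml").each { |cmd| system(*cmd) }
```

Returning the commands as argv arrays keeps file names with spaces intact when they are later passed to system, which the unquoted $db in the shell loop would not.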

CHANGES OF CONCERN

  1. Standardization on Title Case for os.device and hw.device, since most values already used it.
  2. All instances of Web cam or Web Cam in *.device fields are now IP Camera.
  3. Removal of software_class.txt, software_family.txt, and software_product.txt in the identifiers directory since this codebase doesn't use these files at all.
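
To make the first two changes concrete, here is a minimal Ruby sketch of the normalization rule. It is not how the PR was produced: the fingerprint files themselves were edited, and DEVICE_ALIASES and normalize_device are hypothetical names used only for illustration.

```ruby
# Hypothetical alias table (not part of Recog), limited to the rename
# described in the list above. Keys are lowercase for lookup.
DEVICE_ALIASES = {
  "web cam" => "IP Camera"
}.freeze

# Hypothetical helper: map a *.device value to its standardized form.
# Known aliases are replaced outright. Otherwise the first letter of each
# word is upcased and the rest is left alone, so acronyms such as "IP"
# are preserved.
def normalize_device(value)
  cleaned = value.strip
  replacement = DEVICE_ALIASES[cleaned.downcase]
  return replacement if replacement

  cleaned.split(/\s+/).map { |word| word[0].upcase + word[1..-1] }.join(" ")
end
```

Under these assumptions both "Web cam" and "Web Cam" map to "IP Camera", and a lowercase value such as "print server" becomes "Print Server".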

How Has This Been Tested?

Local testing and review, plus rspec.

tsellers-r7 commented 3 years ago

CCing @hdm since you're one of the primary users/contributors for most of these files, the changed *.device fields, and Recog in general.

hdm commented 3 years ago

No issues on this side, I was more worried about changes to this affecting the Nexpose folks. If they are good to go, even better =D

hdm commented 3 years ago

"Web cam" -> "Web Cam" is something we do internally in Rumble today, but we left the original variant in place because that was how Nexpose tracked things. That whole category should probably be "IP Camera" these days.

tsellers-r7 commented 3 years ago

I think this is a great idea, particularly since we already use IP Camera elsewhere (and I missed it). I'll reach out to a few folks internally.

tsellers-r7 commented 3 years ago

I have switched all instances of Web Cam in *.device to IP Camera.

tsellers-r7 commented 3 years ago

The single instance of POS as a device type has been changed to Point of Sale. This brings the number of Point of Sale fingerprints to ... 2.

tsellers-r7 commented 3 years ago

@hdm - In this PR I deleted the identifiers/software_*.txt files since they weren't used by bin/recog_standardize or Recog in general. In the case of software_class.txt, I wasn't sure where you might have pulled the data from, unless it was from non-Recog data such as Rumble.

Did you have a vision for how these should be used? Would you like to see them returned? If so, how would you generate them?

hdm commented 3 years ago

Thanks for the note! I think these came from the original email from the Nexpose team that started this.

hdm commented 3 years ago

We can probably omit from the standardize code and identifier list, but the source of those was Rapid7 originally.

tsellers-r7 commented 3 years ago

Thanks @hdm

"We can probably omit from the standardize code and identifier list.."

AFAIK, there wasn't any code in bin/recog_standardize that dealt with or generated the software_*.txt files. I removed them because I assumed they were just artifacts left over from the development process.

"I think these came from the original email from the Nexpose team that started this."

Do you know if I was on that email thread, or on any PRs that contain them? Also, we can take this to another medium if it's easier for you.

hdm commented 3 years ago

No PR, but I forwarded the thread by email. Thanks!