What is a "Host"? - Githubissues

Rhynorater commented 6 years ago

Expected Behavior

ReconJSON is expected to provide a data standard that accommodates all different types of recon. Recon is designed around scope. Scope can be defined in many different ways: single ips, ip ranges, wildcard domains, and specific subdomains. As a result, we need a format that will accommodate those standards and their individual definitions of a host.

Current Behavior

The current behaviour doesn't define what identifies a "unique host." As a result, we can run into issues based off of the different types of scope mentioned above. For example, a wildcard domain scope might say that "example.acme.com" is in scope. However, "example.acme.com" resolves to 54.0.0.1 AND 54.0.0.2. As a result, we have 2 "physical" systems that resolve to one host in ReconJSON format. This could result in conflicts if 54.0.0.1 has port 22 open and 54.0.0.2 has port 22 filtered.

However, on the flip side, if we define a host as an IP address, then we can run into issues where we get duplicates. Consider the above scenario with example.acme.com resolving to 54.0.0.1 and 54.0.0.2. If we define the IP address as the unique identifier, our dataset will look like this:

{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1"}
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.2"}

Possible Solution

There are several possible solutions that I can conceive:

We define the unique identifier for a host on the first line of the file and leave it up to the parser to resolve. Our file would then look like this:

{{"type":metadata", "version":"1.0.0", "hostIdentifier":"subdomain"},
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1", "dns":{"A":["54.0.0.1", "54.0.0.2"]}}
}

OR

{{"type":metadata", "version":"1.0.0", "hostIdentifier":"ip"},
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1"},
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.2"}
}

This approach solves the issue, but makes the file more difficult to parse and merge into ones own tool. This also hurts the uniformity of the standard.

This can be left up to the user to decide based off of the tool they are using. Consider the following applications:

Portscanner If the portscanner is passed in an object looking like this:
```
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1", "dns":{"A":["54.0.0.1", "54.0.0.2"]}}
```
What should we expect it to do? I would expect that it would take the "ip" field and scan that and return the results. However, what if it is passed this from the results of a Subdomain enumeration with no dns resolve:
```
{"type":"Host", "subdomain":"example.acme.com"}
```
Well, I would expect it to resolve the subdomain and return something like this:
```
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1", "ports":{...}}
```
Or perhaps even without the subdomain (as nmap does):
```
{"type":"Host","ip":"54.0.0.1", "ports":{...}}
```
In these scenarios we know that the port scanner is focusing on the IP address and the parser will need to tell as much.
Subdomain enumeration Say you give a subdomain enumerator the domain "acme.com." The following will be outputted (if no resolve feature is turned on).
```
{"type":"Host", "subdomain":"example.acme.com"}
{"type":"Host", "subdomain":"example2.acme.com"}
```
and if a resolve feature is turned on, we might see something like this:
```
{"type":"Host", "subdomain":"example.acme.com", "ip":"54.0.0.1" }
{"type":"Host", "subdomain":"example2.acme.com", "ip":"54.0.0.1"}
```
We can see in this scenario that the subdomain enumeration tool considers these 2 different host because the tool is focused on Subdomain enumeration.

In this case, the user (or parser) would be responsible for merging this data together into the format that is most reasonable for their usecase.

Ice3man543 commented 6 years ago

The second idea is a good one. Let the tool or the parser decide what is a host for it.

c0rv4x commented 6 years ago

The second idea looks much better. I faced a similar issue when building a tool and decided to separate ips from hosts.

It is mostly about storing data, but may be this would be useful. I have defined a host equal to hostname. So the idea is the following:

Ports are a property of an ip address, not the host
Hosts-IPs is a many-to-many relationship, so i would probably make it a totally different entity, or each host can have its ips and each ip can have its hosts. The latter is space consuming but you win some speed

michenriksen commented 6 years ago

Yes, the second idea sounds good to me too. Some tools will provide IP information on hosts while others will provide domain names. All the different tools can load each others' reconjson files and further populate it with more data instead of creating their own.

0xdevalias commented 5 years ago

Agree with @iad42's comments above. In a system I have been working on, I have apex (eg. example.com), host (eg. foo.example.com) which links to an apex, ip (eg. 1.2.3.4), etc as their own entities. A host will always have a link to an apex. An ip can have links to host records. ports are linked to an ip. Etc. It maps itself nicely into a graph layout, which could then be translated back to a 'flatter' json layout.

ReconJSON / ReconJSON

What is a "Host"? #8

Expected Behavior

Current Behavior

Possible Solution