Donnype commented 1 year ago

Part 1:

From the mock report

This is fixed by:

Part 2:

What is a single system?

A system is currently defined only implicitly, in the systems report.

[!IMPORTANT] The code is ambiguous in how it names systems and services. One system can provide multiple services, such as Web, Mail, and Dicom. We currently call the latter the system type. Hence a system that is a Web System can also be a Mail System.

Current situation

The current implementation aggregates the following information from Octopoes to create a system:

Information used in a System

classDiagram
direction RL
class IP
class IPPort
class IPService
class Service
class Website
class SoftwareInstance
class Hostname
class Software

IP --> ResolvedHostname
ResolvedHostname --> Hostname
IPPort --> IP
IPService --> IPPort
IPService --> Service
SoftwareInstance --> IPPort
SoftwareInstance --> Software

Website --> IPService
Website --> Hostname

Example of resulting definition of a System

classDiagram
direction RL
class System

System: ip = "0.0.0.0"
System: hostnames = ["example.com", "b.example.com"]
System: services = ["Web", "Mail"]
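
To make the example above concrete, here is a minimal sketch of how such a System could be aggregated, one per IP. The `rows` input is a hypothetical flattened query result, not the actual Octopoes output, and `System` and `aggregate_systems` are illustrative names only.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """One system as in the example above: a single IP plus what was found on it."""

    ip: str
    hostnames: set[str] = field(default_factory=set)
    services: set[str] = field(default_factory=set)


def aggregate_systems(rows):
    """Group (ip, hostname, system_type) rows into one System per IP.

    `rows` is an assumed, simplified input shape used for illustration.
    """
    systems: dict[str, System] = {}
    for ip, hostname, system_type in rows:
        system = systems.setdefault(ip, System(ip=ip))
        system.hostnames.add(hostname)
        system.services.add(system_type)
    return systems


rows = [
    ("0.0.0.0", "example.com", "Web"),
    ("0.0.0.0", "b.example.com", "Mail"),
]
systems = aggregate_systems(rows)
```

Because the IP is the key, a system in this shape can never span more than one IP address.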

Current issues

It is not possible for a system to have multiple IPs.
The current setup is not properly handling IPv6 (see #2184 and #2187).
We rely heavily on on-the-fly queries to define and categorize these systems.

Possible solutions

There are a few things to fix:

Redefine a system: if the definition ends up being some variation of "IPs, hostnames and the related system types", we should think about how we want to collect that information in a data structure and whether we want to store a separate entity in XTDB for it
Fix IPv6 handling
Do the system type categorization upfront with bits instead of on the fly (see below)

The latter could be tackled in two ways.

1. Introducing objects to pre-annotate a lot of steps

We could leverage Bits to infer the system types from the services and software, which is what the Website object in fact is. This would mean bits such as website_discovery that take in a ResolvedHostname and IPService to add objects such as NameServer and MailServer. Here the issue might be that we do not want to depend on the hostname, so these could also only take in an IPService.
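
As a rough sketch of the inference such a bit would perform, assuming it only receives service names: the mapping below is hypothetical and incomplete, and this is plain Python rather than the actual bits API.

```python
# Hypothetical mapping from nmap-style service names to system types.
# The real mapping would have to be agreed on; these entries are examples only.
SERVICE_TO_SYSTEM_TYPE = {
    "http": "Web",
    "smtp": "Mail",
    "imap": "Mail",
    "pop3": "Mail",
    "domain": "DNS",
}


def infer_system_types(service_names):
    """Return the set of system types implied by the given service names."""
    types = set()
    for name in service_names:
        system_type = SERVICE_TO_SYSTEM_TYPE.get(name.lower())
        if system_type is not None:
            types.add(system_type)
    return types


system_types = infer_system_types(["http", "smtp", "ssh"])
```

Services without a mapping (here `ssh`) simply contribute no system type, so an IP with only unmapped services would not become a system under this approach.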

Furthermore, we could consider a new "System" object that aggregates Websites, MailServers and NameServers. Or perhaps just aggregate ips and query the Websites etc.?

Sketch if we introduce a System object (WIP, dashed line meaning optional)

classDiagram
direction RL

IP --> ResolvedHostname
ResolvedHostname --> Hostname
IPPort --> IP
IPService --> IPPort
IPService --> Service
SoftwareInstance --> IPPort
SoftwareInstance --> Software

Website --> IPService
Website --> Hostname

NameServer --> IPService
MailServer --> IPService

class System
System: services

class NameServer
NameServer --> IPService
NameServer ..> Hostname

class MailServer
MailServer --> IPService
MailServer ..> Hostname

System --|> Website
System --|> Website
System --|> NameServer
System --|> NameServer
System --|> MailServer
System --|> MailServer

2. Leverage current structure and make it more consistent

We are re-introducing the concept of a service here a bit: we now focus on categorizing a service and do some additional queries to enhance the result set. For instance, we query the IPService object to let nmap find mail servers, but use the Website object to determine whether we are dealing with a web service.

If we update the Service model a bit by introducing a type field, we can then introduce bits that infer the service type from a Service, and write bits that add (IP) Services for the edge cases where Websites and SoftwareInstances imply an IP service. UPDATE: looking at the website_discovery bit, it seems that the same service-mapping logic is actually applied there. So if we want more consistency here, we would redefine a Website as the result of querying objects in this path with a filter on the Service.type:

classDiagram
direction RL

ResolvedHostname --> Hostname
IP <-- ResolvedHostname
IPPort --> IP
IPService --> IPPort
Service --> IPService

Service: name
Service: type (optional)

And the Mail, Name and other services would be defined accordingly. A bit would infer the system types.

We could then still add a definition of a system by either aggregating ip addresses or ip services.
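
A small sketch of what option 2 could look like in code, assuming a `type` field on Service as described above. The classes here are simplified stand-ins for illustration, not the real Octopoes models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    name: str
    # Assumed new field: stays None until a bit has categorized the service.
    type: Optional[str] = None


@dataclass(frozen=True)
class IPService:
    ip: str
    port: int
    service: Service


def ip_services_of_type(ip_services, service_type):
    """Filter IP services on Service.type, e.g. "Web" to get the website candidates."""
    return [s for s in ip_services if s.service.type == service_type]


found = [
    IPService("192.0.2.1", 443, Service("http", "Web")),
    IPService("192.0.2.1", 25, Service("smtp", "Mail")),
    IPService("192.0.2.1", 22, Service("ssh")),
]
web = ip_services_of_type(found, "Web")
```

With this shape, Web, Mail and Name services are all the same query with a different filter value.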

originalsouth commented 1 year ago

List of nmap supported services (service names):

Collapsed for brevity

``` 1c-server 3cx-tunnel 3m-sip 4d-server aastra-pbx abc acap acarsd access-remote-pc achat acmp acpc acti-control activefax activesync adabas adabas-d adb adobe-crossdomain afarianotify afbackup afp afs afs3-fileserver afsmain airdroid AirHID airmedia-audio airport-admin airserv-ng ajp12 ajp13 aleph allseeingeye altiris-agent amanda am-pdp amqp amx-icsp AndroMouse antivir anynet-sna anyremote apachemq apcupsd aperio-aaf aplus apollo-server app appguard-db appleevents apple-iphoto apple-sasl arcserve arcserve-gdd argus arkeia arkstats articy-server artsd arucer as2 asf-rmcp aspi as-servermap as-signon as-sts asterisk asterisk-proxy asus-nfc asus-transfer atalla athinfod audioworks audit authpoint automate autonomic-mrad autosys avaya-aom avg avk backdoor BackOrifice backupexec-remote bacnet bacula-fd bacula-sd bandwidth-test banner-ivu barracuda-bcp barracuda-dcagent bas basestation beep beidpcscd bentley-projectwise bf2rcon bgp biff bigant bindshell bitcoin bitcoin-jsonrpc bitdefender-ctrl bitkeeper bittorrent bittorrent-tracker bittorrent-udp-tracker bittorrent-utp bluecoat-logd bmc-perform-service bmc-tmart bnetd boinc bprd brassmonkey brio bro bru bruker-axs buildservice burk-autopilot byond bzfs bzr caicci caigos-conductus caigos-conspectus caigos-fundus caigos-pactor caigos-paratus caldav calibre-json ca-mq cassandra-native castv2 ca-unicenter cccam ccirmtd ccnet cddbp ceph chargen chat chat-ctrl check_mk chess chilliworx cirrato-client cisco7200sim cisco-lm ciscopsdm cisco-sla-responder cisco-smartinstall citadel citrix-ica citrix-ima citrix-licensing citynet clam clementine clementine-remote clickhouse clsbd cmae cmrcservice code42-messaging codeforge commvault complex-link computone-intelliserver compuware-lm concertosendlog concertotimesync conference conserver consul control-gc-ports control-m cops couchbase-tap cpu crestron-control crestron-ctp crestron-xsig crossfire crossmatchverifier cryptonote cso csta csync cvspserver cvsup cyrus-sync daap 
damewaremr darkcomet datamaxdb daytime dbsnmp dcc dec-notes decomsrv desktop-central devonthink dgld diablo2 dict digifort digifort-analytics digifort-lpr digital-sprite-status digitalwatchdog digi-usb directconnect directconnect-admin directfb directupdate diskmonitor distccd dlswpn d-mp dnastar dnet-keyproxy dnsix docbroker docker docker-swarm doka5 domain domaintime dominoconsole donkey dps-shell drac-console dragon drawpile drb drda drobo-dsvc drobo-nasd drweb dslcpe dsr-video dtls durian dusk dvr-video dynast-solver echo echolink econtagt ecopy ed2klink efi-webtools efi-workstation eftserv eggdrop egosecure-xmlrpc elasticsearch electra elm-agent elm-manager emc-datadomain emco-remote-screenshot emc-pp-mgmtsvc encase enemyterritory enistic-manager envisalink epmd epoptes-client epp ericom ericssontimestep erlang-node eth-jsonrpc etrayz-setup ets2 eve-online exacqvision exalead exec exportfs extron-serial fastcgi fastobjects-db fcgiwrap fcp fcpv2 fhem fiesta-online filemaker-xdbc filenet-pch file-replication filezilla finger firebird firewall flashconnect flexlm fms-core font-service foolscap fortinet-sso freeciv freedoko freelancer freenet freeswitch-event freevcs frozen-bubble fsae fsd ftp ftp-proxy fw1-log fw1-pslogon fw1-rlogin fw1-secureremote fw1-topology fyre g15daemon g6-remote gadu galaxy gamebots gamebots-control ganglia gcs-clientgw g-data-sec gd-comm gearman genetec-5400 genetec-5500 genetec-directory geovision-audio geovision-control geovision-mobile giop gipc git git-daemon gkrellm gms gnatbox gnats gntp gnupg gnuserv gnutella goldengate goldsync go-login gopher gopher-proxy gopro-json goverlan gpsd gpsd-ng greenplum groupwise guildwars2-heartbeat h2 h.239 h2-pg H.323-gatekeeper H.323-gatekeeper-discovery h323q931 H.323/Q.931 hadoop-ipc halfd hama-radio hama-radio2 haproxy-stats hasp-lm hazelcast hbase hbn3 hddtemp helpdesklog hillstone-vpn hl7-mlp hnap honeypot honeywell-confd honeywell-hscodbcn honeywell-ripsd hpdss hp-gsg hpiod hp-logic-analyzer 
hp-pjl hp-problemdiagnostics hp-radia hpssd hptsvr http http-ocsp http-proxy http-proxy-ctrl hue-link hylafax ibank2 ibm-db2 ibm-hmc ibm-mqseries icabrowser icap ice icecream ichat icontrolav2 icy ident igel-remote ilo ilo-console ilo-vm imap imap-proxy imaze-game imond impress-remote imsp inetd infopark informix ingrian-xml inspircd-spanning-tree insteon-plm instrument-manager intelatrac intermapper intermec-bri intersys-cache intertel-ctl intow intranetchat iodine iota-api ipcam iperf3 ipfs ipmi-advertiserd ipmi-usb ipp ipremote ipsi ir-alerts irc ircbot irc-proxy irods irr isakmp iscp iscsi issc iss-realsecure istat isymphony-cli isymphony-client isymphony-status itach ixia ixia-unknown jabber james-admin java-cim java-message-service java-object java-rmi jboss-remoting jdbc jd-gui jdwp jenkins-listener jetadmin jetbrains-lock jetdirect jicp jmon jmond jrpgt jsonrpc jtag junoscript jute jxta kapow-robot kazaa-http kazaa-peerpoint kdb keepnote kerberos kerberos-sec keriopfgui keriopfservice keyence-pc kguard kismet kismet-drone klogin kmldonkey kshell ksystemguard kumo-manager kumo-server kvm labtech-redirector landesk landesk-rc lanforge lanrev-agent lantronix-config laserfiche lastfm lcdproc ldap ldminfod lexlm lexmark-objectstore libp2p-multistream libvirt-rpc lineage-ii linuxconf lirc lisa listserv litecoin-jsonrpc lmtp lns loadrunner-vts logevent login loginserver loglogic logpad lorex-monitor lotusnotes lscp lsf-mbd lsx lucent-fwadm maas-rpc magent mail-admin mailq maplestory mapreduce mas-financial maxdb maya mcms-command mc-nmf mdns medcart mediad meetingmaker megafillers megaraid-monitor melange memcached mentorbs mep metasploit metasploit-msgrpc metasploit-xmlrpc metatrader meterpreter microsoft-ds midas millennium millennium-ils minebuilder minecraft minecraft-classic minecraft-pe minecraft-socketapi minecraft-votifier minisql misys-loaniq mitsubishi-qj71e71 mmouse mobilemouse modbus modem mogilefs mohaa mohaa-gamespy mon monetdb monetdb-ctl moneyworks 
mongodb monop monopd monsoon moo mosmig motorola-devmgr mp-automation mpd mpich2 mqtt mrtgext-nlm msdtc mserv msexchange-logcopier msn msrpc ms-sql-m ms-sql-s ms-wbt-server ms-wbt-server-proxy mtap mu-connect mud mudnames mu-game multicraft multiplicity munin mupdate murmur musicvr mwti-rpc mxie mydoom myproxy mysql nagios-nsca nameserver napster nat-pmp nbd ncacn_http ncat-chat ncd-diag ncid ncp ndb_mgmd ndmp ndv nessus netapp-filer netasq-admin netassistant netbackup netbackup-bpdbm netbios-ns netbios-ssn netbus netdevil netman NetMotionMobility netop netprobe netradio netrek net-rpc netsaint netsoul netstat netsupport netsupport-dna netsync netusb netwareip networkaudio niagara-fox nightwatchman nim nimbud-netmon nimp niprint nje nmea-0183 nngs nnsrv nntp nntp-proxy nomachine-nx novastor-backup nping-echo nrpep nsclient nsi nsunicast ntop-http ntp ntrip nut nutcracker nuttcp nuuo-vnc nvidia-update obiee oem-agent oftp olsrd-jsoninfo olsrd-txtinfo omapi omniback omniinet omp oo-defrag openerp openflow openfpc openlookup opentable opentable-listener openttd openvpn openvpn-management opinionsquare opsec-ufp optommp oracle oracle-db-rmi oracle-mts oracle-nm oracle-tns oracle-vs ormi osiris ossec-agent osu-nms ouman-trend outpost-ctl ovhcheckout ovs-agent p4d pafserver palace palm-hotsync paloalto-agent papouch-tme parallels-server para-ups paromed partimage pathfinder-xml patrol pblocald pbmasterd pbs pbs-maui pbx-alarm pc-anywhere pcanywheredata pc-duo pc-duo-gw pcmeasure pcmiler pc-monitor pcp pcs-partner pcworx perfd pfservice pgas pgbouncer pgpool ph-addressbook pharos pigpio pi-hole-stats pioneers pioneers-meta pjlink pksd pmcd pmud policy polycom-mgc pop2 pop3 pop3-proxy pop3pw portlistener postgresql postgrey postx-reporting pot powerchute poweroff ppp pppctl pptp precomd prelude-manager printer printer-admin printer-json printeron print-monitor prisontale priv-print proconos progress pso-gate pso-login psql psql-btrieve ptcp ptp-ip pvx pwdgen pycharm pyro 
python-mp qaweb qcheck qconn qds qemu-vlan qmqp qnap-rtrr qnap-transcode qotd qsp-proxy qtopia-transfer quagga quake quake2 quake3 quake3-master quasar quest_launcher r1soft-cdp radius radmin radmind raid-mon raop rationalsoft ratnj razor2 rbnb rcon rconj rds realplayfavs realport redcarpet redis relp remoteanything remote-control RemoteMouse remote-rac remote-volume remoting renderer resin-watchdog resvc rethinkdb-client rethinkdb-intracluster retrospect rexec rfactor-monitor rfbuoy rfidquery rgpsp rhapsody rhpp riak-pbc riegl-license rifa-dvr righteous-backup ripbot riverbed-stats rlm rmate rmmd roku roku-remote rotctld routeros-api routersetup rowmote rpacd rpcapd rpcbind rpd rsa-appliance rsa-authmgr rse rsync rtdscchcch rtmp rtp rtrdb rtsp rtsp-proxy runes-of-magic s2-emerge saft samsung-sap samsung-twain sand-db sap-gui sap-its sap-logviewer saprouter sarad sassafras satstrat sauerbraten scalix-ual scanager scifinder scmbug scmm sdcomm sdlog seagull-lm securepath secure-socket securetransport sentinel-lm serial serialnumber serversettingsd service-monitor servicetags seti-proxy sftp sgms sguil shaiya sharefolder sharp-remote sharp-twain shell shivahose shoutcast shoutirc siebel siemens-logo siemens-xtrace sieve signiant silc silkroad-online sip sip-proxy slimp3 slingbox slnp slp-srvreg slurm slx sma-solar smpp smtp smtp-proxy smtp-stats smux snapmirror snmp snpp soap sobby socks4 socks5 socks-proxy softether-rpc softplc softros-im softwarepatrol soldat solfe solidworks-remotesolve solproxy sonicmq sonork sopcast sophos sourceoffice sourceviewerserver spamassassin spark spectraport speech speechd sphereicall sphinx-search spice spideroak splashtop spmd spotify-login spy-net sqlmonitor squeezecenter squeezecli srcds srun srvloc ssc-agent ssdp ssh ssl ssl/consul-rpc ssl/http ssl/imap ssl/openvas ssl/pop3 ssl/sophos ssl/steam ssl/vmware-auth sstp stageremote starbound stargazer starutil statd stingray stockfish stomp storagecraft-image stratum sumatra-ds sun-alom 
sunscreen-adm svnserve svrloc sybase-adaptive sybaseanywhere sybase-backup sybase-monitor symantec-av symantec-esm synchroedit syncplay-json syncsort-cmagent synergy synobtrfsreplicad sysinfo systat talesofpirates-gate talk tally-census tandem-print tarantool taxinav tcpmux tcpwrapped tcsd tdm t-doc-2000 teamspeak2 teamspeak-tcpquery teamtalk teamviewer telecom-misc telematics telemecanique telnet telnet-proxy teradata terraria textui tftp tgcmd thinprint thrift-binary tibia timbuktu time timeedit tina tinc tinyfw tivo-remote tmail tn3270 tng-dts togamelogin topdesk tor tor-control tor-info tor-orport tor-socks trackerlink trackmania-gbx traficon-flux transbase transferimg trasker trendnet-webcam trillian trinitycore trustwave ts3 tsd tsdns ttscp tunnel-test tunnelvision tuxedo-wsl ubiquiti-discovery uc4 ums-webviewer unicorn-ils unitrends-backup univention-json unknown unreal unreal-media upnp ups upsd uptime-agent urbackup urp usher utorrent-udp utrmcd utsessiond utsvc uucp valentinadb valve-steam varnish-cli vdr venti ventrilo vertica vhcs video vidyoroom virtualgl virtualhere visitview vizio-tv vmware-aam vmware-auth vmware-print vnc vnc-http vnetd vp3 vspe vss vtp vtun vulnserver vuze-dht vzagent warcraft watchguard wbem wcbackup weather webcache webdav webmin websense-eim websm websocket webster weprint wesnoth whois wifi-mouse wikidpad winagents-hyperconf winbox wincomm wincor-atm wingate wingate-control winlog wins winshell wms wolfssl workrave wow wrproxy ws-discovery wsman wtam wub-command wyse-devmgr X11 xamarin xbmsp xboxdebug xdmcp xfce-session xfs xine-remote xinetd xmail-ctrl xmbmon xml-print xml-rpc xmlsysd xmpp xmpp-transport xns xplorer xtel xtunnels yamaha-comm yiff zabbix zebedee zebra zeiss-axio zend-java-bridge zenimaging zenius-sms zenworks zeo zeo-monitor zftp-admin zmodem zmtp zookeeper zos-commserver ```

As found by

grep "^match" <(curl -s https://raw.githubusercontent.com/nmap/nmap/master/nmap-service-probes) | awk '{print $2}' | sort | uniq
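
For reference, a rough Python equivalent of the extraction step in that pipeline, assuming the file contents have already been downloaded into a string. The sample text below is made up for illustration and only mimics the layout of the probes file.

```python
def service_names(probes_text):
    """Collect the second whitespace-separated field of every line starting with "match"."""
    names = set()
    for line in probes_text.splitlines():
        if line.startswith("match"):
            fields = line.split()
            if len(fields) >= 2:
                names.add(fields[1])
    # Plain code point ordering; a locale-aware `sort` may order mixed case differently.
    return sorted(names)


sample = """Probe TCP NULL q||
match ftp m/^220 .*FTP/
match ssh m/^SSH-/
softmatch smtp m/^220 /
match ftp m/^220-/
"""
names = service_names(sample)
```

Like the `grep "^match"` above, this skips `softmatch` lines.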

underdarknl commented 1 year ago

So we need to define what a single system is. A single system might span multiple IPs, bound together by a single hostname. A single system might run multiple services (which can be grouped by the nmap-services).

noamblitz commented 1 year ago

I think we defined a system as:

Multiple hostnames that resolve to one ip address where at least one of the hostnames or the ip address has a declared scan level that is at least L1.

What the system type is should be defined by what ip services are served by the machine with that ip address.

Since we cannot see whether the IPv4 and IPv6 addresses belong to a single machine or to multiple machines, we define them as multiple systems.

noamblitz commented 1 year ago

The declared scan level part does not have to be queried, since that query is done at the report OOI selection step.

originalsouth commented 1 year ago

Since we cannot see whether the IPv4 and IPv6 addresses belong to a single machine or to multiple machines, we define them as multiple systems.

There is indeed no definite way, but practically we could compare fingerprints; especially when there is encryption involved.

noamblitz commented 1 year ago

Yes for sure. I meant with the current OOIs, should've been more specific.

Donnype commented 11 months ago

Looking at this again, should we consider it closed @noamblitz @underdarknl?

underdarknl commented 10 months ago

What the system type is should be defined by what ip services are served by the machine with that ip address.

This might not be enough; we might also need to be able to look at what software is running. Not everything that speaks https is a webserver system; the system type might be more specific, based on what we see on that https server.

noamblitz commented 10 months ago

So I've been thinking about it and think we really first should discuss it with stakeholders. If we are going to keep systems defined like they are now (purely service-based), a simple BIT should be the solution. If there is more, such as Software like @underdarknl is mentioning, it will quickly become too complicated for the current BITS. For example, Software -> SoftwareInstance -> IPService -> IPPort -> IPaddress is not possible in the BITS. We should also keep in mind that there is no software discovery on ipservices currently, so including that in the system definition will force us to write boefjes for that (delaying a bit).

stephanie0x00 commented 9 months ago

What is the problem that we are trying to solve? This is not entirely clear to me.

Is that the very 'basic' definition of a system? To me a system would be defined as a unique combination (call it a primary key) of a Hostname with an IP address. In the case of IPv4 and IPv6, that would technically mean there are now two systems, which makes sense, as the security configurations for IPv4 and IPv6 can intentionally be different. If required for some practical reason, the primary key could be extended with a Port, if that helps to define services.
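
To illustrate the primary key idea, a minimal sketch with hypothetical names (`SystemKey`, `system_keys`), using documentation addresses. The port is optional, matching the "could be extended with a Port" remark.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SystemKey:
    """Hypothetical primary key of a system: hostname + IP, optionally a port."""

    hostname: str
    ip: str
    port: Optional[int] = None


def system_keys(hostname, addresses, port=None):
    """Build one key per address; IPv4 and IPv6 addresses yield separate systems."""
    return {
        # ip_address() validates the input and normalizes the textual form.
        SystemKey(hostname, str(ipaddress.ip_address(address)), port)
        for address in addresses
    }


keys = system_keys("example.com", ["192.0.2.1", "2001:DB8::1"])
```

A hostname with one A and one AAAA record ends up as two systems here, which is the behaviour described above.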

Is it that the naming of the 'systems report' is confusing? Naming it a 'services report' maybe fits better with what we are actually describing in the report.

Something else?

stephanie0x00 commented 7 months ago

I made an 'exercise' sheet so the community can help us out. Any suggestions, comments or things that could be improved?

stephanie0x00 commented 7 months ago

Input from discussion meeting:

One possible solution might be to use the system type labels (such as Web, Mail, DNS) to define the system.
We need to keep in mind how this can be implemented using bits, where the limitations are, and/or what we want to implement in the future.

Currently systems work with IPServices, so when a hostname has an IP and the IP offers HTTPS, then we say that IP+Hostname is a system, with the system type 'Web'. This only works for situations where IPServices are detected. When there is no IPService, there is currently also no 'system'.

dekkers commented 7 months ago

You can also have the following scenario: thousands of hostnames with different owners that point to the same IP address (for example Cloudflare).

stephanie0x00 commented 7 months ago

Conclusion discussion meeting 23/04/2024:

Update image above to add services
Find stakeholders to share
Discuss the image.

stephanie0x00 commented 6 months ago

Adjusted image with some added services (HTTP, SSH, SMTP):

[Image: Define-Systems-3 drawio]

stephanie0x00 commented 6 months ago

Another suggestion could be to add tags to IPs and/or hostnames so that the user can define whether or not the IPs are part of a cloud solution and/or shared hosting, as these are the cases where issues would occur. Based on these tags we can then 'change' the definition of a system as we go. We could perform some basic checks ourselves by running whois queries on the IP space and perhaps combining this with the DNS configuration for various cloud solutions. If a cloud solution like AWS/Azure etc. is detected, we can automatically give it this tag.

A suggestion from @underdarknl was to do this using bits that read a config.
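
A minimal sketch of the config-driven tagging, assuming the config maps a tag to a list of CIDR ranges. `TAG_RANGES` and `tags_for_ip` are hypothetical names, and the ranges are documentation prefixes rather than real provider ranges.

```python
import ipaddress

# Hypothetical config: tag -> CIDR ranges. These are documentation prefixes
# (RFC 5737 / RFC 3849), used here only as placeholders.
TAG_RANGES = {
    "cloud": ["192.0.2.0/24", "2001:db8::/32"],
    "shared-hosting": ["198.51.100.0/24"],
}


def tags_for_ip(ip, tag_ranges=TAG_RANGES):
    """Return every tag whose configured ranges contain the given IP."""
    address = ipaddress.ip_address(ip)
    tags = set()
    for tag, ranges in tag_ranges.items():
        for cidr in ranges:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if address.version == network.version and address in network:
                tags.add(tag)
    return tags


tags = tags_for_ip("192.0.2.10")
```

An IP outside all configured ranges gets no tag and would keep the default system definition.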

stephanie0x00 commented 6 months ago

The definition of a system remains complex. Everybody had issues filling in the 'define systems' templates from above. During the discussion meeting we defined the problems we are facing, and plan to create separate solutions for each of the problems, instead of solving the 'one big problem'.

The currently identified problems are:

Cloud solutions: how are cloud hosted websites/systems defined? For this we could use the ingress proxies on the cloud side and create a definition based on these lists with known IPs. Subticket: #2977
Aliased websites (websites with www.example.com instead of example.com): often these behave as a single website, but factually they are different websites. Do we see www.example.com and example.com as a single website, or do we keep them separate? An idea could be to ask the user through a config: are these websites similar? Maybe we define them as similar if the URLs and the IPs and/or the subgraph are similar. #2978
System report: a wish from the community was to get an overview of all systems in the system report. What belongs together. This is an impossible question as we cannot define the cross product (since nobody was able to fill in above images with their definition of a system).
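
For the aliased websites problem in the list above, one possible heuristic could be sketched as follows. This is only an assumption about what "similar" could mean (same name after dropping a leading `www.` and the same set of IPs); it ignores the subgraph comparison entirely.

```python
def strip_www(hostname):
    """Normalize a hostname and drop a leading "www." label, if present."""
    hostname = hostname.lower().rstrip(".")
    return hostname[4:] if hostname.startswith("www.") else hostname


def look_like_aliases(host_a, ips_a, host_b, ips_b):
    """Heuristic: same name modulo "www." and identical sets of resolved IPs."""
    return strip_www(host_a) == strip_www(host_b) and set(ips_a) == set(ips_b)


same = look_like_aliases(
    "www.example.com", ["192.0.2.1"],
    "example.com", ["192.0.2.1"],
)
```

A user-facing config could then confirm or override what this heuristic suggests.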

Additional discussion: We do not want the user to have to define/label all systems before scanning by default. This works for a limited number of systems, but having to define all systems for a /8 becomes tedious and annoying work. And we cannot assume that the end users of OpenKAT are sufficiently aware of what systems are. Maybe this could be added as an 'advanced' option for those users who want things differently.

minvws / nl-kat-coordination

Define systems #2034
