elceef / dnstwist

Domain name permutation engine for detecting homograph phishing attacks, typo squatting, and brand impersonation
https://dnstwist.it
Apache License 2.0

Web app phash feature #192

Closed jdieguez89 closed 6 months ago

jdieguez89 commented 1 year ago

I am trying to add support for phash in the web app scan. These are the steps I have done so far:

  1. Modified the Dockerfile to install the required libraries

```Dockerfile
FROM debian:stable-slim

WORKDIR /opt/dnstwist

RUN apt-get update && \
    export DEBIAN_FRONTEND=noninteractive && \
    apt-get install -y --no-install-recommends python3-tld python3-dnspython python3-geoip gunicorn3 python3-flask && \
    apt-get install -y --no-install-recommends python3-whois ca-certificates && \
    apt-get install -y python3-ssdeep python3-tlsh && \
    apt-get install -y --no-install-recommends python3-pil python3-selenium chromium-driver && \
    apt-get autoremove -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY ./webapp.py /opt/dnstwist/
COPY ./dnstwist.py /opt/dnstwist/

EXPOSE 8000

CMD ["gunicorn3", "webapp:app", "--bind", "0.0.0.0:8000", "--workers", "20", "--threads", "10000", "--log-level", "debug"]
```

  2. Added the phash option and the screenshot directory to the scan function

```python
def scan(self):
    for domain in self.permutations:
        self.jobs.put(domain)
    for _ in range(self.thread_count):
        worker = dnstwist.Scanner(self.jobs)
        worker.option_extdns = dnstwist.MODULE_DNSPYTHON
        worker.option_geoip = dnstwist.MODULE_GEOIP
        worker.option_phash = dnstwist.MODULE_SELENIUM
        worker.screenshot_dir = "/home"
        if self.nameservers:
            worker.nameservers = self.nameservers.split(',')
        worker.start()
```

  3. Changed the settings to grant more resources to the phash scan

```python
PORT = int(os.environ.get('PORT', 8000))
HOST = os.environ.get('HOST', '127.0.0.1')
THREADS = int(os.environ.get('THREADS', dnstwist.THREAD_COUNT_DEFAULT))
NAMESERVERS = os.environ.get('NAMESERVERS') or os.environ.get('NAMESERVER')
SESSION_TTL = int(os.environ.get('SESSION_TTL', 3600))
SESSION_MAX = int(os.environ.get('SESSION_MAX', 1000))  # max concurrent sessions
MEMORY_LIMIT = human_to_bytes(os.environ.get('MEMORY_LIMIT', '8048m'))
DOMAIN_MAXLEN = int(os.environ.get('DOMAIN_MAXLEN', 40))
WEBAPP_HTML = os.environ.get('WEBAPP_HTML', 'webapp.html')
WEBAPP_DIR = os.environ.get('WEBAPP_DIR', os.path.dirname(os.path.abspath(__file__)))
```

This should be enough to activate the feature. The scan starts, and for a few moments everything seems fine, but when I query the status, the session has been deleted. I traced the janitor function, but it is not what removes the session. Here is the Docker container output:

```
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [1] [INFO] Handling signal: term
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [7] [INFO] Worker exiting (pid: 7)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [8] [INFO] Worker exiting (pid: 8)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [9] [INFO] Worker exiting (pid: 9)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [10] [INFO] Worker exiting (pid: 10)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [13] [INFO] Worker exiting (pid: 13)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [14] [INFO] Worker exiting (pid: 14)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [16] [INFO] Worker exiting (pid: 16)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [31] [INFO] Worker exiting (pid: 31)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [21] [INFO] Worker exiting (pid: 21)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [30] [INFO] Worker exiting (pid: 30)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [19] [INFO] Worker exiting (pid: 19)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [44] [INFO] Worker exiting (pid: 44)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [42] [INFO] Worker exiting (pid: 42)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [23] [INFO] Worker exiting (pid: 23)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [38] [INFO] Worker exiting (pid: 38)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [24] [INFO] Worker exiting (pid: 24)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [26] [INFO] Worker exiting (pid: 26)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [28] [INFO] Worker exiting (pid: 28)
2023-06-07 15:19:44 [2023-06-07 12:19:44 +0000] [35] [INFO] Worker exiting (pid: 35)
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [40] [INFO] Worker exiting (pid: 40)
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [INFO] Shutting down: Master
2023-06-07 15:19:44 *********ACTIVE MODULES*******
2023-06-07 15:19:44 True
2023-06-07 15:19:44 True
2023-06-07 15:19:44 True
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [DEBUG] Current configuration:
2023-06-07 15:19:45 config: ./gunicorn.conf.py
2023-06-07 15:19:45 wsgi_app: None
2023-06-07 15:19:45 bind: ['0.0.0.0:8000']
2023-06-07 15:19:45 backlog: 2048
2023-06-07 15:19:45 workers: 20
2023-06-07 15:19:45 worker_class: sync
2023-06-07 15:19:45 threads: 10000
2023-06-07 15:19:45 worker_connections: 1000
2023-06-07 15:19:45 max_requests: 0
2023-06-07 15:19:45 max_requests_jitter: 0
2023-06-07 15:19:45 timeout: 30
2023-06-07 15:19:45 graceful_timeout: 30
2023-06-07 15:19:45 keepalive: 2
2023-06-07 15:19:45 limit_request_line: 4094
2023-06-07 15:19:45 limit_request_fields: 100
2023-06-07 15:19:45 limit_request_field_size: 8190
2023-06-07 15:19:45 reload: False
2023-06-07 15:19:45 reload_engine: auto
2023-06-07 15:19:45 reload_extra_files: []
2023-06-07 15:19:45 spew: False
2023-06-07 15:19:45 check_config: False
2023-06-07 15:19:45 print_config: False
2023-06-07 15:19:45 preload_app: False
2023-06-07 15:19:45 sendfile: None
2023-06-07 15:19:45 reuse_port: False
2023-06-07 15:19:45 chdir: /opt/dnstwist
2023-06-07 15:19:45 daemon: False
2023-06-07 15:19:45 raw_env: []
2023-06-07 15:19:45 pidfile: None
2023-06-07 15:19:45 worker_tmp_dir: None
2023-06-07 15:19:45 user: 0
2023-06-07 15:19:45 group: 0
2023-06-07 15:19:45 umask: 0
2023-06-07 15:19:45 initgroups: False
2023-06-07 15:19:45 tmp_upload_dir: None
2023-06-07 15:19:45 secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-PROTO': 'https', 'X-FORWARDED-SSL': 'on'}
2023-06-07 15:19:45 forwarded_allow_ips: ['127.0.0.1']
2023-06-07 15:19:45 accesslog: None
2023-06-07 15:19:45 disable_redirect_access_to_syslog: False
2023-06-07 15:19:45 access_log_format: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
2023-06-07 15:19:45 errorlog: -
2023-06-07 15:19:45 loglevel: debug
2023-06-07 15:19:45 capture_output: False
2023-06-07 15:19:45 logger_class: gunicorn.glogging.Logger
2023-06-07 15:19:45 logconfig: None
2023-06-07 15:19:45 logconfig_dict: {}
2023-06-07 15:19:45 syslog_addr: udp://localhost:514
2023-06-07 15:19:45 syslog: False
2023-06-07 15:19:45 syslog_prefix: None
2023-06-07 15:19:45 syslog_facility: user
2023-06-07 15:19:45 enable_stdio_inheritance: False
2023-06-07 15:19:45 statsd_host: None
2023-06-07 15:19:45 dogstatsd_tags:
2023-06-07 15:19:45 statsd_prefix:
2023-06-07 15:19:45 proc_name: None
2023-06-07 15:19:45 default_proc_name: webapp:app
2023-06-07 15:19:45 pythonpath: None
2023-06-07 15:19:45 paste: None
2023-06-07 15:19:45 on_starting: <function OnStarting.on_starting at 0xffff8e45aaf0>
2023-06-07 15:19:45 on_reload: <function OnReload.on_reload at 0xffff8e45ac10>
2023-06-07 15:19:45 when_ready: <function WhenReady.when_ready at 0xffff8e45ad30>
2023-06-07 15:19:45 pre_fork: <function Prefork.pre_fork at 0xffff8e45ae50>
2023-06-07 15:19:45 post_fork: <function Postfork.post_fork at 0xffff8e45af70>
2023-06-07 15:19:45 post_worker_init: <function PostWorkerInit.post_worker_init at 0xffff8e3e50d0>
2023-06-07 15:19:45 worker_int: <function WorkerInt.worker_int at 0xffff8e3e51f0>
2023-06-07 15:19:45 worker_abort: <function WorkerAbort.worker_abort at 0xffff8e3e5310>
2023-06-07 15:19:45 pre_exec: <function PreExec.pre_exec at 0xffff8e3e5430>
2023-06-07 15:19:45 pre_request: <function PreRequest.pre_request at 0xffff8e3e5550>
2023-06-07 15:19:45 post_request: <function PostRequest.post_request at 0xffff8e3e55e0>
2023-06-07 15:19:45 child_exit: <function ChildExit.child_exit at 0xffff8e3e5700>
2023-06-07 15:19:45 worker_exit: <function WorkerExit.worker_exit at 0xffff8e3e5820>
2023-06-07 15:19:45 nworkers_changed: <function NumWorkersChanged.nworkers_changed at 0xffff8e3e5940>
2023-06-07 15:19:45 on_exit: <function OnExit.on_exit at 0xffff8e3e5a60>
2023-06-07 15:19:45 proxy_protocol: False
2023-06-07 15:19:45 proxy_allow_ips: ['127.0.0.1']
2023-06-07 15:19:45 keyfile: None
2023-06-07 15:19:45 certfile: None
2023-06-07 15:19:45 ssl_version: 2
2023-06-07 15:19:45 cert_reqs: 0
2023-06-07 15:19:45 ca_certs: None
2023-06-07 15:19:45 suppress_ragged_eofs: True
2023-06-07 15:19:45 do_handshake_on_connect: False
2023-06-07 15:19:45 ciphers: None
2023-06-07 15:19:45 raw_paste_global_conf: []
2023-06-07 15:19:45 strip_header_spaces: False
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [INFO] Starting gunicorn 20.1.0
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [DEBUG] Arbiter booted
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [INFO] Using worker: threads
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [1] [WARNING] No keepalived connections can be handled. Check the number of worker connections and threads.
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [7] [INFO] Booting worker with pid: 7
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [8] [INFO] Booting worker with pid: 8
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [9] [INFO] Booting worker with pid: 9
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [11] [INFO] Booting worker with pid: 11
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [14] [INFO] Booting worker with pid: 14
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [17] [INFO] Booting worker with pid: 17
2023-06-07 15:19:45 [2023-06-07 12:19:45 +0000] [18] [INFO] Booting worker with pid: 18
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [20] [INFO] Booting worker with pid: 20
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [23] [INFO] Booting worker with pid: 23
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [24] [INFO] Booting worker with pid: 24
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [27] [INFO] Booting worker with pid: 27
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [28] [INFO] Booting worker with pid: 28
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [30] [INFO] Booting worker with pid: 30
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [31] [INFO] Booting worker with pid: 31
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [33] [INFO] Booting worker with pid: 33
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [34] [INFO] Booting worker with pid: 34
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [36] [INFO] Booting worker with pid: 36
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [41] [INFO] Booting worker with pid: 41
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [42] [INFO] Booting worker with pid: 42
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [44] [INFO] Booting worker with pid: 44
2023-06-07 15:19:46 [2023-06-07 12:19:46 +0000] [1] [DEBUG] 20 workers
2023-06-07 15:20:05 [2023-06-07 12:20:05 +0000] [34] [DEBUG] POST /api/scans
2023-06-07 15:20:06 [2023-06-07 12:20:06 +0000] [34] [DEBUG] Closing connection.
2023-06-07 15:20:14 [2023-06-07 12:20:14 +0000] [34] [DEBUG] GET /api/scans/ec02e559-3d01-4492-a280-e25ed8d1280a
2023-06-07 15:20:14 [2023-06-07 12:20:14 +0000] [34] [DEBUG] Closing connection.
2023-06-07 15:20:16 [2023-06-07 12:20:16 +0000] [42] [DEBUG] GET /api/scans/ec02e559-3d01-4492-a280-e25ed8d1280a
2023-06-07 15:20:16 [2023-06-07 12:20:16 +0000] [42] [DEBUG] Closing connection.
```

Any suggestions?

jdieguez89 commented 1 year ago

[image]

I can run a phash scan inside the docker container where the web app is deployed. Anyone?
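One detail in the log above may be worth checking: the `POST /api/scans` request and the first `GET` were handled by worker pid 34, while the next `GET` for the same scan ID was handled by pid 42. The following is a minimal sketch, assuming (this is not confirmed anywhere in the thread) that the web app keeps its sessions in per-process memory. The `Worker` class and its methods are hypothetical stand-ins for gunicorn worker processes, not dnstwist code. Under that assumption, a session created in one worker would simply not exist in another, which could look like the session having been deleted.

```python
import uuid


class Worker:
    """Hypothetical stand-in for one gunicorn worker process."""

    def __init__(self, pid):
        self.pid = pid
        # Assumption: session state lives only in this process's memory.
        self.sessions = {}

    def post_scan(self):
        """Create a session in this worker and return its ID."""
        sid = str(uuid.uuid4())
        self.sessions[sid] = {'state': 'running'}
        return sid

    def get_scan(self, sid):
        """Return an HTTP-like status code for a status query."""
        return 200 if sid in self.sessions else 404


# Two workers with the pids seen in the log; they share nothing.
workers = {pid: Worker(pid) for pid in (34, 42)}

sid = workers[34].post_scan()             # POST handled by pid 34
status_same = workers[34].get_scan(sid)   # GET routed back to pid 34
status_other = workers[42].get_scan(sid)  # GET routed to pid 42

print(status_same, status_other)  # prints "200 404"
```

If this is what happens here, running gunicorn with a single worker process (`--workers 1`) and a moderate `--threads` value, or moving session state to a store shared by all workers, would be the things to try.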

elceef commented 1 year ago

I'd avoid setting MEMORY_LIMIT for now. It's difficult to predict memory utilization with the pHash feature enabled.
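A minimal sketch of one way to follow that advice without editing code for every deployment: treat `MEMORY_LIMIT` as optional and apply no limit when the variable is unset. Everything here is an assumption for illustration. The `human_to_bytes` below is a hypothetical stand-in for the helper of the same name referenced in the settings above, and enforcing the limit through `resource.setrlimit` with `RLIMIT_AS` is an assumed mechanism; the thread does not show how `webapp.py` actually applies it.

```python
import os


def human_to_bytes(size):
    """Hypothetical stand-in: convert '512k', '8048m', '2g' or '1024' to bytes."""
    units = {'b': 1, 'k': 2**10, 'm': 2**20, 'g': 2**30}
    size = size.strip().lower()
    if size[-1].isdigit():
        return int(size)
    return int(size[:-1]) * units[size[-1]]


def memory_limit_from_env(environ=os.environ):
    """Return the limit in bytes, or None when MEMORY_LIMIT is unset or empty."""
    raw = environ.get('MEMORY_LIMIT')
    return human_to_bytes(raw) if raw else None


def apply_memory_limit(limit):
    """Apply an address-space limit only when one was requested (Unix only)."""
    if limit is None:
        return False  # no limit configured, leave the process unrestricted
    import resource  # imported lazily because the module is Unix-only
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
    return True
```

With this shape, leaving `MEMORY_LIMIT` out of the container environment disables the limit entirely, which matches the suggestion to avoid setting it while pHash memory use is hard to predict.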