projectdiscovery / dnsx

dnsx is a fast, multi-purpose DNS toolkit that lets you run multiple DNS queries of your choice against a list of user-supplied resolvers.
https://docs.projectdiscovery.io/tools/dnsx
MIT License
2.14k stars, 238 forks

Performance discrepancy: dnsx as library lag compared to CLI #512

Closed iamnihal closed 1 week ago

iamnihal commented 9 months ago

dnsx version: v1.1.6

Current Behavior:

I'm utilizing dnsx as a library in my program, generating subdomains from alterx, and resolving them through dnsx. The problem arises when I use dnsx as a library; the time it takes to resolve 691 subdomains is 1 minute and 14 seconds. However, when I perform the same operation using the dnsx binary CLI, it takes only 1.75 seconds. As suggested by @dogancanbakir, I've attempted passing default values, but that doesn't seem to resolve the issue.

Expected Behavior:

When used as a library, dnsx should resolve the subdomains in the same time as when used as a CLI utility.

Steps To Reproduce:

I'm using dnsx in my program as below:

package main

import (
    "fmt"
    "time"

    "github.com/projectdiscovery/dnsx/libs/dnsx"
)

func DnsxEnrich(subdomains []string) ([]string, error) {

    var result []string

    dnsClient, err := dnsx.New(dnsx.Options{
        BaseResolvers: []string{
            "udp:1.1.1.1:53",         // Cloudflare
            "udp:1.0.0.1:53",         // Cloudflare
            "udp:8.8.8.8:53",         // Google
            "udp:8.8.4.4:53",         // Google
            "udp:9.9.9.9:53",         // Quad9
            "udp:149.112.112.112:53", // Quad9
            "udp:208.67.222.222:53",  // Open DNS
            "udp:208.67.220.220:53",  // Open DNS
        },
        MaxRetries:        2,
        QuestionTypes:     dnsx.DefaultOptions.QuestionTypes,
        Trace:             false,
        TraceMaxRecursion: dnsx.DefaultOptions.TraceMaxRecursion,
        Hostsfile:         false,
        OutputCDN:         false,
    })
    if err != nil {
        fmt.Printf("err: %v\n", err)
        return []string{}, fmt.Errorf("error creating dnsx client: %v", err)
    }

    timeStart := time.Now()

    for _, subdomain := range subdomains {
        res, _ := dnsClient.Lookup(subdomain)

        if len(res) > 1 {
            result = append(result, subdomain)
        }
    }

    fmt.Println("\nCompleted in", time.Since(timeStart))
    return result, nil
}

When I pass 691 subdomains to the DnsxEnrich function, it takes approximately 1 minute and 14 seconds to resolve all the subdomains. In contrast, using dnsx as the CLI tool completes the process in just 1.75 seconds.

Output:

dnsx as CLI: [screenshot: dnsx CLI]

dnsx as library: [screenshot: dnsx library]

lordsky commented 9 months ago

Same question. I read the code: in the CLI, options.Threads defaults to 100, but the library doesn't expose a thread option to set.

calab33p commented 2 weeks ago

Have you tried creating your own threads, @iamnihal? That is what I ended up doing. The alternative may be to call the dnsx code through a runner interface, but that lives inside internal/, so it can't be imported from outside the module.

iamnihal commented 2 weeks ago

Hey @calab33p - I haven't tried creating my own threads for this. How has it worked for you? Were you able to match the speed of the binary DNSX?

dogancanbakir commented 1 week ago

@iamnihal Yes, you need to handle the concurrency yourself. See this example where we use dnsx as a library: https://github.com/projectdiscovery/tldfinder/pull/52/files, and this one from dnsx itself: https://github.com/projectdiscovery/dnsx/blob/dev/internal/runner/runner.go#L486.
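For reference, 1 minute 14 seconds over 691 sequential lookups works out to roughly 107 ms per lookup, which is consistent with each query waiting on its own network round trip. Below is a minimal, standard-library-only sketch of bounding the number of lookups in flight. The names `lookupFunc` and `resolveConcurrently` are hypothetical, the fake resolver stands in for a call such as `dnsClient.Lookup`, and the "at least one record" filter is an assumption rather than the check used in the issue.

```go
package main

import (
	"fmt"
	"sync"
)

// lookupFunc is a hypothetical stand-in for a resolver call such as
// dnsClient.Lookup: it returns the records found for a hostname.
type lookupFunc func(host string) ([]string, error)

// resolveConcurrently runs at most `workers` lookups at a time and returns
// the hosts that produced at least one record. workers must be >= 1.
// The order of the returned hosts is not guaranteed.
func resolveConcurrently(hosts []string, workers int, lookup lookupFunc) []string {
	var (
		mu       sync.Mutex // guards resolved
		wg       sync.WaitGroup
		resolved []string
	)
	sem := make(chan struct{}, workers) // counting semaphore

	for _, host := range hosts {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` lookups are running
		go func(host string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			records, err := lookup(host)
			if err != nil || len(records) == 0 {
				return
			}
			mu.Lock()
			resolved = append(resolved, host)
			mu.Unlock()
		}(host)
	}

	wg.Wait()
	return resolved
}

func main() {
	// Fake resolver so the sketch runs without network access.
	fake := func(host string) ([]string, error) {
		if host == "missing.example.com" {
			return nil, fmt.Errorf("no records for %s", host)
		}
		return []string{"192.0.2.1"}, nil
	}

	hosts := []string{"a.example.com", "missing.example.com", "b.example.com"}
	fmt.Println(len(resolveConcurrently(hosts, 2, fake)), "hosts resolved")
}
```

With a real client, `lookup` would simply be `dnsClient.Lookup`, and `workers` plays the role of the CLI's thread count.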

Mzack9999 commented 1 week ago

As @dogancanbakir pointed out, the problem is that the code was implemented in a single-threaded way. It needs to be rewritten to run lookups concurrently. For example, with concurrency set to 10 threads using adaptivewaitgroup, it would look as follows:

package main

import (
    "fmt"
    "time"

    "github.com/projectdiscovery/dnsx/libs/dnsx"
    sliceutil "github.com/projectdiscovery/utils/slice"
    syncutil "github.com/projectdiscovery/utils/sync"
)

func DnsxEnrich(subdomains []string) ([]string, error) {

    result := sliceutil.NewSyncSlice[string]()

    dnsClient, err := dnsx.New(dnsx.Options{
        BaseResolvers: []string{
            "udp:1.1.1.1:53",         // Cloudflare
            "udp:1.0.0.1:53",         // Cloudflare
            "udp:8.8.8.8:53",         // Google
            "udp:8.8.4.4:53",         // Google
            "udp:9.9.9.9:53",         // Quad9
            "udp:149.112.112.112:53", // Quad9
            "udp:208.67.222.222:53",  // Open DNS
            "udp:208.67.220.220:53",  // Open DNS
        },
        MaxRetries:        2,
        QuestionTypes:     dnsx.DefaultOptions.QuestionTypes,
        Trace:             false,
        TraceMaxRecursion: dnsx.DefaultOptions.TraceMaxRecursion,
        Hostsfile:         false,
        OutputCDN:         false,
    })
    if err != nil {
        fmt.Printf("err: %v\n", err)
        return []string{}, fmt.Errorf("error creating dnsx client: %v", err)
    }

    timeStart := time.Now()

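    // Adaptive wait group that caps the number of in-flight lookups at 10.
    swg, err := syncutil.New(syncutil.WithSize(10))
    if err != nil {
        return []string{}, fmt.Errorf("error creating wait group: %v", err)
    }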

    for _, subdomain := range subdomains {
        swg.Add()
        go func(subdomain string) {
            defer swg.Done()

            res, _ := dnsClient.Lookup(subdomain)

            if len(res) > 1 {
                result.Append(subdomain)
            }
        }(subdomain)
    }

    swg.Wait()

    fmt.Println("\nCompleted in", time.Since(timeStart))
    return result.Slice, nil
}

Beware that standard slice append operations are not thread-safe, which is why https://github.com/projectdiscovery/utils/blob/main/slice/sync_slice.go was used here. I'm closing the issue since a solution was provided and no changes to the code are required. Feel free to comment if anything is unclear. Thanks!
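As an alternative to a synchronized slice, the shared append can be avoided altogether by giving each goroutine its own slot in a pre-sized slice. The sketch below uses only the standard library; `filterConcurrently` and the length-based predicate are hypothetical placeholders for "did this host resolve", and the goroutine count is left unbounded for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// filterConcurrently evaluates keep for every item in its own goroutine and
// returns the items for which keep reported true, in input order.
func filterConcurrently(items []string, keep func(string) bool) []string {
	// One slot per goroutine: no two goroutines write the same element,
	// so no mutex is required.
	hits := make([]bool, len(items))

	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(i int, item string) {
			defer wg.Done()
			hits[i] = keep(item)
		}(i, item)
	}
	wg.Wait()

	// Single-threaded from here on, so a plain append is safe.
	var kept []string
	for i, ok := range hits {
		if ok {
			kept = append(kept, items[i])
		}
	}
	return kept
}

func main() {
	// Hypothetical predicate standing in for a successful lookup.
	longName := func(s string) bool { return len(s) > 3 }

	fmt.Println(filterConcurrently([]string{"api", "mail", "www", "static"}, longName))
	// prints: [mail static]
}
```

Unlike a lock-guarded append, this keeps the results in input order, at the cost of one boolean per input.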