Have you tried reproducing with the latest version: https://github.com/projectdiscovery/nuclei/releases/tag/v3.1.10 ?
I looked at the changelog but I don't see any memory improvements. I can give it a try. Do the SDK settings look good to you? Am I missing something obvious?
Also this is what it looks like in terms of memory utilization:
You can clearly see when it was killed. This is for an 8GB container.
@stan-threatmate, there was a minor change related to JS pooling and upgrades of other pd dependencies, so please try with the latest version, or even the dev branch if required.
1) Memory usage/consumption directly correlates with concurrency and other options. Last time I ran against 1.2k targets with the default concurrency (i.e. template concurrency 25, host concurrency 25). Can you try running from the SDK with this config?
2) When there are more than 100 targets I would always recommend using the host-spray scan strategy; it is more efficient in many ways.
3) Can you include pprof (https://pkg.go.dev/net/http/pprof#hdr-Usage_examples) in your code and share profiles for the inflection points? For the graph above that would be one profile around 2-3 PM and a second one around 3:30 PM. Those are the interesting/required profile locations for that graph, but you should choose based on resource usage and dump the profiles manually from the CLI using go tool pprof.
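For example, a minimal way to expose pprof from the binary that embeds the SDK (standard library only; the listen address is arbitrary):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Expose profiling endpoints on a side port for the duration of the scan.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the nuclei SDK scan here ...
	select {} // placeholder so this standalone sketch keeps running
}
```

Heap profiles can then be captured at the inflection points with go tool pprof http://localhost:6060/debug/pprof/heap.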
I've been running it all day today with the latest version v3.1.10 but I see the same issues. I also added GOMEMLIMIT=500MiB and GOGC=20 but still ran out of memory, even though the GC started working pretty hard to clear it. I am about to instrument memory profiling and see if I can get some meaningful data.
Also your suggestions in the comment above contradict this document which I used to set the above options: https://github.com/projectdiscovery/nuclei-docs/blob/main/docs/nuclei/get-started.md
Users should select a scan strategy based on the number of targets; each strategy has its own pros and cons.
- When targets < 1000, template-spray should be used. This strategy is slightly faster than host-spray but uses more RAM and does not optimally reuse connections.
- When targets > 1000, host-spray should be used. This strategy uses less RAM than template-spray and reuses HTTP connections, along with some minor improvements that are crucial when mass scanning.
Concurrency & Bulk-Size
Whatever the scan strategy is, -concurrency and -bulk-size are crucial for tuning any type of scan. While tuning these parameters the following points should be noted:
- If scan-strategy is template-spray: -concurrency < bulk-size (Ex: -concurrency 10 -bulk-size 200)
- If scan-strategy is host-spray: -concurrency > bulk-size (Ex: -concurrency 200 -bulk-size 10)
Can you please provide a recommendation on which settings affect memory consumption the most and which affect the speed of execution? For example, I've noticed the rate-limit option doesn't really play much of a role in the SDK, as reported by the stats which print the RPS. I assume RPS is requests per second as defined by the rate limit?
I'll do some runs with your suggestion: 25 template and host concurrency. I wish there was a way to understand the system resource utilization based on the settings so we can plan for it based on the number of hosts.
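For context, this is roughly the shape of the SDK configuration I'm planning to use for those runs. A sketch only, against the v3 lib package; the option and field names (WithConcurrency, Concurrency, WithScanStrategy) are my best recollection of the API and may differ per version:

```go
package main

import nuclei "github.com/projectdiscovery/nuclei/v3/lib"

func main() {
	// Sketch: option/field names are assumptions and may not match the exact SDK API.
	ne, err := nuclei.NewNucleiEngine(
		nuclei.WithConcurrency(nuclei.Concurrency{
			TemplateConcurrency: 25, // templates processed in parallel
			HostConcurrency:     25, // hosts processed in parallel
		}),
		nuclei.WithScanStrategy("host-spray"), // recommended above ~100 targets
	)
	if err != nil {
		panic(err)
	}
	defer ne.Close()
	// ... LoadTargets / ExecuteWithCallback as usual ...
}
```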
Here is a pprof from a successful run on a smaller scale:
Showing nodes accounting for 421.32MB, 91.26% of 461.68MB total
Dropped 861 nodes (cum <= 2.31MB)
Showing top 50 nodes out of 159
flat flat% sum% cum cum%
64.17MB 13.90% 13.90% 64.17MB 13.90% github.com/projectdiscovery/nuclei/v3/pkg/protocols/common/generators.MergeMaps (inline)
61.90MB 13.41% 27.31% 113.17MB 24.51% fmt.Errorf
50.85MB 11.01% 38.32% 51.71MB 11.20% github.com/projectdiscovery/utils/errors.(*enrichedError).captureStack (inline)
29.22MB 6.33% 44.65% 29.22MB 6.33% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).responseToDSLMap
23.67MB 5.13% 49.78% 23.67MB 5.13% runtime.malg
19.24MB 4.17% 53.94% 78.68MB 17.04% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).generateRawRequest
16.50MB 3.57% 57.52% 16.51MB 3.58% reflect.New
15.02MB 3.25% 60.77% 15.02MB 3.25% github.com/projectdiscovery/utils/maps.(*OrderedMap[go.shape.string,go.shape.[]string]).Set (inline)
13.11MB 2.84% 63.61% 13.11MB 2.84% net/http.NewRequestWithContext
12.07MB 2.61% 66.22% 13.08MB 2.83% github.com/yl2chen/cidranger.newPrefixTree
12MB 2.60% 68.83% 12.01MB 2.60% github.com/syndtr/goleveldb/leveldb/memdb.New
10.24MB 2.22% 71.04% 10.24MB 2.22% gopkg.in/yaml%2ev2.(*parser).scalar
8.12MB 1.76% 72.80% 30.34MB 6.57% github.com/projectdiscovery/utils/url.ParseURL
7MB 1.52% 74.32% 8.30MB 1.80% github.com/projectdiscovery/utils/reader.NewReusableReadCloser
6.71MB 1.45% 75.77% 6.71MB 1.45% regexp/syntax.(*compiler).inst (inline)
6.64MB 1.44% 77.21% 6.64MB 1.44% strings.(*Builder).grow
5.93MB 1.28% 78.50% 5.93MB 1.28% bytes.growSlice
5.30MB 1.15% 79.64% 29.26MB 6.34% github.com/projectdiscovery/nuclei/v3/pkg/parsers.ParseTemplate
5.18MB 1.12% 80.77% 43.54MB 9.43% github.com/projectdiscovery/nuclei/v3/pkg/templates.parseTemplate
4.91MB 1.06% 81.83% 4.91MB 1.06% bytes.(*Buffer).String (inline)
4.14MB 0.9% 82.73% 4.47MB 0.97% github.com/ulule/deepcopier.getTagOptions
3.67MB 0.8% 83.52% 3.67MB 0.8% reflect.mapassign_faststr0
3.39MB 0.74% 84.26% 10.92MB 2.37% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http/httpclientpool.wrappedGet
3.38MB 0.73% 84.99% 7.92MB 1.72% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.generateVariables
3.25MB 0.7% 85.69% 3.25MB 0.7% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.GenerateDNSVariables
2.97MB 0.64% 86.34% 2.97MB 0.64% github.com/projectdiscovery/retryablehttp-go.DefaultReusePooledTransport
2.82MB 0.61% 86.95% 2.82MB 0.61% github.com/projectdiscovery/ratelimit.(*Limiter).Take
2.78MB 0.6% 87.55% 3.74MB 0.81% github.com/projectdiscovery/nuclei/v3/pkg/model/types/stringslice.(*StringSlice).UnmarshalYAML
Note on the second line how much fmt.Errorf takes. I expect a ton of errors, as shown by the stats:
[0:17:35] | Templates: 3891 | Hosts: 8 | RPS: 141 | Matched: 2 | Errors: 144915 | Requests: 149322/159840 (93%)
Also, this stat is printed after nuclei has finished, but it shows 93% and it never stops printing.
A profile of a more intense run:
top50
Showing nodes accounting for 1118.57MB, 92.07% of 1214.85MB total
Dropped 1016 nodes (cum <= 6.07MB)
Showing top 50 nodes out of 139
flat flat% sum% cum cum%
356.81MB 29.37% 29.37% 356.81MB 29.37% github.com/projectdiscovery/nuclei/v3/pkg/protocols/common/generators.MergeMaps (inline)
95.77MB 7.88% 37.25% 95.77MB 7.88% runtime.malg
75.52MB 6.22% 43.47% 313.64MB 25.82% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).generateRawRequest
68.30MB 5.62% 49.09% 68.38MB 5.63% net/http.NewRequestWithContext
58.39MB 4.81% 53.90% 58.39MB 4.81% github.com/projectdiscovery/utils/maps.(*OrderedMap[go.shape.string,go.shape.[]string]).Set (inline)
57.80MB 4.76% 58.66% 63.10MB 5.19% github.com/ulule/deepcopier.getTagOptions
42.65MB 3.51% 62.17% 135.82MB 11.18% github.com/projectdiscovery/utils/url.ParseURL
29.94MB 2.46% 64.63% 48.56MB 4.00% fmt.Errorf
28.74MB 2.37% 67.00% 28.74MB 2.37% net/textproto.MIMEHeader.Set (inline)
27.06MB 2.23% 69.22% 32.54MB 2.68% github.com/projectdiscovery/utils/reader.NewReusableReadCloser
19.45MB 1.60% 70.83% 19.45MB 1.60% bytes.(*Buffer).String (inline)
18.97MB 1.56% 72.39% 18.99MB 1.56% strings.(*Builder).grow
18.63MB 1.53% 73.92% 45.56MB 3.75% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.generateVariables
17.99MB 1.48% 75.40% 18.06MB 1.49% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.GenerateDNSVariables
17.13MB 1.41% 76.81% 17.15MB 1.41% reflect.New
17.02MB 1.40% 78.21% 21.78MB 1.79% github.com/projectdiscovery/utils/errors.(*enrichedError).captureStack (inline)
13.82MB 1.14% 79.35% 13.82MB 1.14% github.com/projectdiscovery/ratelimit.(*Limiter).Take
13.58MB 1.12% 80.47% 13.60MB 1.12% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).responseToDSLMap
12.81MB 1.05% 81.52% 13.82MB 1.14% github.com/yl2chen/cidranger.newPrefixTree
12.65MB 1.04% 82.56% 116.63MB 9.60% github.com/projectdiscovery/retryablehttp-go.NewRequestFromURLWithContext
12MB 0.99% 83.55% 12.01MB 0.99% github.com/syndtr/goleveldb/leveldb/memdb.New
11.89MB 0.98% 84.53% 82.14MB 6.76% github.com/projectdiscovery/utils/url.absoluteURLParser
11.31MB 0.93% 85.46% 11.31MB 0.93% github.com/projectdiscovery/utils/maps.NewOrderedMap[go.shape.string,go.shape.[]string] (inline)
10.92MB 0.9% 86.36% 11.39MB 0.94% github.com/projectdiscovery/utils/url.NewOrderedParams (inline)
10.28MB 0.85% 87.21% 10.28MB 0.85% gopkg.in/yaml%2ev2.(*parser).scalar
7.35MB 0.61% 87.81% 731.95MB 60.25% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).Make
7.33MB 0.6% 88.41% 7.33MB 0.6% bytes.growSlice
7.03MB 0.58% 88.99% 7.03MB 0.58% regexp/syntax.(*compiler).inst (inline)
5.40MB 0.44% 89.44% 29.48MB 2.43% github.com/projectdiscovery/nuclei/v3/pkg/parsers.ParseTemplate
5.31MB 0.44% 89.88% 55.44MB 4.56% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).generateHttpRequest
5.08MB 0.42% 90.29% 43.94MB 3.62% github.com/projectdiscovery/nuclei/v3/pkg/templates.parseTemplate
4.61MB 0.38% 90.67% 30.72MB 2.53% fmt.Sprintf
3.31MB 0.27% 90.95% 11.10MB 0.91% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http/httpclientpool.wrappedGet
2.81MB 0.23% 91.18% 21.36MB 1.76% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http/raw.readRawRequest
2.77MB 0.23% 91.40% 6.43MB 0.53% github.com/projectdiscovery/nuclei/v3/pkg/protocols/common/replacer.Replace
2.56MB 0.21% 91.62% 74.73MB 6.15% github.com/projectdiscovery/utils/url.(*OrderedParams).Decode
1.03MB 0.085% 91.70% 85.59MB 7.05% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).executeRequest
0.87MB 0.072% 91.77% 8.20MB 0.67% bytes.(*Buffer).grow
0.67MB 0.055% 91.83% 9.21MB 0.76% regexp.compile
0.60MB 0.049% 91.88% 841.32MB 69.25% github.com/projectdiscovery/nuclei/v3/pkg/tmplexec/generic.(*Generic).ExecuteWithResults
0.58MB 0.047% 91.92% 7.34MB 0.6% github.com/projectdiscovery/retryablehttp-go.NewClient
0.51MB 0.042% 91.97% 72.52MB 5.97% net/http.(*Transport).dialConn
0.50MB 0.041% 92.01% 20.95MB 1.72% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).Compile
0.29MB 0.024% 92.03% 841.64MB 69.28% github.com/projectdiscovery/nuclei/v3/pkg/tmplexec.(*TemplateExecuter).Execute
0.12MB 0.01% 92.04% 69.88MB 5.75% github.com/projectdiscovery/fastdialer/fastdialer.(*Dialer).DialTLS
0.11MB 0.0092% 92.05% 45.04MB 3.71% github.com/projectdiscovery/nuclei/v3/pkg/templates.Parse
0.09MB 0.0075% 92.06% 65.17MB 5.36% github.com/projectdiscovery/fastdialer/fastdialer.AsZTLSConfig
0.08MB 0.0068% 92.06% 809.25MB 66.61% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).executeParallelHTTP
0.07MB 0.0055% 92.07% 6.47MB 0.53% net/http.(*persistConn).writeLoop
0.06MB 0.0051% 92.07% 69.73MB 5.74% github.com/projectdiscovery/nuclei/v3/pkg/catalog/loader.(*Store).LoadTemplatesWithTags
And this run uses what you suggested, 25 template/host concurrency:
(pprof) top50
Showing nodes accounting for 1191.23MB, 92.63% of 1285.96MB total
Dropped 960 nodes (cum <= 6.43MB)
Showing top 50 nodes out of 136
flat flat% sum% cum cum%
220.54MB 17.15% 17.15% 403.46MB 31.37% fmt.Errorf
182.10MB 14.16% 31.31% 187.27MB 14.56% github.com/projectdiscovery/utils/errors.(*enrichedError).captureStack (inline)
177.04MB 13.77% 45.08% 177.04MB 13.77% github.com/projectdiscovery/nuclei/v3/pkg/protocols/common/generators.MergeMaps (inline)
104.67MB 8.14% 53.22% 104.82MB 8.15% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).responseToDSLMap
74.08MB 5.76% 58.98% 81.28MB 6.32% github.com/ulule/deepcopier.getTagOptions
70.54MB 5.49% 64.46% 70.54MB 5.49% runtime.malg
50.89MB 3.96% 68.42% 208.12MB 16.18% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).generateRawRequest
40.33MB 3.14% 71.56% 40.33MB 3.14% github.com/projectdiscovery/utils/maps.(*OrderedMap[go.shape.string,go.shape.[]string]).Set (inline)
34.22MB 2.66% 74.22% 34.22MB 2.66% net/http.NewRequestWithContext
22.66MB 1.76% 75.98% 22.66MB 1.76% bytes.growSlice
21.43MB 1.67% 77.65% 80.65MB 6.27% github.com/projectdiscovery/utils/url.ParseURL
18.23MB 1.42% 79.06% 21.67MB 1.69% github.com/projectdiscovery/utils/reader.NewReusableReadCloser
17.32MB 1.35% 80.41% 17.32MB 1.35% bytes.(*Buffer).String (inline)
17.03MB 1.32% 81.74% 17.03MB 1.32% reflect.New
14.07MB 1.09% 82.83% 14.07MB 1.09% strings.(*Builder).grow
12.20MB 0.95% 83.78% 13.44MB 1.05% github.com/yl2chen/cidranger.newPrefixTree
12MB 0.93% 84.71% 12.02MB 0.93% github.com/syndtr/goleveldb/leveldb/memdb.New
10.28MB 0.8% 85.51% 10.28MB 0.8% gopkg.in/yaml%2ev2.(*parser).scalar
9.84MB 0.77% 86.28% 201.71MB 15.69% fmt.Sprintf
9.06MB 0.7% 86.98% 21.64MB 1.68% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.generateVariables
9.02MB 0.7% 87.68% 9.02MB 0.7% github.com/projectdiscovery/nuclei/v3/pkg/protocols/utils.GenerateDNSVariables
7.67MB 0.6% 88.28% 558.04MB 43.39% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).executeRequest
7.21MB 0.56% 88.84% 7.21MB 0.56% strings.genSplit
6.75MB 0.52% 89.36% 6.75MB 0.52% github.com/projectdiscovery/ratelimit.(*Limiter).Take
6.75MB 0.52% 89.89% 6.75MB 0.52% regexp/syntax.(*compiler).inst (inline)
5.39MB 0.42% 90.31% 63.35MB 4.93% github.com/projectdiscovery/retryablehttp-go.NewRequestFromURLWithContext
5.25MB 0.41% 90.72% 53.95MB 4.20% github.com/projectdiscovery/utils/url.absoluteURLParser
5.20MB 0.4% 91.12% 29.08MB 2.26% github.com/projectdiscovery/nuclei/v3/pkg/parsers.ParseTemplate
4.90MB 0.38% 91.50% 43MB 3.34% github.com/projectdiscovery/nuclei/v3/pkg/templates.parseTemplate
3.44MB 0.27% 91.77% 370.45MB 28.81% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*requestGenerator).Make
3.27MB 0.25% 92.02% 7.54MB 0.59% net/http.(*Client).do.func2
3.13MB 0.24% 92.27% 10.94MB 0.85% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http/httpclientpool.wrappedGet
1.68MB 0.13% 92.40% 48.86MB 3.80% github.com/projectdiscovery/utils/url.(*OrderedParams).Decode
0.55MB 0.043% 92.44% 8.64MB 0.67% regexp.compile
0.54MB 0.042% 92.48% 7.26MB 0.56% github.com/projectdiscovery/retryablehttp-go.NewClient
0.52MB 0.041% 92.52% 100.17MB 7.79% net/http.(*Transport).dialConn
0.52MB 0.04% 92.56% 20.74MB 1.61% github.com/projectdiscovery/nuclei/v3/pkg/protocols/http.(*Request).Compile
0.19MB 0.015% 92.58% 22.85MB 1.78% bytes.(*Buffer).grow
0.13MB 0.01% 92.59% 83.98MB 6.53% github.com/projectdiscovery/fastdialer/fastdialer.AsZTLSConfig
0.12MB 0.0096% 92.60% 97.36MB 7.57% github.com/projectdiscovery/fastdialer/fastdialer.(*Dialer).DialTLS
0.08MB 0.0064% 92.60% 431.91MB 33.59% github.com/projectdiscovery/nuclei/v3/pkg/tmplexec/generic.(*Generic).ExecuteWithResults
0.08MB 0.0061% 92.61% 44.34MB 3.45% github.com/projectdiscovery/nuclei/v3/pkg/templates.Parse
0.06MB 0.0049% 92.62% 68.69MB 5.34% github.com/projectdiscovery/nuclei/v3/pkg/catalog/loader.(*Store).LoadTemplatesWithTags
0.05MB 0.0037% 92.62% 194.20MB 15.10% github.com/projectdiscovery/utils/errors.(*enrichedError).Error
0.04MB 0.0034% 92.62% 21.30MB 1.66% net/http.(*persistConn).writeLoop
0.04MB 0.0032% 92.63% 17.75MB 1.38% gopkg.in/yaml%2ev2.(*decoder).sequence
0.04MB 0.0027% 92.63% 431.95MB 33.59% github.com/projectdiscovery/nuclei/v3/pkg/tmplexec.(*TemplateExecuter).Execute
0.03MB 0.0021% 92.63% 196.37MB 15.27% net/url.(*Error).Error
0.02MB 0.0015% 92.63% 425.83MB 33.11% github.com/projectdiscovery/retryablehttp-go.(*Client).Do
0.01MB 0.00092% 92.63% 97.23MB 7.56% github.com/projectdiscovery/fastdialer/fastdialer.(*Dialer).dial
Actually your proposal of 25 concurrent hosts/templates worked on one of my test setups. I set up a memory-constrained container with 2048MB RAM and aggressive GC settings: GOMEMLIMIT=500MiB and GOGC=20. When the scan reached 35% the RAM increased suddenly and the GC was trying really hard to free the memory. It got right up to 2GB and stayed there for a bit. I thought it would be OOM killed, but it managed to keep pace with allocs/frees, so it didn't get killed and memory went back down to a sustainable level.
Then, around 75% complete, it shot up again, this time staying at 2GB for a very long time with the CPU pegged at 1500% trying to free all that memory. It succeeded and the scan completed in the end.
My theory is that some templates allocate a ton of memory, and if the concurrency settings are above a certain threshold the allocation rate can surpass the GC's ability to free memory, which ultimately leads to an OOM kill. The only saving grace would be a good amount of free memory and/or fast CPUs that help the GC free memory faster. But we really need guidance on the performance characteristics of nuclei: what are the RAM and CPU requirements for X hosts and Y templates, etc.
Is there a way to know which templates use the most memory? Can we measure the CPU/memory of individual templates? That would be a very good metric to have. If I want to speed things up, I'd like to run the efficient templates faster but slow down on the memory-heavy ones so we don't run out of memory. Some of the big allocations are around the raw requests and responses.
Another update: using 25/25 for host/template concurrency when scanning 360 targets still resulted in an OOM kill, but it ran for a significantly longer time. I will set the garbage collection to aggressive values and try again: GOGC=20 GOMEMLIMIT=500MiB.
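For reference, the same knobs can also be set programmatically from the embedding binary instead of via environment variables (standard library; the 500MiB/20 values are just the ones I'm testing with):

```go
package main

import "runtime/debug"

func init() {
	// Equivalent of GOMEMLIMIT=500MiB: a soft memory limit the runtime tries to respect.
	debug.SetMemoryLimit(500 << 20)
	// Equivalent of GOGC=20: garbage collect far more aggressively than the default of 100.
	debug.SetGCPercent(20)
}

func main() {
	// ... run the nuclei SDK scan as usual ...
}
```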
@stan-threatmate, although it is helpful in production, tuning the GC while debugging memory leaks might not help, so I would recommend just trying with the normal options. As you already know, Go does not immediately release memory; it releases it gradually in the hope of reusing it instead of allocating it again and again. That is why tuning the GC aggressively would only cause more CPU utilization without any direct benefit (especially in this case).
Based on your suggestion, I have just added more docs on how nuclei consumes resources and all the factors involved: https://docs.projectdiscovery.io/tools/nuclei/mass-scanning-cli
From the profile details you have shared, it looks like these are not the actual inflection points:
Showing nodes accounting for 1191.23MB, 92.63% of 1285.96MB total <- heap memory is 1.2GB
Also, looking at the profile data above, I can only tell that the top heap-consuming functions shown in those profiles are expected. generators.MergeMaps, generateRawRequest, etc. hold raw response data in maps, and given the concurrency I think this much is expected. Since this data is obtained from the targets being scanned, it is difficult to estimate how much data is being held at any point.
If you think it is related to a particular set of templates, I would suggest splitting the templates and running separate scans. You can use the protocolType option in the nuclei SDK to effectively filter templates. If the problem is related to a specific set of templates, or to a feature used by a specific template, it will show up in one of those scan results/observations. This is one of the effective strategies I used when I fixed memory leaks recently in v3.1.8 to v3.1.10.
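A sketch of what that filtering could look like from the SDK side (assuming the v3 lib package; WithTemplateFilters and the TemplateFilters field names are a recollection of the API and may differ slightly):

```go
package main

import nuclei "github.com/projectdiscovery/nuclei/v3/lib"

func main() {
	// Run only HTTP templates of critical severity in one scan, then invert the
	// filters in a second scan to bisect which template set is leaking memory.
	// Field names here are assumptions and may differ in your SDK version.
	ne, err := nuclei.NewNucleiEngine(
		nuclei.WithTemplateFilters(nuclei.TemplateFilters{
			ProtocolTypes: "http",     // comma-separated protocol list
			Severity:      "critical", // narrow down further by severity
		}),
	)
	if err != nil {
		panic(err)
	}
	defer ne.Close()
	// ... LoadTargets / ExecuteWithCallback as usual ...
}
```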
The other/alternative strategy is to continuously capture snapshots of the nuclei process memory (using memstats, or manually via a bash script using the PID). Subtracting a normal profile from a sudden-spike profile using -diff_base will pinpoint the function responsible.
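A minimal, standard-library-only sketch of such a snapshot loop (file names and interval are arbitrary):

```go
package main

import (
	"fmt"
	"log"
	"os"
	"runtime"
	"runtime/pprof"
	"time"
)

// dumpHeapProfiles writes a heap profile plus basic memstats every interval so
// that a normal profile and a spike profile can later be compared with:
//   go tool pprof -diff_base heap-0001.pprof heap-0009.pprof
func dumpHeapProfiles(interval time.Duration) {
	var m runtime.MemStats
	for i := 0; ; i++ {
		time.Sleep(interval)

		runtime.ReadMemStats(&m)
		log.Printf("heap=%dMB sys=%dMB numGC=%d", m.HeapAlloc>>20, m.Sys>>20, m.NumGC)

		f, err := os.Create(fmt.Sprintf("heap-%04d.pprof", i))
		if err != nil {
			log.Println(err)
			continue
		}
		if err := pprof.WriteHeapProfile(f); err != nil {
			log.Println(err)
		}
		f.Close()
	}
}

func main() {
	go dumpHeapProfiles(30 * time.Second)
	// ... run the nuclei scan here; profiles accumulate while it runs ...
	select {}
}
```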
I will try to reproduce this using CLI with 800 targets.
Finally, if you want to customize any execution behaviour or use your own logic, I would suggest taking a look at the core package, which contains how targets x templates are executed.
btw the nuclei-docs repo is deprecated and the latest docs are available at https://docs.projectdiscovery.io/introduction
I'll continue debugging this but here is a run on a 16GB container with 8 CPUs which is scanning 358 hosts with 25/25 host/template concurrency and max host error set to 15000:
You can clearly see that there is some event that causes a runaway memory utilization at the end of the scan. At this point the nuclei stats showed 43% completion but I am not sure how trustworthy that percentage is.
You can see the GC working really hard throughout the scan to keep the memory low.
@tarunKoyalwar thank you for looking into this issue!
I have a question about -max-host-error. I want to use nuclei to try all templates on a host. If I understand this correctly, we need to set the mhe to a large number in order to not stop the scan prematurely, right?
Also, about the -response-size-read option: do templates care about this value, and if I set it to 0 to save memory, would that hurt how templates work?
About the -rate-limit option: I haven't seen it make any difference, at least according to the nuclei stats. Is the RPS metric reported by the stats controlled by this option?
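For context, this is roughly where I am setting those knobs in the SDK. A sketch only; WithNetworkConfig, MaxHostError and WithGlobalRateLimit are my assumption of the v3 lib option/field names and may not match the exact API:

```go
package main

import (
	"time"

	nuclei "github.com/projectdiscovery/nuclei/v3/lib"
)

func main() {
	// Sketch: option/field names are assumptions and may differ per SDK version.
	ne, err := nuclei.NewNucleiEngine(
		nuclei.WithNetworkConfig(nuclei.NetworkConfig{
			MaxHostError: 15000, // large value so hosts aren't skipped prematurely
		}),
		nuclei.WithGlobalRateLimit(150, time.Second), // what I expect to drive the reported RPS
	)
	if err != nil {
		panic(err)
	}
	defer ne.Close()
}
```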
Update: I am scanning 47 hosts with the following settings and I still get OOM killed on a 16GB RAM, 8 CPU container: GOMEMLIMIT=500MiB GOGC=20
I suspect a template or a group of templates spikes up and causes large memory allocations, because the allocations are stable until an inflection point where things spike.
The steep climb is what makes me believe it is a very specific template or related templates that cause this.
I have the same problem. After reverting to v2.9.15 everything works well, so I think the problem is with one of the 118 templates that are not supported in v2.9.15.
I can confirm that it has to be one of the critical templates. Here are two scans: the first uses only the critical-severity templates; the second uses everything except the critical-severity templates.
We can see that when we skip the critical-severity templates, memory usage is minimal.
@stan-threatmate FYI, we were able to reproduce this some time ago and are working on locating and fixing the offending code.
@tarunKoyalwar thank you!
The issue will be fixed as part of https://github.com/projectdiscovery/nuclei/issues/4800
@stan-threatmate, can you try running the scan using the SDK with these 4 templates disabled? You can use -et and its corresponding option in the SDK to exclude these templates (see the sketch after the list):
http/cves/2019/CVE-2019-17382.yaml
http/cves/2023/CVE-2023-24489.yaml
http/fuzzing/header-command-injection.yaml
http/fuzzing/wordpress-weak-credentials.yaml
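Something along these lines from the SDK side (a sketch: I'm assuming an exclude-IDs filter exists in the v3 lib package and that the template IDs match the file basenames above, so verify both against your SDK version):

```go
package main

import nuclei "github.com/projectdiscovery/nuclei/v3/lib"

func main() {
	// Sketch only: TemplateFilters.ExcludeIDs and the IDs below are assumptions.
	ne, err := nuclei.NewNucleiEngine(
		nuclei.WithTemplateFilters(nuclei.TemplateFilters{
			ExcludeIDs: []string{
				"CVE-2019-17382",
				"CVE-2023-24489",
				"header-command-injection",
				"wordpress-weak-credentials",
			},
		}),
	)
	if err != nil {
		panic(err)
	}
	defer ne.Close()
	// ... LoadTargets / ExecuteWithCallback as usual ...
}
```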
@tarunKoyalwar I removed the templates you mentioned and my first test scan finished successfully. Memory looks great. Next I am running a large scan over 400 hosts, but it will take 15h to complete, so I'll report back tomorrow. I also used more aggressive settings:
Removing the 5 templates allowed us to scan about 400 hosts with no problem on a 16GB container with 8 CPUs
@stan-threatmate The issue is high parallelism in bruteforce templates, which causes a lot of buffer allocations to read HTTP responses (up to the default 10MB). To mitigate this, a generic memory monitor mechanism has been implemented in https://github.com/projectdiscovery/nuclei/pull/4833 (when global RAM occupation is above 75%, the parallelism is decreased to 5). I was able to complete multiple runs without the scan being killed on an 8GB system.
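To illustrate the idea only (not the actual code in the PR), a monitor along these lines watches global RAM occupation and shrinks a worker pool when it crosses the threshold; the 75%/5 values mirror the ones above, and gopsutil here is used purely for the sketch:

```go
package main

import (
	"log"
	"sync/atomic"
	"time"

	"github.com/shirou/gopsutil/v3/mem"
)

// monitorMemory lowers the advertised parallelism when system RAM usage
// crosses maxPercent, and restores it when usage drops back below.
func monitorMemory(parallelism *atomic.Int64, normal, reduced int64, maxPercent float64) {
	for range time.Tick(5 * time.Second) {
		vm, err := mem.VirtualMemory()
		if err != nil {
			log.Println(err)
			continue
		}
		if vm.UsedPercent > maxPercent {
			parallelism.Store(reduced)
		} else {
			parallelism.Store(normal)
		}
	}
}

func main() {
	var parallelism atomic.Int64
	parallelism.Store(25)
	go monitorMemory(&parallelism, 25, 5, 75.0)
	// ... workers consult parallelism.Load() before spawning new requests ...
	select {}
}
```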
@Mzack9999 thank you! How is the RAM limit determined? Is it based on the free memory or the total memory? Can we configure the limits (75% and 5 threads) in the SDK?
Update: I looked at your changes and added some comments.
Second update: another mechanism you could use is a rate limit on memory allocations per second. If 10MB buffers can be allocated, we could limit buffer allocations to 50 per second, for 500MB of RAM per second. Ideally this would be configurable.
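A sketch of what I mean, using golang.org/x/time/rate; the 50/sec figure is the example above, and this is only an illustration of the suggestion, not something nuclei does today:

```go
package main

import (
	"context"

	"golang.org/x/time/rate"
)

// allocLimiter allows at most 50 response-buffer allocations per second,
// i.e. roughly 50 x 10MB = 500MB/s of new buffers in the worst case.
var allocLimiter = rate.NewLimiter(rate.Limit(50), 50)

// newResponseBuffer blocks until the limiter grants a slot, then allocates.
func newResponseBuffer(ctx context.Context, size int) ([]byte, error) {
	if err := allocLimiter.Wait(ctx); err != nil {
		return nil, err
	}
	return make([]byte, 0, size), nil
}

func main() {
	buf, _ := newResponseBuffer(context.Background(), 10<<20) // up to a 10MB buffer
	_ = buf
}
```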
Nuclei version:
3.1.8
Current Behavior:
I run the nuclei SDK as part of a binary deployed in a Linux (Alpine) container with memory limits of 8GB and 16GB. I use the standard templates. In both cases it gets OOM killed. Here are the settings I specify.
I tried this with 115 and 380 hosts, and both runs have memory issues. What is causing the high memory utilization? I am saving the results from the nuclei scan in a list. Could the results be so large that they fill up the memory?
I run nuclei like this:
Expected Behavior:
The nuclei SDK should trivially handle scanning hosts with the above settings. It would be great to have an example of SDK settings that match the default nuclei CLI scan settings.
What would be the equivalent settings for the SDK?
Additionally, what settings in the SDK control memory utilization? It would be good to document those as well.
Steps To Reproduce:
Use the above settings and set up a scan. Watch it consume more and more memory over time. It is easier to reproduce with 115 (or more) websites.
Anything else: