jenkins-infra / stories

A static site for "Jenkins Is The Way"
https://stories.jenkins.io
Apache License 2.0

chore(Jenkins) unify pipelines to only `Jenkinsfile` #16

Closed dduportal closed 1 year ago

dduportal commented 1 year ago

Ref. https://github.com/jenkins-infra/helpdesk/issues/3157

halkeye commented 1 year ago

[2022-11-10T16:41:42.556Z] Killed script returned exit code 137

I'm pretty sure that's a memory issue (OOM killer) and not something you did, btw.

(I'm super curious and have just been peeking at what you've been trying to do.)

dduportal commented 1 year ago

You are absolutely right. But there is a "peak" of builds on ci.jenkins.io and it behaves weirdly. @smerle33 is trying it on infra.ci with the same pod template (2 CPUs + 4 GB); we'll see if the OOM happens there as well.

dduportal commented 1 year ago

Definitely an OOM. But I mean… 4 GB. What is that task doing that requires so much memory 😱

(screenshot: Capture d'écran 2022-11-10 à 18 33 48)

dduportal commented 1 year ago

We'll finish on Monday.

halkeye commented 1 year ago

It could be resizing images, but I don't think it's that.

https://www.gatsbyjs.com/docs/how-to/performance/resolving-out-of-memory-issues/ has some tips.

The two I can see that might help from this list are:

Try reducing the number of cores

Gatsby defaults to using the number of physical cores on your machine to speed up builds, and parallelizes during certain steps of the process (image processing, JavaScript bundle creation, and HTML generation).

So if you're experiencing out of memory issues during any of those steps, you can try setting the environment variable GATSBY_CPU_COUNT to a lower number, like 2. Note that this will slow your builds down!

Increase allocated memory and/or upgrade your hardware.

The default node process in node >= 12 has a 2GB "heap size" allocated for memory.

It's plausible that you could increase memory usage significantly. Increasing the max heap size to 4GB or 8GB is quite common (using NODE_OPTIONS=--max-old-space-size set to 4096 or 8192).

The theoretical limit on a 64-bit machine is 16 terabytes, but of course the practical limit is much lower. Netflix experimented with raising this limit to 32GB back in 2014; the Gatsby team is aware of a few sites that run this at 16GB (on machines with 16GB of RAM).
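Applied to a CI shell step, the two mitigations above could look something like this sketch (the specific values — 2 cores, a 4 GB heap — are illustrative, not necessarily what ci.jenkins.io uses):

```shell
# Sketch of the two Gatsby OOM mitigations described above.

# 1. Cap Gatsby's parallelism instead of letting it detect the host's cores.
export GATSBY_CPU_COUNT=2

# 2. Raise Node's max old-space heap to 4 GB.
export NODE_OPTIONS="--max-old-space-size=4096"

# Then build as usual, e.g.: npx gatsby build
echo "GATSBY_CPU_COUNT=$GATSBY_CPU_COUNT NODE_OPTIONS=$NODE_OPTIONS"
```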

dduportal commented 1 year ago

> It could be resizing images, but I don't think it's that.
>
> https://www.gatsbyjs.com/docs/how-to/performance/resolving-out-of-memory-issues/ has some tips.
>
> The two I can see that might help from this list are:
>
> Try reducing the number of cores […]
>
> Increase allocated memory and/or upgrade your hardware. […]

Thanks a lot @halkeye! With the variable GATSBY_CPU_COUNT set to the correct number of vCPUs (i.e. the amount specified in the pod limits), it works as expected without OOM.

(screenshots: Capture d'écran 2022-11-18 à 19 26 54, Capture d'écran 2022-11-18 à 19 27 04)

https://github.com/jenkins-infra/jenkins-infra/pull/2487 increased the available resources for the webbuilder pods, and an updated pipeline is currently being tested with these new resources.
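A sketch of why the value has to match the pod limits rather than be auto-detected: inside a Kubernetes pod, Node sees the node's physical cores, while the effective CPU budget is the cgroup quota divided by the period. The values below are stand-ins (on a real cgroup-v1 agent they would be read from `/sys/fs/cgroup/cpu/cpu.cfs_quota_us` and `cpu.cfs_period_us`), not measurements from the actual agents:

```shell
# Inside a container, the usable CPU count is quota/period, not nproc.
quota=200000    # stand-in for a pod with "limits.cpu: 2"
period=100000   # default cgroup CFS period is 100ms
export GATSBY_CPU_COUNT=$(( quota / period ))
echo "$GATSBY_CPU_COUNT"
```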

halkeye commented 1 year ago

> Not sure why there is a duplicated Jenkinsfile_k8s file, and what the last one containing only "Jenkinsfile" is for (?)

symlink. probably not useful, but that's how git represents it.

lemeurherve commented 1 year ago

> Not sure why there is a duplicated Jenkinsfile_k8s file, and what the last one containing only "Jenkinsfile" is for (?)

> symlink. probably not useful, but that's how git represents it.

Oh right, I didn't notice the corresponding symlink icon next to its filename.

image
dduportal commented 1 year ago

> Not sure why there is a duplicated Jenkinsfile_k8s file, and what the last one containing only "Jenkinsfile" is for (?)

> symlink. probably not useful, but that's how git represents it.

> Oh right, I didn't notice the corresponding symlink icon next to its filename.

What @halkeye said: it is GitHub's representation of a symlink (it's not obvious, TBH). The goal is to check the behavior of both ci.jenkins.io and infra.ci without requiring https://github.com/jenkins-infra/kubernetes-management/pull/3193/files to be merged (otherwise it would break the main deployment until the PR here is merged, which can take some time).
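For illustration, a minimal reproduction of the layout being discussed: a real `Jenkinsfile_k8s` plus a `Jenkinsfile` symlink pointing at it. The file names match the thread; the pipeline content is a stub. Git stores such a symlink as a tiny blob holding the target path (file mode 120000), which is why GitHub renders it as a second file with a symlink icon:

```shell
# Create the "duplicated" file layout in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp"
echo "// pipeline stub" > Jenkinsfile_k8s
ln -s Jenkinsfile_k8s Jenkinsfile
# The symlink resolves to the real pipeline file:
readlink Jenkinsfile
```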