Node 20 appears to cause major slowdown in `Post ccache` step

firewave commented 9 months ago

The update to Node 20 appears to cause a slowdown in the Post ccache step.

In my case the step was taking single-digit seconds with the version used on Node 16 and always 2+ minutes when Node 20 was involved. The used platform or prior existence of a cache does not seem to have any impact.

See https://github.com/hendrikmuhs/ccache-action/issues/178#issuecomment-1913376963 https://github.com/hendrikmuhs/ccache-action/issues/178#issuecomment-1913745651

firewave commented 9 months ago

Continuing from #178 here.

My previous examples were flawed and your observations seemed correct. But let's look at a better example

https://github.com/danmar/cppcheck/actions/workflows/valgrind.yml?query=branch%3Amain.

This seems to be a good slice as it contains both versions and also both states.

The latest builds is descending order:

https://github.com/danmar/cppcheck/actions/runs/7644329997/job/20828411569 - 1.2.12 / cache found -> 2+ minutes https://github.com/danmar/cppcheck/actions/runs/7687753686/job/20948157176 - 1.2.12 / cache found -> 2+ minutes https://github.com/danmar/cppcheck/actions/runs/7687504907/job/20947611473 - 1.2.12 / no cache found -> 2+ minutes https://github.com/danmar/cppcheck/actions/runs/7646437494/job/20835238755 - 1.2.11 / cache found -> 3 seconds https://github.com/danmar/cppcheck/actions/runs/7644329997/job/20828411569 - 1.2.11 / no cache found -> 3 seconds

And some random builds from the past (unfortunately I did not find one with no cache): https://github.com/danmar/cppcheck/actions/runs/7507173423/job/20440160792 - cache found / 5 seconds https://github.com/danmar/cppcheck/actions/runs/7233590995/job/19709031614 - cache found / 4 second

chirag-droid commented 9 months ago

I can take up this issue.

It is happening due to how Node 20 works

chirag-droid commented 9 months ago

I think this issue needs to be fixed by @actions/ccache package or by NodeJS implementing a fix?

firewave commented 9 months ago

I have no idea how Node works and what API it offers - but I think the proper fix would be in the code which is doing the HTTP request, telling the connection to be closed via the Connection: close HTTP header.

I am very vaguely remembering encountering some similar issue with keep-alive in Java quite a while ago.

chirag-droid commented 9 months ago

I have opened the issue with https://github.com/actions/toolkit/issues/1643

I am surprised many packages inside https://github.com/actions are using process.exit so the action doesn't hang.

I will look into this if I can fix it.

hendrikmuhs commented 9 months ago

Related to the other thread: Would a downgrade to node 18 help? We don't have to use node 20, all we need is not 16, because github put it EOL. In the long run we have to upgrade of course, but a downgrade would buy some time so github folks can figure out whats wrong.

firewave commented 9 months ago

Related to the other thread: Would a downgrade to node 18 help?

Just updating to Node 18 should help with this issue since the behavior has only changed in Node 19.

We don't have to use node 20, all we need is not 16, because github put it EOL.

Actually we do: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

But as you commented in the other ticket let's do this in incremental steps one problem at a time...

chirag-droid commented 9 months ago

actions/toolkit seems unresponsive about issues and prs

Crzyrndm commented 8 months ago

Any chance of #182 getting merged, even as a temporary fix? The current situation is unpleasant and there doesn't seem to be any activity from GH to resolve the root cause

hendrikmuhs commented 7 months ago

I can merge it once the test-failures are resolved (maybe it is just a matter of merging the latest changes).

mweberxyz commented 7 months ago

GitHub should be able to fix this. I identified the exact connection they are leaving open so hopefully every downstream consumer doesn't need to implement the process.exit hack: https://github.com/actions/toolkit/issues/1702

As @chirag-droid said though.. they don't seem to take community PRs or issues at very high priority 😕

Naros commented 6 months ago

Hi all, I've noticed that the post-ccache takes between 2.5 and 3 minutes while the restore takes 10-15 seconds. Is there any way I can speed this up on our builds?

junaire commented 6 months ago

Hi, is there any workaround about this?

hendrikmuhs / ccache-action

Node 20 appears to cause major slowdown in `Post ccache` step #181