iakov opened this issue 10 months ago
Thanks for the report.
Is this a custom runner image? CentOS 7 is not listed at https://github.com/actions/runner-images.
To set expectations right: I can't provide any support for this.
> that introduces node20 dependency

You probably saw the merge commit; the change is here: https://github.com/hendrikmuhs/ccache-action/pull/177. The reason is the node 16 deprecation.

> It is not a "minor fix".

You are right about the classification, but I would not make a big deal out of it either. It should be totally fine for you to stay on version 1.2.11. I don't have the capacity to maintain two versions, especially as node 16 is EOL anyway. You might as well fix it on your side if you are using a custom runner.
Anyway, can you explain your setup in more detail and if possible provide logs?
Are you affected by this? Can you confirm that you can execute node 18?
If so, downgrading to node 18 might be the better alternative. I would love to avoid maintaining two versions.
This also occurs with workflows that use older docker images. See https://github.com/danmar/cppcheck/actions/runs/7677304162/job/20926809147?pr=5917 and https://github.com/danmar/cppcheck/pull/5918.
It also seems like this caused a major slowdown in the "Post ccache" step, which now takes well over two minutes instead of a few seconds:
Before: https://github.com/danmar/cppcheck/actions/runs/7646437504/job/20835239981 After: https://github.com/danmar/cppcheck/actions/runs/7681894296/job/20935549647?pr=5918
The jobs pinned to version 1.2.11 are not experiencing this and are much faster, with far less data to cache:
https://github.com/danmar/cppcheck/actions/runs/7681894302/job/20935550032?pr=5918
@firewave
Your first problem is similar to the reported one: you are running the action inside a custom image, and that image seems to lack node 20. It would be nice if you could find out whether node 18 would work.
It would be technically better to run the action outside of docker (on the official GitHub runner) and use docker only for building. This requires exposing the host filesystem and setting the ccache path accordingly, like it is done here.
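A minimal sketch of that layout, assuming a hypothetical `my-centos7-toolchain` image that already contains the compiler and ccache, and assuming the action picks up the cache directory via `CCACHE_DIR` (paths, names and the build command are illustrative, not taken from the linked workflow):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    env:
      # keep ccache's cache directory on the host so the action can restore and save it
      CCACHE_DIR: ${{ github.workspace }}/.ccache
    steps:
      - uses: actions/checkout@v4
      # the action itself runs on the official runner, so its node version is not an issue
      - uses: hendrikmuhs/ccache-action@v1.2
      # only the compile step runs inside the old image
      - name: Build inside container
        run: |
          docker run --rm \
            -v "${{ github.workspace }}:/src" \
            -e CCACHE_DIR=/src/.ccache \
            my-centos7-toolchain \
            bash -c "cd /src && make CC='ccache gcc' CXX='ccache g++'"
```

The important part is that ccache inside the container writes to the same directory that the action restores and saves on the host.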
There is no alternative to upgrading the nodejs version; node 16 is EOL, as clearly stated by GitHub. However, we are flexible with respect to upgrading it to 18 vs. 20.
Your second problem: it ran slower because it did not find a cache. This seems like a different problem, which might or might not be caused by the nodejs upgrade. In such cases it is best to let it run a second time to see whether the problem persists. It might be that the underlying GitHub cache layer puts something more into the key. A second run should get the cache from the first run.
@hendrikmuhs, we use containers for CentOS 7. Imho, there is no need to maintain two versions. Please do not introduce such a dramatic change with a patch version increment next time :) Probably no action is actually needed at the moment, because all the poor projects unhappy with the hidden nodejs version shift will stick with 1.2.11 after reading this thread.
@iakov
As said, there has never been a guarantee that custom runners work. All official GitHub runners support node 20.
Repeating my question: would node 18 work? If you or anyone else affected can answer this question, I will happily change this; otherwise it stays as it is. I don't have the resources to test this myself.
Thank you. Node 18 does not help: node 18 requires a newer glibc and is incompatible with Ubuntu 18.04 and other obsolete/EOL releases. The only thing I ask you about is versioning. We are happy to use the unsupported older 1.2 version of your action, as we do with checkout@v3; it works fine. I kindly ask you to tag the release as v2, or at least v1.3, for new features with new incompatible dependencies. However, maybe those few projects with broken CI and CD will come to this thread for a solution and choose to stick with 1.2.11 until their build servers are updated.
I second starting a new version for this change and leaving v1.2 on Node 16.
> Your second problem: it ran slower because it did not find a cache. This seems like a different problem, which might or might not be caused by the nodejs upgrade. In such cases it is best to let it run a second time to see whether the problem persists. It might be that the underlying GitHub cache layer puts something more into the key. A second run should get the cache from the first run.
It did find a cache; otherwise it would not have been able to restore it and have any hits. Here are the same results from a main branch build after the version has been pinned for the docker-based workflows:
1.2.11: https://github.com/danmar/cppcheck/actions/runs/7687753684/job/20948156884 1.2.12: https://github.com/danmar/cppcheck/actions/runs/7687753675/job/20948156975
The stats are different because different ccache versions are being used. So I wondered whether this might just be an issue with the newer ccache version, but that is refuted by a build on a newer platform which also uses 1.2.11:
https://github.com/danmar/cppcheck/actions/runs/7687753684/job/20948157383
More 1.2.12 cases on various runners without docker involved: https://github.com/danmar/cppcheck/actions/runs/7687753687
I do assume this is actually a Node issue and I will file an issue upstream. We should also track this in a different issue within this repo.
@firewave
You might have a problem with cache eviction, as I see many, many jobs. Back to your initial post:
> It did find a cache; otherwise it would not have been able to restore it and have any hits.
This is from your run:
    Run hendrikmuhs/ccache-action@v1.2
    Install ccache
    Restore cache
    No cache found.
    Configure ccache, linux
Source:

> After: https://github.com/danmar/cppcheck/actions/runs/7681894296/job/20935549647?pr=5918
Update:
Or do you mean that it finds the cache now? I am not sure I fully understand. What we know so far is that after the update, the cache was not found.
The inner workings of the cache, meaning the GitHub backend side (not this action, not ccache), are to the best of my knowledge not open/public source. I found other cases where it behaved strangely; maybe the node version is somehow baked into the key in an implicit way. We also had similar behaviour after GitHub upgraded the client libraries. In summary, it happens that a cache entry is for some reason missing, but usually it fixes itself with consecutive runs. This is just the usual SLA of a cache; as long as it works most of the time, this seems fine.
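If the key drift really is implicit, one thing worth trying is passing an explicit key segment, so that the part of the key you control stays identical across runs before and after the upgrade. A minimal sketch, assuming the action's `key` input behaves as described in its README (the value is only illustrative):

```yaml
- name: ccache
  uses: hendrikmuhs/ccache-action@v1.2
  with:
    # explicit key segment shared by all runs of this job
    key: ${{ github.job }}-${{ runner.os }}
```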
The parallel issue https://github.com/actions/runner/issues/2906 reports some success after a downgrade; however, https://github.com/nodejs/node/blob/v18.x/BUILDING.md#platform-list clearly states the newer glibc requirement for x64.
In general I advise against tag sliding. I slide the tag on release, but I consider this bad practice. The suggested practice is pinning an explicit version in combination with dependabot.
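A minimal sketch of that combination, assuming the standard `.github/dependabot.yml` location and the `github-actions` package ecosystem; the pinned release below is only an example:

```yaml
# .github/dependabot.yml: let dependabot propose updates for actions pinned to exact versions
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```

In the workflow itself, the action would then be referenced by an exact release, e.g. `uses: hendrikmuhs/ccache-action@v1.2.11`, and dependabot opens a pull request whenever a newer release appears.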
The action has very little QA coverage, especially when it comes to exotic setups.
I am seeing the same issue with older Ubuntu docker images. Version @1.2.11 seems to work, but @v1 fails.
https://github.com/robotraconteur/robotraconteur/actions/runs/7691430275
Alright, I think here is what we should do:

- revert the switch to node 20 back to 16 and release it as 1.2.13 with tags for v1 and v1.2
- add a note in the Readme
- jump to node 18 on main and release it as v2 and go forward from there

If you want to help:

- it would be nice to set up a test that executes the action in a custom environment, e.g. an ubuntu 16.04 container
> Alright, I think here is what we should do:
>
> - revert the switch to node 20 back to 16 and release it as 1.2.13 with tags for v1 and v1.2
> - add a note in the Readme
> - jump to node 18 on main and release it as v2 and go forward from there
Agreed!
> If you want to help:
>
> - it would be nice to set up a test that executes the action in a custom environment, e.g. an ubuntu 16.04 container
Can do.
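A minimal sketch of what such a test could look like, assuming an old image like `ubuntu:16.04` is still pullable and that simply running the action inside a container job is enough to surface the node/glibc problem (job and step names are illustrative):

```yaml
jobs:
  # hypothetical job that exercises the action inside an old container image
  test-old-container:
    runs-on: ubuntu-latest
    container:
      image: ubuntu:16.04
    steps:
      - name: Run ccache-action inside the container
        uses: hendrikmuhs/ccache-action@v1.2
      - name: Smoke test
        run: ccache --version
```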
I am ready to test new tags for obsolete builders. You are free to send PRs changing this line: https://github.com/trikset/trikRuntime/blob/master/.github/workflows/centos.yml#L73 (or I can do it on request).
Even with pinning to version 1.2.11, the two-minute hang is now present because the underlying runners have all been updated to Node 20.
Thus, this action fails to work where it was used via @v1.2. Please remove this release or re-release it as v1.2.14, the backward-compatible version. The problem is hidden in the cache action version 3.2.3 (PR #175), which introduces the node20 dependency. I suggest re-releasing the update as v1.3 or even v2, because it is actually not a "minor fix".