Closed l0gicgate closed 7 months ago
@l0gicgate , The logs show that there was a matching cache :
The next step is to download the contents of the cache from the archiveLocation. That has failed in the run. It is not easy to guess the cause of failure with the logs shared here. Most commonly, your blob storage would have failed to return the cached file. It might help to check your blob storage configuration and access as a next step.
Update on the root cause of this issue:
TL;DR GitHub Actions is configured with AWS S3 storage with Server Side Encryption via KMS (SSE-KMS). GitHub Enterprise Server uses presigned URLs to download the cache. SSE-KMS requires using AWS Signature Version 4 to download the content from S3 that GitHub Enterprise Server 3.7.1 does not support
Potential Solutions
Long Version
It is difficult to find the actual download URL because actions/cache
sets it as a secret that Actions will mask and the metadata URL uses the /_services
endpoint which I could not find any documentation on how to query (I just get a 403 Forbidden error when I try to hit it from the administrative shell)
I did however find some weird behavior after logging in via the administrative shell.
I ran ghe-actions-check -s blob
and got the following:
Storage tests finished, output is below:
============================================================
LR actions> Creating new log file /LR/Logs/Actions_OnPrem_Test-StorageConnection_***.log
LR actions> Test-Storage Connection utility for GitHub Enterprise
LR actions> Effective remote blob provider with [s3].
LR actions> Configured Blob provider is : [Microsoft.VisualStudio.Services.Cloud.AmazonS3BlobProvider]
LR actions> 1. Testing Upload content : Passed
LR actions>
LR actions> 2. Testing Download content : Passed
LR actions>
LR actions> 3. Testing Direct Download content : Failed
LR actions>
LR actions>
LR actions>
LR actions> 4. Testing Delete content : Passed
LR actions>
LR actions> 5. Testing MultiPart Upload content : Passed
LR actions>
LR actions> 6. Testing Delete large content : Passed
LR actions>
LR actions>
LR actions> Warning! DirectDownload (Download object using presigned URL) Test failed. This failure will not affect configuring Actions. Some features (like Actions cache) might not work as expected.
LR actions>
LR actions>
LR actions> Storage tests passed with warnings.
LR actions>
============================================================
Blob Storage is healthy!
After some diving into what this is doing, I did ghe-actions-console -s actions -c 'Test-StorageConnection -TreatWarningAsErrors -Debug'
and got:
Creating new log file /LR/Logs/Actions_OnPrem_Test-StorageConnection_***.log
Test-Storage Connection utility for GitHub Enterprise
DEBUG: Loading blob provider configuration from file
Effective remote blob provider with [s3].
DEBUG: Blob provider [s3] with container prefix [***] is configured on this instance.
Configured Blob provider is : [Microsoft.VisualStudio.Services.Cloud.AmazonS3BlobProvider]
...
DEBUG: Acquiring directdownload URL for file [***] from container [***]
DEBUG: Received direct download uri : https://***/actions-***?AWSAccessKeyId=***&Expires=***&Signature=*** for [***] in container ***. Downloading content..
DEBUG: Pre-signed URL Response code : BadRequest, Content: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidArgument</Code><Message>Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.</Message><ArgumentName>Authorization</ArgumentName><ArgumentValue>null</ArgumentValue><RequestId>***</RequestId><HostId>***</HostId></Error> for [***] in container ***
...
Test-StorageConnection: 1. Testing Upload content : Passed
2. Testing Download content : Passed
3. Testing Direct Download content : Failed
4. Testing Delete content : Passed
5. Testing MultiPart Upload content : Passed
6. Testing Delete large content : Passed
Storage tests failed, please see output for more details.
Putting the warning message together with the test failure points reasonably clearly to the root cause being S3 SSE-KMS and GitHub Enterprise not using AWS Signature V4. Either updating GHES or using SSE-S3 might resolve the issue but I haven't had a chance to test this. Will update this thread once I have tested these options
Confirmed that downgrading to AWS S3-SSE for the Actions S3 bucket fixes the issue Upgrading to GitHub Enterprise Server 3.7.17 did not resolve the issue
Closing as resolved. See @elliottpope's comment for anyone else experiencing this issue.
Hello, We also encountered this issue (GHES 3.8, using SSE-S3): it was caused by not giving enough S3 authorizations to the GHES instance (various read and write permissions). Just for testing: create a test S3 bucket, give all the permissions to the GHES instance, and configure the GHES instance to use this bucket in the "Actions" settings. It solved the issue for us. Unfortunately, I don't know yet which exact permissions are needed for the instance to work (we gave more permissions than needed). It took us months (!) to find the root cause. I guess various origins can cause this issue: this log message doesn't offer any help in finding a solution, it ought to be more detailed.
Edit: Adding https://github.com/actions/cache/issues/1192 and https://github.com/actions/cache/issues/1052 for reference.
A little late but in case others see this, you can debug this with the following check:
ghe-actions-check -s blob
If the Testing Direct Download content
check fails with the following error on the download, then it's a signature version mismatch:
Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.
Actions has a feature flag that will force v4
that can be flipped on the Actions and ArtifactCache services running on GHES:
ghe-actions-console -s artifactcache --command 'Set-FeatureFlag -FeatureName Microsoft.VisualStudio.Services.Cloud.AmazonS3.ForceSigVersion4 -State On'
ghe-actions-console -s actions --command 'Set-FeatureFlag -FeatureName Microsoft.VisualStudio.Services.Cloud.AmazonS3.ForceSigVersion4 -State On'
Description
I am using
action/cache@v3
in a GHES instance and it appears that we cannot restore the cache during PR runs:Here's the relevant bit from my workflow:
Here is the step where it caches the artifacts successfully:![CleanShot 2023-10-03 at 12 49 34@2x](https://github.com/actions/cache/assets/6510935/254b1d81-83b1-4ad3-b469-077e37ab84a9)
Here's the debug logs: