actions / runner-images

GitHub Actions runner images

libcurl build failure on macos11 and macos-latest #7945

Closed disa6302 closed 1 year ago

disa6302 commented 1 year ago

Description

I am building a library on a github runner as part of github actions that requires building libcurl.

Libcurl version: 7.68.0. I am running this using CMake and I see this issue:

[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/schannel_verify.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/sectransp.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/gskit.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/mbedtls.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/mesalink.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vtls/bearssl.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vquic/ngtcp2.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vquic/quiche.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vssh/libssh2.c.o
[ 16%] Building C object lib/CMakeFiles/libcurl.dir/vssh/libssh.c.o
[ 16%] Linking C shared library libcurl.dylib
Undefined symbols for architecture x86_64:
  "_EVP_PKEY_get_id", referenced from:
      _cert_stuff in openssl.c.o
      _get_cert_chain in openssl.c.o
  "_SSL_get1_peer_certificate", referenced from:
      _servercert in openssl.c.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[5]: *** [lib/libcurl.dylib] Error 1
make[4]: *** [lib/CMakeFiles/libcurl.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [build/src/project_libcurl-stamp/project_libcurl-build] Error 2
make[1]: *** [CMakeFiles/project_libcurl.dir/all] Error 2
make: *** [all] Error 2
CMake Error at CMake/Utilities.cmake:93 (message):
  CMake step for libcurl failed: 2
Call Stack (most recent call first):
  CMakeLists.txt:93 (build_dependency)

-- Configuring incomplete, errors occurred!
Error: Process completed with exit code 1.

How do I fix this? I ran the same build on a macOS 11 system and a macOS 12 EC2 instance, and the builds pass on both. It seems to be a GHA-specific issue.

For context, I did not change the CMake file, and the same build passed last month. The build also runs fine on macOS 13.

Platforms affected

Runner images affected

Image version and build link

https://github.com/awslabs/amazon-kinesis-video-streams-producer-c/actions/runs/5602647587/job/15177682079

Current runner version: '2.306.0'
Operating System
Runner Image
Runner Image Provisioner
GITHUB_TOKEN Permissions
Secret source: Actions
Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v3' (SHA:c85c95e3d7251135ab7dc9ce3241c5835cc595a9)
Download action repository 'aws-actions/configure-aws-credentials@v1-node16' (SHA:023daa7fe5f7f817faa31fc0fc4a8d0fb6224ed0)
Complete job name: mac-os-build-gcc

Is it regression?

2.305.0

Expected behavior

The build passes clean.

Actual behavior

The build fails

Repro steps

Try building the linked SDK on a GHA runner with the linked build configuration.

apcraig commented 1 year ago

I am seeing very similar problems starting this week on a totally different project. I'm compiling with GNU Fortran via mpifort for a 2-core MPI test case. Only the multi-core tests fail; if I compile with gfortran for a single-core test case, it works great. The error I get is

mpifort -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -o /Users/runner/cice-dirs/runs/conda_macos_restart_gx3_2x1.macos-latest/cice  CICE.o CICE_FinalMod.o CICE_InitMod.o CICE_RunMod.o ice_arrays_column.o ice_blocks.o ice_boundary.o ice_broadcast.o ice_calendar.o ice_communicate.o ice_constants.o ice_diagnostics.o ice_diagnostics_bgc.o ice_distribution.o ice_domain.o ice_domain_size.o ice_dyn_eap.o ice_dyn_evp.o ice_dyn_evp_1d.o ice_dyn_shared.o ice_dyn_vp.o ice_exit.o ice_fileunits.o ice_flux.o ice_flux_bgc.o ice_forcing.o ice_forcing_bgc.o ice_gather_scatter.o ice_global_reductions.o ice_grid.o ice_history.o ice_history_bgc.o ice_history_drag.o ice_history_fsd.o ice_history_mechred.o ice_history_pond.o ice_history_shared.o ice_history_snow.o ice_history_write.o ice_init.o ice_init_column.o ice_kinds_mod.o ice_memusage.o ice_memusage_gptl.o ice_read_write.o ice_reprosum.o ice_restart.o ice_restart_column.o ice_restart_driver.o ice_restart_shared.o ice_restoring.o ice_shr_reprosum86.o ice_spacecurve.o ice_state.o ice_step_mod.o ice_timers.o ice_transport_driver.o ice_transport_remap.o icepack_aerosol.o icepack_age.o icepack_algae.o icepack_atmo.o icepack_brine.o icepack_firstyear.o icepack_flux.o icepack_fsd.o icepack_intfc.o icepack_isotope.o icepack_itd.o icepack_kinds.o icepack_mechred.o icepack_meltpond_lvl.o icepack_meltpond_topo.o icepack_mushy_physics.o icepack_ocean.o icepack_orbital.o icepack_parameters.o icepack_shortwave.o icepack_snow.o icepack_therm_bl99.o icepack_therm_itd.o icepack_therm_mushy.o icepack_therm_shared.o icepack_therm_vertical.o icepack_tracers.o icepack_warnings.o icepack_wavefracspec.o icepack_zbgc.o icepack_zbgc_shared.o icepack_zsalinity.o  -L/Users/runner/miniconda/envs/cice/lib -lnetcdf -lnetcdff -llapack
ld: file not found: /System/Library/Frameworks/Security.framework/Versions/A/Security for architecture x86_64
collect2: error: ld returned 1 exit status

I have tried macos-latest, macos-11, and macos-13. I have compared the macOS and compiler versions and there have been no changes since last week. I've tried to debug the issue but have gotten nowhere. I've run the test case on my personal iMac without any problem. This path, /System/Library/Frameworks/Security.framework/Versions/A/Security, doesn't exist on GitHub Actions or on my iMac, and it's never been a problem before.

I have a PR into my repo to try to fix this, https://github.com/CICE-Consortium/CICE/pull/847, but I have no clue why this started. It was working fine for years and as recently as last week.

disa6302 commented 1 year ago

Thanks @apcraig. It's sad that the GitHub runners in general are pretty unstable and unreliable across new versions. This is not the first time I have faced issues randomly popping up in unrelated areas without any relevant changes, and I spend hours trying the build out via different avenues to verify whether it is a valid issue or something GHA is doing.

I see you are using macos-13 in the attached link. When I try with macos-13, it mostly works, with a few things I need to figure out in a couple of build configurations.

apcraig commented 1 year ago

macos-13 is just the last thing I tried, normally I'm using macos-latest.

vpolikarpov-akvelon commented 1 year ago

Hi @disa6302, @apcraig. Thank you for reporting this, we will take a look.

ilia-shipitsin commented 1 year ago

@disa6302, I had a quick look at the build logs. It looks like the compilation picks up openssl-1.0.2 and does not see EVP_PKEY (which is available in openssl-1.1.1).

In curl, I see that the OpenSSL directory can be passed using --with-openssl (see https://github.com/curl/curl/blob/master/.github/workflows/macos.yml).

Small question: does compilation work on a clean macos-11 and a clean macos-12? If so, please provide minimal repro steps. Thanks.

disa6302 commented 1 year ago

Thank you @ilia-shipitsin for the quick response. I don't see openssl-1.0.2 being picked up, and I am not sure where 1.0.2 would come from either, because the system-installed version is 3.x and the one installed by the build script is 1.1.1t. I see this line in the attached build logs while trying to build libcurl:

2023-07-19T18:23:15.8726680Z -- Found OpenSSL: /Users/runner/work/amazon-kinesis-video-streams-producer-c/amazon-kinesis-video-streams-producer-c/open-source/lib/libcrypto.dylib (found version "1.1.1t")
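When comparing several runs, it can help to pull this line out of each build log automatically. The following is a minimal sketch, assuming the log uses CMake's usual `(found version "X")` suffix and that the library path contains no spaces; `found_openssl` is a hypothetical helper name. Note that this line reports the library CMake selected. It does not by itself show which OpenSSL headers the compiler ends up using.

```shell
#!/bin/sh
# Print "<version> <library path>" for every "Found OpenSSL" line
# in a CMake log read from stdin. Lines that do not match are ignored.
# Assumption: the path contains no spaces.
found_openssl() {
  sed -n 's/.*-- Found OpenSSL: \([^ ]*\) (found version "\([^"]*\)").*/\2 \1/p'
}
```

For example, `found_openssl < build.log` (where `build.log` is a saved copy of the job log) prints one line per OpenSSL detection, which makes it easy to see whether sub-projects disagree.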

I tried running the SDK build on macOS 11 and macOS 12 EC2 instances and it runs clean. It fails on the macos-11 and macos-12 GitHub Actions runners.

For github actions, these are the steps: https://github.com/awslabs/amazon-kinesis-video-streams-producer-c/blob/test-macos-builds/.github/workflows/ci.yml#L25-L83. Once you clone the repo you could run the steps under Build repository.

For steps on MacOS 12 EC2 instance:

1. Clone the repo
2. mkdir build
3. cd build
4. cmake ..
5. make

Expectation: It would run clean.

ilia-shipitsin commented 1 year ago

thank you for the repro, I'll have a look

ilia-shipitsin commented 1 year ago

@disa6302 , I tried to build by using your steps

[ 66%] Performing build step for 'project_libopenssl'
In file included from apps/app_rand.c:10:
In file included from apps/apps.h:13:
In file included from ./e_os.h:16:
In file included from include/openssl/e_os2.h:243:
/Applications/Xcode_14.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.0/include/inttypes.h:21:15: fatal error: 'inttypes.h' file not found
#include_next <inttypes.h>
              ^~~~~~~~~~~~
1 error generated.
make[4]: *** [apps/app_rand.o] Error 1
make[3]: *** [all] Error 2
make[2]: *** [build/src/project_libopenssl-stamp/project_libopenssl-build] Error 2
make[1]: *** [CMakeFiles/project_libopenssl.dir/all] Error 2
make: *** [all] Error 2
CMake Error at CMake/Utilities.cmake:93 (message):
  CMake step for libopenssl failed: 2
Call Stack (most recent call first):
  CMakeLists.txt:72 (build_dependency)

-- Configuring incomplete, errors occurred!
Mac-2748:build runner$ 

Did I miss something?

disa6302 commented 1 year ago

@ilia-shipitsin ,

Can you try running these exports? I need them when I build locally but somehow do not need them on the GitHub runner.

export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CPPFLAGS="-I/usr/local/opt/openssl/include"

If setting these variables does not work, you can try this:

https://github.com/awslabs/amazon-kinesis-video-streams-producer-c/blob/test-macos-builds/.github/workflows/ci.yml#L59-L60
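The two exports above hard-code `/usr/local/opt/openssl`, which is where Homebrew places the formula on Intel Macs; on Apple Silicon the Homebrew prefix is `/opt/homebrew` instead. A small sketch that derives both variables from a prefix passed in as an argument is shown below. `openssl_flags` is a hypothetical helper, not part of the SDK's build scripts.

```shell
#!/bin/sh
# Print export lines for LDFLAGS and CPPFLAGS under a given OpenSSL prefix.
# The prefix is supplied by the caller; nothing is probed on the system.
openssl_flags() {
  prefix=${1:?usage: openssl_flags <openssl-prefix>}
  printf 'export LDFLAGS="-L%s/lib"\n' "$prefix"
  printf 'export CPPFLAGS="-I%s/include"\n' "$prefix"
}
```

On a machine with Homebrew, this could be used as `eval "$(openssl_flags "$(brew --prefix openssl)")"`. Which OpenSSL version sits under that prefix depends on what Homebrew has installed, so it is worth checking that it matches the version the build links against.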

disa6302 commented 1 year ago

@ilia-shipitsin ,

Is there an update on this? I am now seeing this on macos-13 as well.

ilia-shipitsin commented 1 year ago

@disa6302, sorry, my access to the testing farm was revoked. I hope to get it back and test with the updated repro steps.

disa6302 commented 1 year ago

Thank you @ilia-shipitsin. I would appreciate it if you could prioritize this. Up until the day before yesterday, I was able to work around it with macos-13. But now it fails on macos-13 as well with the same error. This is strange, because the runner version in the run from the day before yesterday was also 2.307.1 and there has been no change in the build process since then. It could be flakiness, but we will see. The runner version is 2.307.1.

The difference is in the image. In the working run, this was the setup:

Current runner version: '2.307.1'
Operating System
  macOS
  13.4
  22F66
Runner Image
  Image: macos-13
  Version: 20230611.2
  Included Software: https://github.com/actions/runner-images/blob/macos-13/20230611.2/images/macos/macos-13-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/macos-13%2F20230611.2
Runner Image Provisioner
  2.0.238.1
GITHUB_TOKEN Permissions
  Contents: read
  Metadata: read
Secret source: Actions
Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v3' (SHA:c85c95e3d7251135ab7dc9ce3241c5835cc595a9)
Download action repository 'aws-actions/configure-aws-credentials@v1-node16' (SHA:023daa7fe5f7f817faa31fc0fc4a8d0fb6224ed0)
Complete job name: mac-os-build-gcc

In the non-working run, it is:

Current runner version: '2.307.1'
Operating System
  macOS
  13.5
  22G74
Runner Image
  Image: macos-13
  Version: 20230801.2
  Included Software: https://github.com/actions/runner-images/blob/macos-13/20230801.2/images/macos/macos-13-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/macos-13%2F20230801.2
Runner Image Provisioner
  2.0.264.1
GITHUB_TOKEN Permissions
  Contents: read
  Metadata: read
Secret source: Actions
Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v3' (SHA:c85c95e3d7251135ab7dc9ce3241c5835cc595a9)
Download action repository 'aws-actions/configure-aws-credentials@v1-node16' (SHA:023daa7fe5f7f817faa31fc0fc4a8d0fb6224ed0)
Complete job name: mac-os-build-gcc
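
The two setup logs above differ in the macOS version and in the runner image version. To compare runs without reading the logs by eye, the image version can be extracted with a short script. This is a sketch, assuming the "Set up job" log was saved without timestamps and with the indentation shown in the excerpts above; `image_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the "Version:" value from the "Runner Image" section of a
# "Set up job" log read from stdin.
# Assumption: section headers start in column 1 and their fields are indented.
image_version() {
  awk '
    /^Runner Image$/ { in_image = 1; next }          # enter the section
    in_image && /^[^[:space:]]/ { exit }             # next top-level section reached
    in_image && $1 == "Version:" { print $2; exit }  # the field we want
  '
}
```

Running `image_version < working.log` and `image_version < failing.log` (both file names are placeholders for saved logs) gives the two image tags, which can then be looked up in the image release notes linked in each log.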

ilia-shipitsin commented 1 year ago

Getting my access to the testing farm back is not something I can speed up. I can only be patient.

ilia-shipitsin commented 1 year ago

I see that you initially reported the issue against "macos-latest", which is macos-12 now. However, the "working" link is against "macos-13".

Did you have a successful build on macos-12?

disa6302 commented 1 year ago

Right now, it does not build on any of the macos runners.

As per my previous message, macos-13 fails as well and the difference seems to be in the macos version (13.4 vs 13.5).

However, the build succeeds locally on all macOS versions.

disa6302 commented 1 year ago

Ok. I got around the problem by running brew unlink openssl. This should not be required, and I do not seem to need it on any of the EC2 instances I run with the same OS version as the runners. I would still like to keep this ticket open to investigate whether the runners are doing anything specific while setting up openssl.
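
One possible explanation for why unlinking helps, offered as an assumption and not confirmed anywhere in this thread: in the OpenSSL 3.0 headers, `EVP_PKEY_id` and `SSL_get_peer_certificate` are macros that expand to `EVP_PKEY_get_id` and `SSL_get1_peer_certificate`. Code compiled against 3.x headers but linked against the 1.1.1t libraries would then fail with exactly the two undefined symbols from the original log. If Homebrew's OpenSSL is linked into the Homebrew prefix, unlinking it would take those headers out of the default search path. The sketch below only checks a linker log for this pattern; both function names are hypothetical, and the name list covers just the two symbols from this issue.

```shell
#!/bin/sh
# List the undefined symbols named in ld output read from stdin,
# one per line, with the leading Mach-O underscore removed.
undefined_symbols() {
  sed -n 's/^ *"_\([A-Za-z0-9_]*\)", referenced from:.*/\1/p'
}

# Keep only symbols that are OpenSSL 3.0 function names.
# Assumption: this list holds only the two names seen in this issue;
# it is not a complete table of 3.0 renames.
openssl3_names() {
  grep -E -x 'EVP_PKEY_get_id|SSL_get1_peer_certificate'
}
```

If `undefined_symbols < build.log | openssl3_names` prints anything while CMake reports an OpenSSL 1.1.1 library, a header and library version mismatch is worth ruling out before looking elsewhere.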

ilia-shipitsin commented 1 year ago

Well, runners may indeed have openssl installed (if it is required by some software). It usually happens this way: customers ask us to add something, and we do so.

I still cannot investigate; my access to the test farm has not yet been restored. But it looks like you have already identified the cause. I'll close this ticket.