apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[MATLAB] Update MATLAB CI workflows to use MATLAB `R2023b` #37809

Open kevingurney opened 11 months ago

kevingurney commented 11 months ago

Describe the enhancement requested

The MATLAB CI workflows fail on Windows when using R2023b.

This is less than ideal since the latest publicly available MATLAB version is R2023b.

We should investigate why this failure is occurring on Windows so that we can build with R2023b in CI.

Component(s)

Continuous Integration, MATLAB

kevingurney commented 9 months ago

@sgilmore10 and I have been investigating this issue more.

We managed to get a successful Windows build using MATLAB R2023b:

https://github.com/mathworks/arrow/actions/runs/6949608081/job/18908073567#step:9:526

However, we had to remove Ninja and the call to vcvarsall.bat from the Windows CI workflow. We also called cmake directly from the GitHub Actions workflow file, rather than calling matlab_build.sh using bash -c.

We noticed that the MATLAB version with Ninja and vcvarsall.bat is reported as "unknown":

https://github.com/apache/arrow/actions/runs/6250586979/job/16972019524?pr=37773#step:9:34

However, without Ninja and vcvarsall.bat, the MATLAB version is reported as "23.2":

https://github.com/mathworks/arrow/actions/runs/6949608081/job/18908073567#step:9:550

kou commented 9 months ago

Interesting. One downside of removing Ninja is that we can't use ccache, which will increase build time.

Anyway, could you try with -DMATLAB_FIND_DEBUG=ON again to show more debug messages?

kevingurney commented 9 months ago

Thanks for the suggestion of using -DMATLAB_FIND_DEBUG=ON!

We actually figured out a way to continue using Ninja (and the rest of the existing CI workflow code).

By setting Matlab_ROOT_DIR and MATLAB_ADDITIONAL_VERSIONS explicitly, we were able to get the expected MATLAB version to be detected.

Example CMake code: https://github.com/apache/arrow/commit/0114138f856bafb0ddbcf34073d42b63401a1efc#diff-282bdaf7afd3f3d7974e3ab41857a65e3eea6a566107ea7d209ba3fec72e2e77R22

Successful CI run with Ninja: https://github.com/mathworks/arrow/actions/runs/6950057226/job/18909470493

Correct MATLAB version being detected: https://github.com/mathworks/arrow/actions/runs/6950057226/job/18909470493#step:9:34
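For readers who don't want to follow the links, here is a minimal sketch of the approach described above. It is not the code from the linked commit. The project name and the guard around Matlab_ROOT_DIR are hypothetical, and the sketch assumes the CI workflow passes the MATLAB install location on the cmake command line. The "R2023b=23.2" mapping uses the version number reported in the CI logs mentioned earlier in this thread.

```cmake
cmake_minimum_required(VERSION 3.20)
project(matlab_version_sketch LANGUAGES CXX)  # hypothetical project name

# Assumption: the CI workflow supplies the install location, e.g.
#   cmake -G Ninja -DMatlab_ROOT_DIR=<path to the MATLAB R2023b install> ...
# so FindMatlab does not have to rely on its search heuristics.
if(NOT DEFINED Matlab_ROOT_DIR)
  message(FATAL_ERROR "Please set Matlab_ROOT_DIR to the MATLAB install directory")
endif()

# Tell FindMatlab about a release it may not know yet.
# Each entry has the form "<release name>=<version number>".
set(MATLAB_ADDITIONAL_VERSIONS "R2023b=23.2")

# MAIN_PROGRAM asks FindMatlab to also locate the matlab executable.
find_package(Matlab REQUIRED COMPONENTS MAIN_PROGRAM)

message(STATUS "Using MATLAB at: ${Matlab_ROOT_DIR}")
```

Because the release-to-version mapping lives in the project's own CMake code, adding a future MATLAB release should only require another entry in MATLAB_ADDITIONAL_VERSIONS.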

This seems like a fairly good solution that should also hopefully work when future versions of MATLAB are released (i.e. we won't need to constantly update the minimum required CMake version in order to get access to the latest FindMatlab code).

Setting Matlab_ROOT_DIR explicitly may also be more reliable in general. FindMatlab uses a variety of heuristics to locate MATLAB and its associated libraries, and these generally seem to work quite well. However, we have seen a few sporadic issues in CI that might be related to the heuristics not working 100% of the time (e.g. https://github.com/mathworks/libmexclass/issues/58).