Open rschu1ze opened 5 months ago
Hi, if you have built locally, could you provide the diagnostic files?
I was able to reproduce your failure and get the diagnostic files. It seems related to Ubuntu, which sets the default CPU to z13.
The diagnostic files are actually pretty useless. I was able to get the same result with the following LLVM IR:
; llc-18 -mcpu=z13 -mtriple=s390x-linux-gnu
define i1 @f(i128 %n.128, i128 %m.128) local_unnamed_addr #0 {
; %n.128 = zext i64 %n to i128
%mul = tail call { i128, i1 } @llvm.umul.with.overflow.i128(i128 %n.128, i128 %m.128)
%mul.ov = extractvalue { i128, i1 } %mul, 1
ret i1 %mul.ov
}
; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare { i128, i1 } @llvm.umul.with.overflow.i128(i128, i128) #1
attributes #0 = { mustprogress nofree nosync nounwind willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
I get the failure on both Debian and Ubuntu.
After bisection, the first bad commit is a65ccc1b9.
Commit a356e6cc seems to fix the issue, at least on the surface. It changes the implementation of expandMULO. I don't understand what's going on yet; I need some time.
Maybe it provides a path to express MUL i128 as i64 ops (which is also done by abseil itself in int128.h).
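To make the "i128 as i64 ops" idea concrete, here is a minimal sketch of an unsigned 128-bit multiply-overflow check that only uses 64-bit arithmetic. It is not the code from LLVM's expandMULO or from abseil's int128.h; the names `U128`, `mul_hi` and `umul128_overflows` are hypothetical and only illustrate the decomposition.

```cpp
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 {
    uint64_t hi;
    uint64_t lo;
};

// Upper 64 bits of the full 128-bit product x * y, using 32-bit limbs.
inline uint64_t mul_hi(uint64_t x, uint64_t y) {
    const uint64_t mask = 0xffffffffu;
    uint64_t x0 = x & mask, x1 = x >> 32;
    uint64_t y0 = y & mask, y1 = y >> 32;
    uint64_t p00 = x0 * y0, p01 = x0 * y1, p10 = x1 * y0, p11 = x1 * y1;
    // Sum of the pieces that land in bits 32..63, including the carry out.
    uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);
    return p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

// True if a * b does not fit in 128 bits (unsigned).
inline bool umul128_overflows(U128 a, U128 b) {
    // hi * hi contributes at 2^128 and above, so it always overflows.
    if (a.hi != 0 && b.hi != 0) return true;
    // At most one cross product (hi * lo) is non-zero now.
    uint64_t x = (a.hi != 0) ? a.hi : b.hi;
    uint64_t y = (a.hi != 0) ? b.lo : a.lo;
    // The cross product sits at bit 64, so it must itself fit in 64 bits.
    if (mul_hi(x, y) != 0) return true;
    uint64_t cross = x * y;
    // Adding the carry from lo * lo must not wrap around either.
    uint64_t carry = mul_hi(a.lo, b.lo);
    return cross + carry < cross;
}
```

For example, `umul128_overflows({1, 0}, {0, 2})` is false (2^65 fits), while `umul128_overflows({uint64_t{1} << 63, 0}, {0, 2})` is true (2^128 does not).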
Also, the abseil commit causes a regression because it replaces a reference to an external symbol with a constexpr equal to -1 (i128), which allows foldMultiplicationOverflowCheck to match.
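As an illustration of the kind of source pattern involved, here is a hedged sketch (not the actual abseil code). As far as I understand, foldMultiplicationOverflowCheck recognizes a comparison of the form `(-1 u/ x) u< y` and rewrites it to the overflow bit of `llvm.umul.with.overflow`; whether it fires on a given piece of code depends on the LLVM version and the surrounding IR. The names `kMax`, `overflow_by_division` and `overflow_by_builtin` are made up for this example, and `unsigned __int128` and `__builtin_mul_overflow` are GCC/Clang extensions.

```cpp
// GCC/Clang extension type; assumed to be available on the target.
using u128 = unsigned __int128;

// All-ones constant, i.e. -1 when viewed as i128. Once this is a visible
// constant (rather than a load from an external symbol), the optimizer can
// see the "max / x < y" idiom below.
constexpr u128 kMax = ~u128{0};

// Overflow check written with a division by the all-ones constant.
inline bool overflow_by_division(u128 x, u128 y) {
    return x != 0 && kMax / x < y;
}

// The same question asked directly through the compiler builtin, which
// corresponds to the overflow bit of llvm.umul.with.overflow.i128 in Clang.
inline bool overflow_by_builtin(u128 x, u128 y) {
    u128 result;
    return __builtin_mul_overflow(x, y, &result);
}
```

Both functions answer the same question; the second form is roughly what the first one becomes after the fold, which is how an i128 MULO can end up in the backend.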
z13 adds new SIMD units with 128-bit vector registers. It seems to make i128 a legal type, but the expansion for MULO didn't support the MULO i128, i128 case until recently.
@rschu1ze The quick and easy fix for your problem is to explicitly set a CPU model older than z13, which you can drop once you switch to LLVM >= 19.
FYI:
z10
zEC12
and Ubuntu to z13
Thanks a lot for checking.
The issue is actually not a problem for us, we'll wait till LLVM/Clang >= 19.
ClickHouse is an analytical database written in C++. It uses lots of third-party libraries, and it is built on multiple platforms including x86, ARM, PowerPC, RISC-V, and s390x.
I bumped the submodule for Google's abseil libraries from a code state of Nov 23 to a code state of Jun 24 here: https://github.com/ClickHouse/ClickHouse/pull/65048
Results of the builds made by ClickHouse's CI are here: https://s3.amazonaws.com/clickhouse-test-reports/65048/1f17ddc6fe35be95736b448ebb3b73123c034196/ci_running.html (click "ClickHouse special build check", then "binary_s390x").
In case the logs are no longer available, I'll paste them here:
So this looks like a compiler error ... I was able to bisect the changes in abseil that caused this and it turned out to be this innocent little commit: https://github.com/abseil/abseil-cpp/commit/34604d5b1f6ae14c65b3992478b59f7108051979
The error only happens on s390, builds on all other platforms are okay.
A reproducible example using compiler explorer may be difficult, let me share how I was able to reproduce this locally:
1f17ddc (#65048) will do the job.
Running CMake, then ninja will fail soon-ish with the above error.
Kindly let me know if I can help to analyze this further, thanks.