conan-io / conan-center-index

Recipes for the ConanCenter repository
https://conan.io/center
MIT License

[question] mlpack fails to build #25655

Open SOLDATO2 opened 1 month ago

SOLDATO2 commented 1 month ago

What is your question?

Hi! I've been trying to install https://conan.io/center/recipes/mlpack in a simple project structure. Unfortunately, I'm getting an error during the build process:

[ 73%] Built target LAPACKE
make: *** [Makefile:146: all] Error 2

openblas/0.3.25: ERROR: 
Package '3a8f6a4840165299e9dc63ab7938bfaed3babf0c' build failed
openblas/0.3.25: WARN: Build folder /home/squade/.conan2/p/b/openbd078a2b8d375a/b/build/Release
ERROR: openblas/0.3.25: Error in build() method, line 235
        cmake.build()
        ConanException: Error 2 while executing

CMakeLists.txt:

cmake_minimum_required(VERSION 3.30)
project(mlcpp CXX)

find_package(mlpack REQUIRED)

add_executable(${PROJECT_NAME} src/main.cpp)
target_link_libraries(${PROJECT_NAME} mlpack::mlpack)

CMake version: 3.30.3
Conan version: 2.7.1

Default conan profile:

[settings]
arch=x86_64
build_type=Release
compiler=gcc
compiler.cppstd=gnu17
compiler.libcxx=libstdc++11
compiler.version=14
os=Linux

Command I ran to build: conan install . --build=missing --settings=build_type=Release


memsharded commented 1 month ago

Hi @SOLDATO2

Thanks for your report.

This doesn't necessarily look like a problem in the Conan client; it could be an issue in the ConanCenter recipe for openblas/0.3.25. It would be good to know a bit more about your environment:

I have just tried to build it with conan install --requires=openblas/0.3.25 --build=openblas* (to force the build, as there is a binary in ConanCenter for my config) on Ubuntu 22.04 with gcc 11, and it worked fine.

SOLDATO2 commented 1 month ago

Hi,

Conan version: 2.7.1
Linux: Linux 6.10.13-3-MANJARO
OS: Manjaro Linux x86_64
GCC: gcc (GCC) 14.2.1 20240910

There's nothing special about my gcc compiler, I reckon.

Here's the complete log of conan install --requires=openblas/0.3.25 --build=missing: conan_log.txt

memsharded commented 1 month ago

I don't see much difference in the logs, besides the core:

-- GETARCH results:
CORE=ZEN
LIBCORE=zen
NUM_CORES=12
HAVE_MMX=1
HAVE_SSE=1
HAVE_SSE2=1
HAVE_SSE3=1
HAVE_SSSE3=1
HAVE_SSE4_1=1
HAVE_SSE4_2=1
HAVE_SSE4A=1
HAVE_AVX=1
HAVE_AVX2=1
HAVE_FMA3=1
MAKEFLAGS += -j 12

while I have

CORE=SKYLAKEX
LIBCORE=skylakex
NUM_CORES=8
HAVE_MMX=1
HAVE_SSE=1
HAVE_SSE2=1
HAVE_SSE3=1
HAVE_SSSE3=1
HAVE_SSE4_1=1
HAVE_SSE4_2=1
HAVE_AVX=1
HAVE_AVX2=1
HAVE_AVX512VL=1
HAVE_FMA3=1
MAKEFLAGS += -j 8

I'm not sure whether any of those could make the compiler or linker fail. It is odd, though: the log shows no error message, which suggests this is not an ordinary compilation error but the build process itself failing.

One way to check would be to reproduce this in a Docker image, so that we share the same environment and can reproduce it on both sides.
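To make the comparison concrete, the two GETARCH dumps above can be diffed mechanically. This is only a minimal sketch: the helper names parse_getarch and feature_diff are hypothetical, and the input is simply the text quoted in this comment.

```python
def parse_getarch(text):
    """Parse KEY=VALUE lines from a GETARCH dump into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip lines such as "MAKEFLAGS += -j 12", which are not plain KEY=VALUE.
        if sep and " " not in key:
            result[key] = value
    return result


def feature_diff(a, b):
    """Return the HAVE_* flags present only in a, and only in b."""
    flags_a = {k for k in a if k.startswith("HAVE_")}
    flags_b = {k for k in b if k.startswith("HAVE_")}
    return sorted(flags_a - flags_b), sorted(flags_b - flags_a)


# The two dumps quoted above (failing machine first).
ZEN = """CORE=ZEN
LIBCORE=zen
NUM_CORES=12
HAVE_MMX=1
HAVE_SSE=1
HAVE_SSE2=1
HAVE_SSE3=1
HAVE_SSSE3=1
HAVE_SSE4_1=1
HAVE_SSE4_2=1
HAVE_SSE4A=1
HAVE_AVX=1
HAVE_AVX2=1
HAVE_FMA3=1
MAKEFLAGS += -j 12"""

SKYLAKEX = """CORE=SKYLAKEX
LIBCORE=skylakex
NUM_CORES=8
HAVE_MMX=1
HAVE_SSE=1
HAVE_SSE2=1
HAVE_SSE3=1
HAVE_SSSE3=1
HAVE_SSE4_1=1
HAVE_SSE4_2=1
HAVE_AVX=1
HAVE_AVX2=1
HAVE_AVX512VL=1
HAVE_FMA3=1
MAKEFLAGS += -j 8"""

only_zen, only_skylakex = feature_diff(parse_getarch(ZEN), parse_getarch(SKYLAKEX))
print("only on ZEN:", only_zen)            # ['HAVE_SSE4A']
print("only on SKYLAKEX:", only_skylakex)  # ['HAVE_AVX512VL']
```

Apart from the core name and core count, the only feature flags that differ are HAVE_SSE4A and HAVE_AVX512VL.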

SOLDATO2 commented 1 month ago

Unfortunately, my experience with Docker is minimal. But I do think it's worth mentioning that I was getting the same error on a completely separate, fresh Arch installation on the same computer. So it might be related to Arch?

Dylan-Gresham commented 1 month ago

Hello, I'm also running into this.

CMakeLists.txt:

cmake_minimum_required(VERSION 3.30)
project(mlcpp CXX)

find_package(mlpack REQUIRED)

add_executable(${PROJECT_NAME} src/cpp_quickstart_1.cpp)
target_link_libraries(${PROJECT_NAME} mlpack::mlpack)

conanfile.txt:

[requires]
mlpack/4.4.0
[generators]
CMakeDeps
CMakeToolchain
[layout]
cmake_layout

Dockerfile:

# Arch base image
FROM archlinux

LABEL description="Container to test building mlpack from conan, conan-io/conan-center-index#25655"

# Install yay
RUN printf "[archlinuxcn]\nServer=https://repo.archlinuxcn.org/\$arch\n" >> /etc/pacman.conf && \
    rm -fr /etc/pacman.d/gnupg && pacman-key --init && pacman-key --populate archlinux && \
    pacman -Syyu --noconfirm archlinuxcn-keyring && \
    pacman -S --noconfirm yay

RUN useradd builder && \
    printf "123\n123\n" | passwd builder && \
    mkdir -p /home/builder && \
    chown builder /home/builder && \
    chgrp builder /home/builder && \
    echo "builder ALL=(ALL) NOPASSWD: /usr/bin/pacman" >> /etc/sudoers

# Install base-devel
RUN pacman -S --noconfirm base-devel

# Switch to the builder user
USER builder

# Install conan
RUN yay -S --noconfirm conan

# Copy over files
WORKDIR /mlpack-test
COPY CMakeLists.txt .
COPY conanfile.txt .

# Create conan profile
RUN conan profile detect --force

# Try to build
RUN conan install . --build=missing --settings=build_type=Release

The directory structure that I used was the following:

.
|-- CMakeLists.txt
|-- conanfile.txt
|-- Dockerfile

The docker version that I used is: Docker version 27.3.1, build ce1223035a

Command that produces the error: docker build --tag 'mlpack-test' .

valgur commented 1 month ago

Do you get the same error with the latest openblas/0.3.27 version (e.g. if you add self.requires("openblas/0.3.27", override=True))? The error is related to OpenBLAS itself rather than CCI or Conan, so perhaps this has been fixed in a newer version?

Dylan-Gresham commented 1 month ago

I'm unfamiliar with Python conanfiles, so I might have missed something, but yes, after overriding the OpenBLAS version I'm still getting the same error.

Here's the conanfile.py:

from conan import ConanFile

class MyProjectConan(ConanFile):
    name = "my_project"
    version = "1.0"

    # Package dependencies
    requires = "mlpack/4.4.0"

    # Generators to create necessary files
    generators = "CMakeDeps", "CMakeToolchain"

    def build_requirements(self):
        self.requires("openblas/0.3.27", override=True)

Using requirements() instead of build_requirements() also produces the error.

And the updated Dockerfile:

# Arch base image
FROM archlinux

LABEL description="Container to test building mlpack from conan, conan-io/conan-center-index#25655"

# Install yay
RUN printf "[archlinuxcn]\nServer=https://repo.archlinuxcn.org/\$arch\n" >> /etc/pacman.conf && \
    rm -fr /etc/pacman.d/gnupg && pacman-key --init && pacman-key --populate archlinux && \
    pacman -Syyu --noconfirm archlinuxcn-keyring && \
    pacman -S --noconfirm yay

RUN useradd builder && \
    printf "123\n123\n" | passwd builder && \
    mkdir -p /home/builder && \
    chown builder /home/builder && \
    chgrp builder /home/builder && \
    echo "builder ALL=(ALL) NOPASSWD: /usr/bin/pacman" >> /etc/sudoers

# Install base-devel
RUN pacman -S --noconfirm base-devel cmake

# Switch to the builder user
USER builder

# Install conan
RUN yay -S --noconfirm conan

# Copy over files
WORKDIR /mlpack-test
COPY CMakeLists.txt .
COPY conanfile.py .

# Create conan profile
RUN conan profile detect --force

# Try to build
RUN conan install . --build=missing --settings=build_type=Release

I'm receiving the same error with both the Dockerfile and running on my own machine.

memsharded commented 1 month ago

Thanks for the details to reproduce. I am having some local issues with Docker:

10.48 :: Synchronizing package databases...
14.05 error: failed retrieving file 'core.db' from geo.mirror.pkgbuild.com : SSL certificate problem: unable to get local issuer certificate
14.05 error: failed retrieving file 'extra.db' from geo.mirror.pkgbuild.com : SSL certificate problem: unable to get local issuer certificate
14.05 error: failed retrieving file 'archlinuxcn.db' from repo.archlinuxcn.org : SSL certificate problem: unable to get local issuer certificate

I suspect my local network is blocking it; I am checking.

memsharded commented 1 month ago

I have been able to build the Docker image and reproduce the failure.

It boils down to a compilation error:

/home/builder/conan-center-index/recipes/openblas/all/src/lapack-netlib/SRC/spstf2.c:796:35: error: passing argument 1 of ‘dmaxloc_’ from incompatible pointer type [-Wincompatible-pointer-types]
  796 |                 itemp = mymaxloc_(&work[1], &i__2, &i__3, &c__1);
      |                                   ^~~~~~~~
      |                                   |
      |                                   real * {aka float *}

So something in the platform makes this build fail. This doesn't seem to be related to Conan, but it needs further investigation.
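Since the original report showed only "make: *** ... Error 2" and no diagnostic, it can help to scan the full build log for compiler error lines. A minimal sketch, assuming GCC-style "file:line:column: error: message" diagnostics; find_errors is a hypothetical helper, and the sample text reuses lines quoted in this thread.

```python
import re

# Matches GCC/Clang-style diagnostics: "path:line:col: error: message".
ERROR_RE = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def find_errors(log_text):
    """Return (file, line, message) for every compiler error line in a log."""
    errors = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            errors.append((match["file"], int(match["line"]), match["msg"]))
    return errors


sample = """[ 73%] Built target LAPACKE
/home/builder/conan-center-index/recipes/openblas/all/src/lapack-netlib/SRC/spstf2.c:796:35: error: passing argument 1 of 'dmaxloc_' from incompatible pointer type [-Wincompatible-pointer-types]
make: *** [Makefile:146: all] Error 2
"""

for path, line, message in find_errors(sample):
    print(f"{path}:{line}: {message}")
```

Only the spstf2.c line is reported; the progress line and the final make failure are ignored because they do not have the file:line:column shape.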

valgur commented 1 month ago

Looks like a harmless warning in generated C code that is being treated as an error. There's already a fix applied upstream: https://github.com/OpenMathLib/OpenBLAS/pull/4894. We just need to backport it to older versions or set -Wno-error=incompatible-pointer-types.
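As a consumer-side stopgap, the flag could be passed through a Conan profile instead of patching the recipe. The sketch below only renders the profile text. It assumes Conan 2's tools.build:cflags conf, scoped to openblas/* with the package-pattern prefix, and the render_profile helper is hypothetical. Whether the flag actually reaches the generated LAPACK sources in this recipe has not been verified here.

```python
def render_profile(base_profile="default", pattern="openblas/*"):
    """Render a Conan 2 profile that downgrades the GCC 14 error to a warning.

    Assumption: the recipe honours tools.build:cflags through CMakeToolchain.
    """
    flag = "-Wno-error=incompatible-pointer-types"
    lines = [
        f"include({base_profile})",  # start from the detected default profile
        "",
        "[conf]",
        # The pattern prefix limits the extra C flag to the OpenBLAS package.
        f'{pattern}:tools.build:cflags=["{flag}"]',
        "",
    ]
    return "\n".join(lines)


profile_text = render_profile()
print(profile_text)
```

The printed text could be saved as a profile file and passed to conan install with --profile.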

memsharded commented 1 month ago

Thanks for the feedback.

Then the summary is:

The best action then seems to be waiting for openblas 0.3.29 to get the official fix. A validate() that excludes gcc 14 for earlier versions might help, but it is not critical: given the load on ConanCenter, I wouldn't push heavily for it, and it could simply be aligned with the addition of 0.3.29. Moving to the very latest version of openblas, rather than patching older versions, also seems more in line with Arch Linux.
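The check such a validate() would perform can be sketched in plain Python. This is not the actual recipe code. The function name is_buildable and the version constants are assumptions taken from this comment, and a real recipe would raise ConanInvalidConfiguration from validate() instead of returning a boolean.

```python
FIRST_FIXED_OPENBLAS = "0.3.29"  # assumption: first release carrying the upstream fix
FIRST_AFFECTED_GCC = 14          # gcc major version where the build starts failing


def parse_version(text):
    """Turn a dotted version string such as '0.3.25' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_buildable(openblas_version, compiler, compiler_version):
    """Return False for the combination reported in this issue.

    Only gcc is considered; other compilers are assumed to be unaffected,
    which this thread does not confirm.
    """
    if compiler != "gcc":
        return True
    affected_compiler = parse_version(compiler_version)[0] >= FIRST_AFFECTED_GCC
    old_openblas = parse_version(openblas_version) < parse_version(FIRST_FIXED_OPENBLAS)
    return not (affected_compiler and old_openblas)


print(is_buildable("0.3.25", "gcc", "14"))  # False: the failing setup from this issue
print(is_buildable("0.3.25", "gcc", "11"))  # True: the Ubuntu 22.04 setup that built fine
print(is_buildable("0.3.29", "gcc", "14"))  # True: assumed to contain the fix
```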

memsharded commented 1 month ago

Transferring this to ConanCenter: this is not a Conan issue, but a ConanCenter recipe issue.