huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

candle-flash-attn linking error with Red Hat based distributions #1844

Open ivanbaldo opened 8 months ago

ivanbaldo commented 8 months ago

Hello. This was originally reported in the wrong place, at https://github.com/EricLBuehler/candle-vllm/issues/25, but it's actually an issue with candle-core. Here is a Dockerfile that reproduces the problem:

# syntax=docker/dockerfile:1

# Note: if building on a machine with a different GPU or no GPU, look up the compute capability at
# https://developer.nvidia.com/cuda-gpus and pass it to CUDA_COMPUTE_CAP directly, without the
# decimal point and without the $(...), for example CUDA_COMPUTE_CAP=80 for an A100 or
# CUDA_COMPUTE_CAP=86 for an A10.
#
# docker build --build-arg USERID=$(id -u) --build-arg \
#   CUDA_COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv | tail -n1 | tr -d .) \
#   -t local/candle-core -f candle-core.dockerfile .

# Select an available version from
# https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md:
# Doesn't work with Red Hat based Linux, see https://github.com/EricLBuehler/candle-vllm/issues/25:
FROM nvidia/cuda:12.2.2-cudnn8-devel-rockylinux9 as build
#FROM nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04 as build

ARG CUDA_COMPUTE_CAP
RUN dnf install -y openssl-devel && dnf clean all && rm -rf /var/cache/dnf/*
#RUN apt-get update && apt-get install -y curl libssl-dev pkg-config && rm -rf /var/lib/apt/lists/*
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y && ln -s /root/.cargo/bin/* /usr/local/bin
ADD https://github.com/huggingface/candle.git#main /candle-core
WORKDIR /candle-core
#RUN cargo build --features cuda,cudnn,flash-attn,nccl

#FROM nvidia/cuda:12.2.2-cudnn8-runtime-rockylinux9 as runtime
#FROM nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04 as runtime
ARG USERID=1000
RUN adduser -u $USERID user
#RUN adduser --disabled-password --gecos '' -u $USERID user
USER user

And here is partial output of the error:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.cargo/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "-m64" "/tmp/rustcw45Mzt/symbols.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.10ufns7rcj85d0zk.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.18tjqrs4q4ut6bno.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1d63toc24d8so30r.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1eb25emf0nhn734r.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1rgujxhopgkuht3b.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1sok2akty9bcbvlq.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1zaxfbkba8032eo9.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.28vjoi5eh06df29b.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.2d0g40mh38sna4cf.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.2mlxqg8vqcq9uqsj.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.36r5t4t5fbuczl1.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.3801ijgshpp47nyh.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.3ddc6mgyov9yynsc.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.3di2487eruh844bm.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.3l7fn7bopjz0dthx.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4bvjep6vnnfegdcf.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4f6i0ldk8v6oohm.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4l051k3ng6v6wehb.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4oxvj5etaf15nnvc.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4r9n7q589n5lt895.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4rgku8moxqvk45fr.rcgu.o" 
"/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.4tsi8fneo8fyhs4e.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.518cvqbkfavr7f3l.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.54043p7wklts3j8s.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.561m2knptw4b97gh.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.59jtc0k4rikmcplh.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.5e5i63wt0v7cwj19.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.5wdok8q0icgpazl.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.64icc232l2915h8.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.7ubvrn5zehf20ga.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.kpdr61wppewu1wv.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.qozgxtnk07mo4wb.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.rspqfy7gpae1isw.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.tmkkunbw3rxk20t.rcgu.o" "/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0.1l8136i1poxy8c5d.rcgu.o" "-Wl,--as-needed" "-L" "/candle-core/target/debug/deps" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib" "-L" "/usr/lib64" "-L" "/candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out" "-L" "/candle-core/target/debug/build/onig_sys-3dbff5b17b191331/out" "-L" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/candle-core/target/debug/deps/libcandle_wasm_example_whisper-db92647f6ba99ab2.rlib" "/candle-core/target/debug/deps/libwav-14c2d1c24ebd6c28.rlib" "/candle-core/target/debug/deps/libriff-8e053e9681624b9d.rlib" "/candle-core/target/debug/deps/libtokenizers-bef70cb175db9847.rlib" "/candle-core/target/debug/deps/libesaxx_rs-aacdf203a69cef70.rlib" 
"/candle-core/target/debug/deps/libregex-6ccee77bd4d615ae.rlib" "/candle-core/target/debug/deps/libunicode_normalization_alignments-ff8b46c01b27487c.rlib" "/candle-core/target/debug/deps/libsmallvec-b3ec60113aea24ab.rlib" "/candle-core/target/debug/deps/libspm_precompiled-c920d47396d24c8a.rlib" "/candle-core/target/debug/deps/libbase64-2032ab1b8853fafd.rlib" "/candle-core/target/debug/deps/libunicode_segmentation-bcfe6e82a7af1d73.rlib" "/candle-core/target/debug/deps/libnom-617a817d73033e70.rlib" "/candle-core/target/debug/deps/libunicode_categories-5db856f7c20b96fa.rlib" "/candle-core/target/debug/deps/libitertools-54ba58d082757f2e.rlib" "/candle-core/target/debug/deps/libmonostate-1adcd34e64840604.rlib" "/candle-core/target/debug/deps/libmacro_rules_attribute-33a18c6583a8cefb.rlib" "/candle-core/target/debug/deps/librayon_cond-2275ada98b0c8ea7.rlib" "/candle-core/target/debug/deps/libitertools-1bb084199e69f20f.rlib" "/candle-core/target/debug/deps/libderive_builder-c4725c952c36302b.rlib" "/candle-core/target/debug/deps/liblazy_static-67076bd63a6ff427.rlib" "/candle-core/target/debug/deps/libcandle_transformers-7bf2e09f0be0b3ae.rlib" "/candle-core/target/debug/deps/libserde_plain-703910719c5adc4c.rlib" "/candle-core/target/debug/deps/libfancy_regex-7e29522e68b295e6.rlib" "/candle-core/target/debug/deps/libbit_set-511486c31a14f986.rlib" "/candle-core/target/debug/deps/libbit_vec-51af7e270c17e705.rlib" "/candle-core/target/debug/deps/libregex_automata-d0cf0a86ff983684.rlib" "/candle-core/target/debug/deps/libaho_corasick-9efebcf457ce2816.rlib" "/candle-core/target/debug/deps/libregex_syntax-0d266e3910baead0.rlib" "/candle-core/target/debug/deps/libcandle_flash_attn-cc4e03c835b042e5.rlib" "/candle-core/target/debug/deps/libcandle_nn-bb205566a5aaa48e.rlib" "/candle-core/target/debug/deps/libcandle_core-03658133a63ef638.rlib" "/candle-core/target/debug/deps/libmemmap2-0efed5a8c70b05fe.rlib" "/candle-core/target/debug/deps/libzip-f2d067532b24d50d.rlib" 
"/candle-core/target/debug/deps/libcrc32fast-745120e0b53cd22d.rlib" "/candle-core/target/debug/deps/libyoke-7f28942e84f981fb.rlib" "/candle-core/target/debug/deps/libzerofrom-44d8e0b646231b48.rlib" "/candle-core/target/debug/deps/libstable_deref_trait-e68140142400a1b9.rlib" "/candle-core/target/debug/deps/libsafetensors-9b59390fcbfa6658.rlib" "/candle-core/target/debug/deps/libcudarc-bdb490c0f08c41bf.rlib" "/candle-core/target/debug/deps/libcandle_kernels-00f16f912f8e34cf.rlib" "/candle-core/target/debug/deps/libgemm-4faa2a45f6602589.rlib" "/candle-core/target/debug/deps/libgemm_c32-f6f5746a7f2347f3.rlib" "/candle-core/target/debug/deps/libgemm_c64-5adfda3d743db1c2.rlib" "/candle-core/target/debug/deps/libgemm_f64-97446a26e86877bd.rlib" "/candle-core/target/debug/deps/libgemm_f16-818965dd28ea1cf9.rlib" "/candle-core/target/debug/deps/libgemm_f32-7ff3241f3c9818ae.rlib" "/candle-core/target/debug/deps/libgemm_common-6bd86ff8b1398c28.rlib" "/candle-core/target/debug/deps/libpulp-ca6ef4de2d934e17.rlib" "/candle-core/target/debug/deps/libnum_complex-28016b9c085b2f11.rlib" "/candle-core/target/debug/deps/libdyn_stack-a26da0aa1ac353ee.rlib" "/candle-core/target/debug/deps/libreborrow-5ab15ae29df16eb6.rlib" "/candle-core/target/debug/deps/libraw_cpuid-2f6732f977bf253c.rlib" "/candle-core/target/debug/deps/libbitflags-dee6366056ea8fd5.rlib" "/candle-core/target/debug/deps/librayon-3e4129ffc6c72786.rlib" "/candle-core/target/debug/deps/librayon_core-ba77e1ffc5b6cf8c.rlib" "/candle-core/target/debug/deps/libcrossbeam_deque-7efc55e7bff57e71.rlib" "/candle-core/target/debug/deps/libcrossbeam_epoch-18bf6da94fc9021b.rlib" "/candle-core/target/debug/deps/libcrossbeam_utils-7c319090700a4249.rlib" "/candle-core/target/debug/deps/libeither-689e439a150a47fb.rlib" "/candle-core/target/debug/deps/libbyteorder-34348d75f49273e9.rlib" "/candle-core/target/debug/deps/libhalf-70b1d7221e14124a.rlib" "/candle-core/target/debug/deps/librand_distr-e3bf158f91795cae.rlib" 
"/candle-core/target/debug/deps/librand-0b64e4536e47b7ea.rlib" "/candle-core/target/debug/deps/librand_chacha-49f3b25ee256b89e.rlib" "/candle-core/target/debug/deps/libppv_lite86-75d3a003153bde67.rlib" "/candle-core/target/debug/deps/librand_core-3ea9da3a15396dc7.rlib" "/candle-core/target/debug/deps/libgetrandom-f019754046391b6f.rlib" "/candle-core/target/debug/deps/libnum_traits-87c302c15eac21df.rlib" "/candle-core/target/debug/deps/libbytemuck-8522dbf74e1e0ed9.rlib" "/candle-core/target/debug/deps/libanyhow-c59dd01fc2ab812a.rlib" "/candle-core/target/debug/deps/libyew_agent-3c9b245a22b9a1c3.rlib" "/candle-core/target/debug/deps/libgloo_worker-d5bf0dda531fb8d6.rlib" "/candle-core/target/debug/deps/libanymap2-76261abe025e5c3d.rlib" "/candle-core/target/debug/deps/libyew-4c02404d39d6eb17.rlib" "/candle-core/target/debug/deps/libconsole_error_panic_hook-402e1e5eb6675279.rlib" "/candle-core/target/debug/deps/libtracing-7f2c061d4e90d84e.rlib" "/candle-core/target/debug/deps/libtracing_core-96021176e20cccda.rlib" "/candle-core/target/debug/deps/libprokio-4f9f82e409fee316.rlib" "/candle-core/target/debug/deps/libtokio_stream-1a117d9bf14d8cea.rlib" "/candle-core/target/debug/deps/libtokio-210aef757e67dc59.rlib" "/candle-core/target/debug/deps/libnum_cpus-2db37473d94f8d17.rlib" "/candle-core/target/debug/deps/libsocket2-117a127ed4a399ac.rlib" "/candle-core/target/debug/deps/libmio-87ccdf877260d73a.rlib" "/candle-core/target/debug/deps/liblibc-50cf0445600f9cae.rlib" "/candle-core/target/debug/deps/libonce_cell-a5f155bc652541d7.rlib" "/candle-core/target/debug/deps/libpinned-03b57a40f6f60814.rlib" "/candle-core/target/debug/deps/libimplicit_clone-7b83b991a22574b3.rlib" "/candle-core/target/debug/deps/libfutures-172ccbb4d4ee13c1.rlib" "/candle-core/target/debug/deps/libfutures_executor-82559237f1d6da29.rlib" "/candle-core/target/debug/deps/libfutures_util-10f95259a645f20f.rlib" "/candle-core/target/debug/deps/libmemchr-9118be60aee6f566.rlib" 
"/candle-core/target/debug/deps/libfutures_io-636acfa736853ffd.rlib" "/candle-core/target/debug/deps/libpin_project_lite-78bdbfb86705d97e.rlib" "/candle-core/target/debug/deps/libfutures_task-4c71d49251518615.rlib" "/candle-core/target/debug/deps/libpin_utils-1301f4c13d11a74a.rlib" "/candle-core/target/debug/deps/libindexmap-9b2194397731dbea.rlib" "/candle-core/target/debug/deps/libhashbrown-d0c6f6d47cd337a5.rlib" "/candle-core/target/debug/deps/libgloo-42ede8e0bdf72ff0.rlib" "/candle-core/target/debug/deps/libgloo_worker-fd57e3de0e0dad9f.rlib" "/candle-core/target/debug/deps/libbincode-6a301ab89c98153c.rlib" "/candle-core/target/debug/deps/libgloo_timers-7060d26a05399c5f.rlib" "/candle-core/target/debug/deps/libgloo_storage-df2e2f87391ec287.rlib" "/candle-core/target/debug/deps/libgloo_render-fc50c8a1966a62a6.rlib" "/candle-core/target/debug/deps/libgloo_net-42afde3dfa475e20.rlib" "/candle-core/target/debug/deps/libwasm_bindgen_futures-f75f4eddbf22dd60.rlib" "/candle-core/target/debug/deps/libhttp-6519f813ff548ccf.rlib" "/candle-core/target/debug/deps/libbytes-ed3aeb66deaa63ba.rlib" "/candle-core/target/debug/deps/libfnv-1c9d80d36ddc3b78.rlib" "/candle-core/target/debug/deps/libpin_project-41847709bfbd91d1.rlib" "/candle-core/target/debug/deps/libfutures_channel-36437ab1685b9310.rlib" "/candle-core/target/debug/deps/libfutures_sink-78a50a88f4a1a030.rlib" "/candle-core/target/debug/deps/libfutures_core-2122870321a033cf.rlib" "/candle-core/target/debug/deps/libgloo_history-b5a28cc4b94a2944.rlib" "/candle-core/target/debug/deps/libserde_wasm_bindgen-fac028d94b6532ae.rlib" "/candle-core/target/debug/deps/libserde_urlencoded-77f62df8a45e518e.rlib" "/candle-core/target/debug/deps/libform_urlencoded-c4c03fefce78c394.rlib" "/candle-core/target/debug/deps/libpercent_encoding-a434dbd6df2699c7.rlib" "/candle-core/target/debug/deps/libthiserror-43e0a7119ded620d.rlib" "/candle-core/target/debug/deps/libgloo_file-4497c05486c7c5a4.rlib" 
"/candle-core/target/debug/deps/libgloo_events-3cd3b309714a51d9.rlib" "/candle-core/target/debug/deps/libgloo_dialogs-631b38113db3b0f8.rlib" "/candle-core/target/debug/deps/libgloo_console-f2a66b4f05ab9759.rlib" "/candle-core/target/debug/deps/libgloo_utils-67638707b3f3474d.rlib" "/candle-core/target/debug/deps/libserde_json-cba04ec4bea8cb8b.rlib" "/candle-core/target/debug/deps/libitoa-6976aa51a2dfe178.rlib" "/candle-core/target/debug/deps/libryu-5a33a811a2310270.rlib" "/candle-core/target/debug/deps/libserde-7c2d7c38a9101b39.rlib" "/candle-core/target/debug/deps/libslab-5cf5cec69fe46771.rlib" "/candle-core/target/debug/deps/libwasm_logger-51ce0aae60519f94.rlib" "/candle-core/target/debug/deps/libweb_sys-c757427a63066f3a.rlib" "/candle-core/target/debug/deps/libjs_sys-238abf276709f068.rlib" "/candle-core/target/debug/deps/libwasm_bindgen-4e9208842f41b165.rlib" "/candle-core/target/debug/deps/libcfg_if-a4e1ca4231ab3b1f.rlib" "/candle-core/target/debug/deps/liblog-cb484c551ef35a32.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-66d8041607d2929b.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-a57e2388c0aea9b1.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-dcd9be90ae2cb505.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-516789932d161b4e.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-1ff34b0cf871cb60.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-0c110dd0650d6cb7.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-a6e97aae2681ad8f.rlib" 
"/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-b93dac2525ec4d1e.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-ce1d65fb391ae98b.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-8933a2fb54d88492.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-306712ebb1ee1a3f.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-349c574f342b0d30.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-65c422a3ad95273d.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-7e6330a6c0cb9441.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-39c59240bfdfab27.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-e9d126c51bb8b2bb.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-5af394d9b1f07bdc.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-693a8f23970c5917.rlib" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-13fc9d1ed9c7a2bc.rlib" "-Wl,-Bdynamic" "-lflashattention" "-lcudart" "-lstdc++" "-lcuda" "-lnccl" "-lnvrtc" "-lcurand" "-lcublas" "-lcublasLt" "-lcudnn" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" 
"/candle-core/target/debug/deps/app-6dac90c0ce2f2fc0" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs"
  = note: /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_api-59d12f2bec85f63.o): relocation R_X86_64_32 against `.nvFatBinSegment' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim128_fp16_sm80-759fdfecd1f0ed1c.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi32ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi128ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim160_fp16_sm80-17db6cdd19f7f98b.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi128ELi32ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi160ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim192_fp16_sm80-3981fe996a7e8814.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim224_fp16_sm80-54d101fd022eab36.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim256_fp16_sm80-6bbb415157454ca9.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi256ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim32_fp16_sm80-3a7585e74a278dc3.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim64_fp16_sm80-a93563dad84e2972.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim96_fp16_sm80-791226771e2c8c97.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim128_bf16_sm80-f1ff254233809e96.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi32ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi128ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim160_bf16_sm80-b8e226bc00ecbaf1.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi128ELi32ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi160ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim192_bf16_sm80-f7453c8601d43b17.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim224_bf16_sm80-9b2b93dbac21043c.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim256_bf16_sm80-21dd0f7dd998e506.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi256ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim32_bf16_sm80-aca7d8fdce93ef53.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim64_bf16_sm80-eaa7ce7f57eb7351.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-core/target/debug/build/candle-flash-attn-7369692dce0d1687/out/libflashattention.a(flash_fwd_hdim96_bf16_sm80-f51ba409eb93ce41.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          collect2: error: ld returned 1 exit status

Thanks!!!

LaurentMazare commented 8 months ago

That seems more like a generic CUDA issue than something candle-specific. Quickly googling the error message, I found this issue, which suggested adding --compiler-options -fPIC to CUDA_NVCC_FLAGS (this environment variable makes it easy to add flags to the nvcc calls that are made when cargo builds the flash-attn kernels). Have you already tried something like this?

ivanbaldo commented 8 months ago

I tried compiling like this but still got the same error: CUDA_NVCC_FLAGS='--compiler-options -fPIC' cargo build --features cuda,cudnn,flash-attn,nccl

LaurentMazare commented 8 months ago

Maybe try -fPIE rather than -fPIC, as per the error message. Besides this, you should probably google the error, as I doubt it's actually candle-specific.

ivanbaldo commented 8 months ago

This didn't work either: CUDA_NVCC_FLAGS='--compiler-options -fPIE' cargo build --features cuda,cudnn,flash-attn,nccl

yinqiwen commented 7 months ago

candle-flash-attn uses build.rs to compile the CUDA code, but it does not read any environment variable like CUDA_NVCC_FLAGS. Currently, you have to add the flags --compiler-options -fPIC at https://github.com/huggingface/candle/blob/main/candle-flash-attn/build.rs#L63
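
For illustration, a build script that wanted to honor a variable such as CUDA_NVCC_FLAGS would have to read it explicitly. The following is only a minimal sketch of that idea, not candle's actual build.rs: the helper names `split_flags` and `extra_nvcc_flags` are hypothetical, and it assumes the flags are separated by whitespace with no quoting.

```rust
use std::env;

/// Hypothetical helper: split a whitespace-separated flag string into
/// individual arguments (no shell-style quoting is handled).
fn split_flags(raw: &str) -> Vec<String> {
    raw.split_whitespace().map(str::to_owned).collect()
}

/// Hypothetical helper: read extra nvcc flags from the named environment
/// variable. Returns an empty list when the variable is unset or is not
/// valid Unicode.
fn extra_nvcc_flags(var: &str) -> Vec<String> {
    env::var(var).map(|v| split_flags(&v)).unwrap_or_default()
}
```

A build script using something like this would pass the returned strings to its nvcc command, and would probably also print `cargo:rerun-if-env-changed=CUDA_NVCC_FLAGS` so that cargo rebuilds when the variable changes.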

ivanbaldo commented 7 months ago

Thanks @yinqiwen! As you suggested, after adding the following to that file, the compilation works:

        .arg("--compiler-options")
        .arg("-fPIC")

Maybe these options could be added by default?
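
As a rough sketch of what adding the options by default could look like, the snippet below builds an nvcc invocation with `std::process::Command` and always appends the two arguments from the fix above. It is an assumption-laden illustration, not the real candle-flash-attn build.rs: the helper names `nvcc_command` and `args_of` and the source and object paths are hypothetical, and the command is only constructed, never run.

```rust
use std::process::Command;

/// Hypothetical helper: build (but do not run) an nvcc invocation that
/// compiles one CUDA source file into an object file.
fn nvcc_command(src: &str, obj: &str) -> Command {
    let mut cmd = Command::new("nvcc");
    cmd.arg("-c")
        .arg(src)
        .arg("-o")
        .arg(obj)
        // Forward -fPIC to the host compiler so the resulting object is
        // position-independent and can be linked into a PIE executable.
        .arg("--compiler-options")
        .arg("-fPIC");
    cmd
}

/// Hypothetical helper: collect the command's arguments as strings,
/// e.g. for logging or inspection.
fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

Since position-independent objects can be linked into both PIE and non-PIE executables, always passing `-fPIC` should cover both the Rocky Linux and Ubuntu images from the Dockerfile, at the cost of slightly different host code generation.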