EricLBuehler / candle-vllm

Efficient platform for inference and serving local LLMs including an OpenAI-compatible API server.
MIT License

candle-flash-attn linking error with Red Hat based distributions #25

Closed ivanbaldo closed 6 months ago

ivanbaldo commented 7 months ago

I am trying to make the following (unfinished) Dockerfile work:

# Note: if building on a machine with a different GPU or no GPU, look up the compute capability at
# https://developer.nvidia.com/cuda-gpus and pass the value without the decimal point to
# CUDA_COMPUTE_CAP directly instead of the $(...), for example CUDA_COMPUTE_CAP=80 for an A100 and
# CUDA_COMPUTE_CAP=86 for an A10.
#
# docker build --build-arg USERID=$(id -u) --build-arg \
#   CUDA_COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv | tail -n1 | tr -d .) \
#   -t local/candle-vllm-bench .

# Select an available version from
# https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md:
FROM nvidia/cuda:12.3.1-devel-rockylinux9
ARG USERID=1000
ARG CUDA_COMPUTE_CAP
RUN yum install -y cargo libcudnn8-devel openssl-devel git && yum clean all && \
    rm -rf /var/cache/yum/*
RUN git clone https://github.com/EricLBuehler/candle-vllm
WORKDIR /candle-vllm
RUN cargo build --release --features cuda,cudnn,flash-attn,nccl
RUN adduser -u $USERID user
USER user

But it fails with:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "-m64" "/tmp/rustczqulH1/symbols.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.0.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.1.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.10.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.11.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.12.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.13.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.14.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.15.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.2.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.3.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.4.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.5.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.6.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.7.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.8.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.9.rcgu.o" 
"/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.4psql9m0o7iw6sqs.rcgu.o" "-Wl,--as-needed" "-L" "/candle-vllm/target/release/deps" "-L" "/candle-vllm/target/release/build/zstd-sys-51991617680764ab/out" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib" "-L" "/usr/lib64" "-L" "/candle-vllm/target/release/build/bzip2-sys-f7fb57a3f4e98cc1/out/lib" "-L" "/candle-vllm/target/release/build/ring-a59330cc6e943984/out" "-L" "/candle-vllm/target/release/build/lz4-sys-c90b3b6e3d6da391/out" "-L" "/candle-vllm/target/release/build/esaxx-rs-83f1f68488f360a8/out" "-L" "/candle-vllm/target/release/build/onig_sys-d0c2f3461f43020d/out" "-L" "/candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/candle-vllm/target/release/deps/libenv_logger-0f0fa188a1404846.rlib" "/candle-vllm/target/release/deps/libtermcolor-c53cf66b9b32e10f.rlib" "/candle-vllm/target/release/deps/libis_terminal-cdf9c5266fcbba03.rlib" "/candle-vllm/target/release/deps/librustix-a629012946c99e6d.rlib" "/candle-vllm/target/release/deps/liblinux_raw_sys-15bed2ca91cf42a8.rlib" "/candle-vllm/target/release/deps/libhumantime-1dc284c82c7f0559.rlib" "/candle-vllm/target/release/deps/libcandle_vllm-ce9b07d51787770c.rlib" "/candle-vllm/target/release/deps/libchrono-ae2c4cf3aacef826.rlib" "/candle-vllm/target/release/deps/libiana_time_zone-2bd86fbdc9e46a38.rlib" "/candle-vllm/target/release/deps/libhf_hub-b2415c762b503a90.rlib" "/candle-vllm/target/release/deps/libdirs-45aa89c180ae36f2.rlib" "/candle-vllm/target/release/deps/libdirs_sys-b0294348c2e4986c.rlib" "/candle-vllm/target/release/deps/liboption_ext-3db96de540040126.rlib" "/candle-vllm/target/release/deps/libureq-22a62ebb34562523.rlib" 
"/candle-vllm/target/release/deps/libnative_tls-addec962e00a97ff.rlib" "/candle-vllm/target/release/deps/libopenssl_probe-e135bf478bd9e62b.rlib" "/candle-vllm/target/release/deps/libopenssl-f7e740960c8b0b56.rlib" "/candle-vllm/target/release/deps/libforeign_types-434e4620cdd2963d.rlib" "/candle-vllm/target/release/deps/libforeign_types_shared-3cd91dddd8b3059a.rlib" "/candle-vllm/target/release/deps/libopenssl_sys-2724f2f05b6f6e71.rlib" "/candle-vllm/target/release/deps/libwebpki_roots-fb31dcc12f4e6db5.rlib" "/candle-vllm/target/release/deps/librustls-ca4a80b00d74d11d.rlib" "/candle-vllm/target/release/deps/libsct-d1a0a53864376724.rlib" "/candle-vllm/target/release/deps/libwebpki-8db93ee63982280a.rlib" "/candle-vllm/target/release/deps/libring-c45b21a3fb043429.rlib" "/candle-vllm/target/release/deps/libspin-a5bca8ced7fc453c.rlib" "/candle-vllm/target/release/deps/libuntrusted-766afbb3ef44c1d1.rlib" "/candle-vllm/target/release/deps/libcandle_lora_transformers-c49058ffb6d7068a.rlib" "/candle-vllm/target/release/deps/libtqdm-e47c7a840c2fc706.rlib" "/candle-vllm/target/release/deps/libcrossterm-f705860770d94db8.rlib" "/candle-vllm/target/release/deps/libsignal_hook_mio-ec3a5a299cc915e5.rlib" "/candle-vllm/target/release/deps/libsignal_hook-49df15a2181bf250.rlib" "/candle-vllm/target/release/deps/libanyhow-78648c12fa2eaee5.rlib" "/candle-vllm/target/release/deps/libcandle_lora-0543f2db3a02f6c2.rlib" "/candle-vllm/target/release/deps/libtrc-af4d2dc9e955d45c.rlib" "/candle-vllm/target/release/deps/libuuid-8e9abe15319c7747.rlib" "/candle-vllm/target/release/deps/libcandle_transformers-3a408f703fe757e5.rlib" "/candle-vllm/target/release/deps/libserde_plain-9edacf8e6b8b5e3b.rlib" "/candle-vllm/target/release/deps/libcandle_flash_attn-6ec38f8ed9aac30d.rlib" "/candle-vllm/target/release/deps/libdyn_fmt-ca01837b2f65b0b1.rlib" "/candle-vllm/target/release/deps/libfutures-813f484dc1c71e4c.rlib" "/candle-vllm/target/release/deps/libfutures_executor-cdd38bae408d4ce8.rlib" 
"/candle-vllm/target/release/deps/libcandle_sampling-07b86ed24f500345.rlib" "/candle-vllm/target/release/deps/libcandle_nn-3eaedbdadbe5fbb5.rlib" "/candle-vllm/target/release/deps/libtokenizers-61b7f12c56fed2c5.rlib" "/candle-vllm/target/release/deps/libesaxx_rs-c3b0fa8f52cc413c.rlib" "/candle-vllm/target/release/deps/libunicode_normalization_alignments-025da513407d9879.rlib" "/candle-vllm/target/release/deps/libspm_precompiled-8a5e3784a84b6fa0.rlib" "/candle-vllm/target/release/deps/libbase64-a00060132962802d.rlib" "/candle-vllm/target/release/deps/libunicode_segmentation-0609f6ce0b27032d.rlib" "/candle-vllm/target/release/deps/libnom-828591b7d6e9f08d.rlib" "/candle-vllm/target/release/deps/libunicode_categories-4b2d8309eb580595.rlib" "/candle-vllm/target/release/deps/libmonostate-121edb8fb43689e8.rlib" "/candle-vllm/target/release/deps/libmacro_rules_attribute-fbe2172e90fd6d9d.rlib" "/candle-vllm/target/release/deps/libindicatif-5ac26ff2181c3839.rlib" "/candle-vllm/target/release/deps/libportable_atomic-37fa7d733d3c2283.rlib" "/candle-vllm/target/release/deps/libnumber_prefix-fcbd61cd7f0fb674.rlib" "/candle-vllm/target/release/deps/libconsole-927989bf813852d8.rlib" "/candle-vllm/target/release/deps/libunicode_width-4a01194dbfae8c91.rlib" "/candle-vllm/target/release/deps/librayon_cond-ec5fdcb09b40065c.rlib" "/candle-vllm/target/release/deps/libitertools-87b264833edf6f52.rlib" "/candle-vllm/target/release/deps/libonig-40dabd6ed5124b91.rlib" "/candle-vllm/target/release/deps/libonig_sys-90597c1391bce008.rlib" "/candle-vllm/target/release/deps/libderive_builder-3471ddeab47c0b9a.rlib" "/candle-vllm/target/release/deps/liblazy_static-852800890c81fb22.rlib" "/candle-vllm/target/release/deps/libclap-23394ec333e54596.rlib" "/candle-vllm/target/release/deps/libclap_builder-41cde94296fdb820.rlib" "/candle-vllm/target/release/deps/libstrsim-bfb3799e9677cd4d.rlib" "/candle-vllm/target/release/deps/libanstream-d284661ab137b824.rlib" 
"/candle-vllm/target/release/deps/libanstyle_query-d08e7c102e46eb49.rlib" "/candle-vllm/target/release/deps/libcolorchoice-d9fe16d50a3dd803.rlib" "/candle-vllm/target/release/deps/libanstyle_parse-6ac7d6e179081361.rlib" "/candle-vllm/target/release/deps/libutf8parse-86e737e0d4678582.rlib" "/candle-vllm/target/release/deps/libclap_lex-3a6b7689365ae37a.rlib" "/candle-vllm/target/release/deps/libanstyle-9a261b265642b8a4.rlib" "/candle-vllm/target/release/deps/libcandle_core-d2f01b6e6a29d888.rlib" "/candle-vllm/target/release/deps/libmemmap2-4476da1f91fb3603.rlib" "/candle-vllm/target/release/deps/libzip-9bf92410c307c36c.rlib" "/candle-vllm/target/release/deps/libpbkdf2-bfe2a8675cfe3dd6.rlib" "/candle-vllm/target/release/deps/libsha2-7f594f901cd89567.rlib" "/candle-vllm/target/release/deps/libpassword_hash-2fa33ff8d4990779.rlib" "/candle-vllm/target/release/deps/libbase64ct-760f27bcfd4054ae.rlib" "/candle-vllm/target/release/deps/libzstd-bafef58bb20c82a7.rlib" "/candle-vllm/target/release/deps/libzstd_safe-2c41e8f78c52fdfc.rlib" "/candle-vllm/target/release/deps/libbzip2-b94c5c5e7c15f010.rlib" "/candle-vllm/target/release/deps/libbzip2_sys-a158ea0d0289b351.rlib" "/candle-vllm/target/release/deps/libaes-dc1bc8251226040a.rlib" "/candle-vllm/target/release/deps/libcipher-eeb8ea70098f4f7f.rlib" "/candle-vllm/target/release/deps/libinout-5e79d2c693701e41.rlib" "/candle-vllm/target/release/deps/libhmac-246f344022381f5d.rlib" "/candle-vllm/target/release/deps/libconstant_time_eq-742a8ca43fc4b3c6.rlib" "/candle-vllm/target/release/deps/libyoke-b5cb326284cb506c.rlib" "/candle-vllm/target/release/deps/libzerofrom-72df68927b68a064.rlib" "/candle-vllm/target/release/deps/libstable_deref_trait-76725faa25d9c59b.rlib" "/candle-vllm/target/release/deps/libthiserror-7cc4f2a96da73a94.rlib" "/candle-vllm/target/release/deps/libsafetensors-b94965e86f7ef122.rlib" "/candle-vllm/target/release/deps/libcudarc-bb4cc1d0d1d68ba3.rlib" 
"/candle-vllm/target/release/deps/libcandle_kernels-af06d5fd4a087af6.rlib" "/candle-vllm/target/release/deps/libgemm-9939fb772d1ff792.rlib" "/candle-vllm/target/release/deps/libgemm_c32-cba446e570d4386d.rlib" "/candle-vllm/target/release/deps/libgemm_c64-701b72db790c5491.rlib" "/candle-vllm/target/release/deps/libgemm_f64-132035f8fb79f58d.rlib" "/candle-vllm/target/release/deps/libgemm_f16-a17195123a2b5a97.rlib" "/candle-vllm/target/release/deps/libgemm_f32-43dd1a29089d0d80.rlib" "/candle-vllm/target/release/deps/libgemm_common-888ab4912d03277a.rlib" "/candle-vllm/target/release/deps/libpulp-c51f68967478b6aa.rlib" "/candle-vllm/target/release/deps/libnum_complex-9293d6ad98d7b1c3.rlib" "/candle-vllm/target/release/deps/libdyn_stack-e01f3657ea7d975f.rlib" "/candle-vllm/target/release/deps/libreborrow-77659d577c4b718c.rlib" "/candle-vllm/target/release/deps/libraw_cpuid-b9cfe85e371d3083.rlib" "/candle-vllm/target/release/deps/librayon-7e6c7f8c76536947.rlib" "/candle-vllm/target/release/deps/librayon_core-2fef7474b3331466.rlib" "/candle-vllm/target/release/deps/libcrossbeam_deque-f3876680669c2c7d.rlib" "/candle-vllm/target/release/deps/libcrossbeam_epoch-d5f20c1ae49163b7.rlib" "/candle-vllm/target/release/deps/libmemoffset-b4fab92a5d1a5e30.rlib" "/candle-vllm/target/release/deps/libcrossbeam_utils-1d67d2d362ef675e.rlib" "/candle-vllm/target/release/deps/libeither-c016b57e73ba30c1.rlib" "/candle-vllm/target/release/deps/libbyteorder-8bf78fc69cf5b0a1.rlib" "/candle-vllm/target/release/deps/libhalf-82866db1aa6c7f3e.rlib" "/candle-vllm/target/release/deps/librand_distr-b111214f51586c69.rlib" "/candle-vllm/target/release/deps/libnum_traits-28ee9b33f1e53f29.rlib" "/candle-vllm/target/release/deps/libbytemuck-7eee2fa1f516b4ce.rlib" "/candle-vllm/target/release/deps/libactix_web-0a08fb87679df924.rlib" "/candle-vllm/target/release/deps/liburl-1bbf839f22bd1732.rlib" "/candle-vllm/target/release/deps/libidna-fb425d18121613f1.rlib" 
"/candle-vllm/target/release/deps/libunicode_normalization-7972d0be1c38ac31.rlib" "/candle-vllm/target/release/deps/libtinyvec-61debd23e06e16bf.rlib" "/candle-vllm/target/release/deps/libtinyvec_macros-f326b6a6f0ca8a7b.rlib" "/candle-vllm/target/release/deps/libunicode_bidi-9dc6f963fdeb5a21.rlib" "/candle-vllm/target/release/deps/libserde_urlencoded-9f88ee3d21b5ec1b.rlib" "/candle-vllm/target/release/deps/libform_urlencoded-3e169fc285508f2a.rlib" "/candle-vllm/target/release/deps/libserde_json-2daaa0f082f50c3a.rlib" "/candle-vllm/target/release/deps/libryu-8b05c69dcf279a6f.rlib" "/candle-vllm/target/release/deps/libactix_server-e79c728840296968.rlib" "/candle-vllm/target/release/deps/libactix_router-48a733d95bd3dd5e.rlib" "/candle-vllm/target/release/deps/libregex-c78c6a0d40f8f119.rlib" "/candle-vllm/target/release/deps/libregex_automata-3822bb291a95f096.rlib" "/candle-vllm/target/release/deps/libaho_corasick-6f9c3d032c4f562f.rlib" "/candle-vllm/target/release/deps/libregex_syntax-3dd804a409b2c545.rlib" "/candle-vllm/target/release/deps/libserde-23513cb3b07422f8.rlib" "/candle-vllm/target/release/deps/libcookie-30bd32d9b0d08b83.rlib" "/candle-vllm/target/release/deps/libtime-bc85cd6997494558.rlib" "/candle-vllm/target/release/deps/libtime_core-531fb2a2b6009484.rlib" "/candle-vllm/target/release/deps/libderanged-5409594f6406082d.rlib" "/candle-vllm/target/release/deps/libpowerfmt-c4543fc1903272c6.rlib" "/candle-vllm/target/release/deps/libactix_http-f7b0baf59fd7bb10.rlib" "/candle-vllm/target/release/deps/librand-aa6ddb6627b48b96.rlib" "/candle-vllm/target/release/deps/librand_chacha-fa47a10cc5e59439.rlib" "/candle-vllm/target/release/deps/libppv_lite86-9a645f708eed4e1c.rlib" "/candle-vllm/target/release/deps/librand_core-479671a2b8263665.rlib" "/candle-vllm/target/release/deps/libhttparse-699e93ce2c2e7905.rlib" "/candle-vllm/target/release/deps/libbrotli-df4299509820f939.rlib" "/candle-vllm/target/release/deps/libbrotli_decompressor-0212e4cdb0da1245.rlib" 
"/candle-vllm/target/release/deps/liballoc_stdlib-fc777d5f3c59a235.rlib" "/candle-vllm/target/release/deps/liballoc_no_stdlib-f497a54db348ea9b.rlib" "/candle-vllm/target/release/deps/libhttpdate-5f8e81ac577420b0.rlib" "/candle-vllm/target/release/deps/libsha1-ad6469ba6b8b2240.rlib" "/candle-vllm/target/release/deps/libcpufeatures-dcef25221428931f.rlib" "/candle-vllm/target/release/deps/libdigest-f32a2ccccbd945ab.rlib" "/candle-vllm/target/release/deps/libsubtle-910e19b9d08b2799.rlib" "/candle-vllm/target/release/deps/libblock_buffer-2ad0dde06bca4c37.rlib" "/candle-vllm/target/release/deps/libcrypto_common-30c46997c474a2db.rlib" "/candle-vllm/target/release/deps/libgeneric_array-95ff38f8e6dc2014.rlib" "/candle-vllm/target/release/deps/libtypenum-ddf8574aa94ffabe.rlib" "/candle-vllm/target/release/deps/libbase64-daaf16d87f9b4835.rlib" "/candle-vllm/target/release/deps/liblocal_channel-5501da97fbe12c8a.rlib" "/candle-vllm/target/release/deps/libbytestring-4d1e0f611bab987e.rlib" "/candle-vllm/target/release/deps/libencoding_rs-c048082deb3a71c3.rlib" "/candle-vllm/target/release/deps/liblanguage_tags-e0dfc52f86f9b27a.rlib" "/candle-vllm/target/release/deps/libahash-a28674307e9664ad.rlib" "/candle-vllm/target/release/deps/libgetrandom-b24cab7002c3530b.rlib" "/candle-vllm/target/release/deps/libzerocopy-63825396d720b9a6.rlib" "/candle-vllm/target/release/deps/libmime-04e6f00618993e67.rlib" "/candle-vllm/target/release/deps/libpercent_encoding-d54414372a2980de.rlib" "/candle-vllm/target/release/deps/libh2-27cdaea5e3d2147c.rlib" "/candle-vllm/target/release/deps/libindexmap-fcdde0ade0e1bfe3.rlib" "/candle-vllm/target/release/deps/libequivalent-8a25e166243cfe94.rlib" "/candle-vllm/target/release/deps/libhashbrown-aee95c0614bccf63.rlib" "/candle-vllm/target/release/deps/libfutures_util-98b8b67b3d434750.rlib" "/candle-vllm/target/release/deps/libfutures_io-bbce8973c99e7ece.rlib" "/candle-vllm/target/release/deps/libslab-490ef311b9a84e0e.rlib" 
"/candle-vllm/target/release/deps/libfutures_channel-6d294bf595dec06a.rlib" "/candle-vllm/target/release/deps/libfutures_task-0a7c23a0933dbcaa.rlib" "/candle-vllm/target/release/deps/libpin_utils-185c55cbe9ca2fff.rlib" "/candle-vllm/target/release/deps/libbitflags-1029aec9c38cde73.rlib" "/candle-vllm/target/release/deps/libzstd-242538c7759a4fa6.rlib" "/candle-vllm/target/release/deps/libzstd_safe-d25e92a1d04503ec.rlib" "/candle-vllm/target/release/deps/libzstd_sys-a6ec9cf883e86b56.rlib" "/candle-vllm/target/release/deps/libflate2-b67596bfbb64de8d.rlib" "/candle-vllm/target/release/deps/libminiz_oxide-2b969af90226827f.rlib" "/candle-vllm/target/release/deps/libsimd_adler32-d1dbd8e6b06bf162.rlib" "/candle-vllm/target/release/deps/libcrc32fast-ceb628e76fc0bab0.rlib" "/candle-vllm/target/release/deps/libactix_service-dfc20131f5ba36d4.rlib" "/candle-vllm/target/release/deps/libactix_codec-f3cae536aed1196d.rlib" "/candle-vllm/target/release/deps/libtokio_util-88b2eabf4483c1ed.rlib" "/candle-vllm/target/release/deps/libtracing-9e7a6177765350ac.rlib" "/candle-vllm/target/release/deps/libtracing_core-c5e9157560beafe6.rlib" "/candle-vllm/target/release/deps/libonce_cell-4b31816a5aa6274f.rlib" "/candle-vllm/target/release/deps/libmemchr-38d4fc2a3522aa15.rlib" "/candle-vllm/target/release/deps/libfutures_sink-78114cacf22202c2.rlib" "/candle-vllm/target/release/deps/libbitflags-b9815c55ec510696.rlib" "/candle-vllm/target/release/deps/libactix_utils-ec862be5af373362.rlib" "/candle-vllm/target/release/deps/liblocal_waker-7857496d2dec9a57.rlib" "/candle-vllm/target/release/deps/libactix_rt-0ffc3a15823d1322.rlib" "/candle-vllm/target/release/deps/libtokio-b67279acab90ede3.rlib" "/candle-vllm/target/release/deps/libsignal_hook_registry-a773ced30481d3cb.rlib" "/candle-vllm/target/release/deps/libnum_cpus-fbaf57124b2a0166.rlib" "/candle-vllm/target/release/deps/libsocket2-8e37cfa1c7015c6b.rlib" "/candle-vllm/target/release/deps/libmio-81de974463968f98.rlib" 
"/candle-vllm/target/release/deps/liblog-35f97248cb2ec82c.rlib" "/candle-vllm/target/release/deps/libparking_lot-e183fcd4a13bd183.rlib" "/candle-vllm/target/release/deps/libparking_lot_core-5fbb54b30e35e540.rlib" "/candle-vllm/target/release/deps/liblibc-d38dc52f94735460.rlib" "/candle-vllm/target/release/deps/libcfg_if-88c619515d65e3f1.rlib" "/candle-vllm/target/release/deps/libsmallvec-e35ec471a6514672.rlib" "/candle-vllm/target/release/deps/liblock_api-920512de5989abb2.rlib" "/candle-vllm/target/release/deps/libscopeguard-6208b4062bcdc2b1.rlib" "/candle-vllm/target/release/deps/libpin_project_lite-42a553ee08f02ebb.rlib" "/candle-vllm/target/release/deps/libfutures_core-b87582f06d7f1343.rlib" "/candle-vllm/target/release/deps/libhttp-b738399ec4ab1c60.rlib" "/candle-vllm/target/release/deps/libitoa-dcbca83b54db3306.rlib" "/candle-vllm/target/release/deps/libbytes-8c2bf1b211f72910.rlib" "/candle-vllm/target/release/deps/libfnv-ffe196e20ea2a648.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-9c342d6596ca77d8.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-35e6faa0abf08dd1.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-6242b5524a2684de.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-94511439d510df36.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-1923a594ddedab24.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-5b476927cd520d76.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-6b4664d28b4dc07b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-4d7e14ee42b44abc.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-94e04d08d317eb2b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-7e3a1db27b23a8ee.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-0651af3c34a1e4b9.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-e5da8ecb95d2de36.rlib" 
"/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-052b86aa844a2857.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-bbd2a157557b773d.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f47279717d0e1831.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-d30e243a979711ec.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-18929aabe36e3f57.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-f9f41fbdedfbfafb.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-b26982894e484f03.rlib" "-Wl,-Bdynamic" "-lssl" "-lcrypto" "-lflashattention" "-lcudart" "-lstdc++" "-lstdc++" "-lcuda" "-lnccl" "-lnvrtc" "-lcurand" "-lcublas" "-lcublasLt" "-lcudnn" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs"
  = note: /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_api.o): relocation R_X86_64_32 against `.nvFatBinSegment' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim128_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi128ELi128ELi64ELi4ES2_EELb0ELb0ELb1ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim160_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi160ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim192_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim224_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim256_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi256ELi64ELi64ELi4ES2_EELb0ELb0ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim32_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim64_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim96_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim128_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi128ELi128ELi64ELi4ES2_EELb0ELb0ELb1ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim160_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi160ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim192_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim224_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim256_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi256ELi64ELi64ELi4ES2_EELb0ELb0ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim32_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim64_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim96_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          collect2: error: ld returned 1 exit status

error: could not compile `candle-vllm` (bin "candle-vllm") due to previous error
[root@95e9d872d994 candle-vllm]# PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
[root@95e9d872d994 candle-vllm]# command -v cc
/usr/bin/cc
[root@95e9d872d994 candle-vllm]# cc --version
cc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2)

Maybe I shouldn't use the flash-attn feature? Thanks for any suggestions or information.

EricLBuehler commented 7 months ago

To me, this looks like a candle flash attention compilation error. However, it may be because I compile CUDA kernels, too. Could you try compilation without flash-attn and let me know if that breaks?
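
For reference, the "recompile with -fPIE" notes in the log mean that the objects inside libflashattention.a were built as non-position-independent code, while the final cc invocation links with -pie. The usual remedy is to have nvcc forward -fPIC to the host compiler. A minimal sketch, assuming a hypothetical build-script helper (nvcc_command and the file names are illustrative, not candle-flash-attn's actual build code):

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper (not candle's actual build script): assembles the
/// nvcc invocation for one CUDA source file without running it.
fn nvcc_command(src: &Path, obj: &Path, compute_cap: u32) -> Command {
    let mut cmd = Command::new("nvcc");
    cmd.arg(format!("--gpu-architecture=sm_{compute_cap}"))
        .arg("-c")
        .arg(src)
        .arg("-o")
        .arg(obj)
        // Forward -fPIC to the host compiler so the resulting object can be
        // linked into a position-independent executable ("-pie" in the log).
        .arg("-Xcompiler")
        .arg("-fPIC");
    cmd
}

/// Collects the arguments as strings so they can be printed or inspected.
fn nvcc_args(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    // File names are placeholders; 80 matches the sm80 objects in the log.
    let cmd = nvcc_command(Path::new("flash_api.cu"), Path::new("flash_api.o"), 80);
    println!("nvcc {}", nvcc_args(&cmd).join(" "));
}
```

Whether this alone resolves the failure on Rocky Linux is untested here; it only illustrates where the flag would go.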

ivanbaldo commented 7 months ago

Thanks Eric!

Restarting from scratch with RUN cargo build --release --features cuda,cudnn,nccl fails in exactly the same way; it seems to pull in that dependency anyway.

cargo update shows: Updating candle-flash-attn v0.3.2 (https://github.com/huggingface/candle.git#94817dac) -> #9e824ec8. The changes are listed at https://github.com/huggingface/candle/compare/94817dac..9e824ec8 and I will give the update a try tomorrow (slow laptop here...).

Thanks for your help!!!

ivanbaldo commented 7 months ago

Hello! cargo update didn't work; this time the build fails with this error:

error[E0277]: expected a `Fn<(&candle_core::Tensor,)>` closure, found `BatchNorm`
    --> /root/.cargo/git/checkouts/candle-lora-e71fb47097131b72/8b516d4/candle-lora-transformers/src/resnet.rs:87:61
     |
87   |         Ok(UnsyncFunc::new(move |xs| xs.apply(&conv)?.apply(&bn)))
     |                                                       ----- ^^^ expected an `Fn<(&candle_core::Tensor,)>` closure, found `BatchNorm`
     |                                                       |
     |                                                       required by a bound introduced by this call
     |
     = help: the trait `for<'a> Fn<(&'a candle_core::Tensor,)>` is not implemented for `BatchNorm`
     = help: the following other types implement trait `Module`:
               AttentionBlock
               ClipTextTransformer
               Conv1d
               ConvTranspose1d
               ConvTranspose2d
               DownEncoderBlock2D
               EfficientNet
               Func<'a>
             and 52 others
     = note: required for `BatchNorm` to implement `Module`
note: required by a bound in `candle_core::Tensor::apply`
    --> /root/.cargo/git/checkouts/candle-0c2b4fa9e5801351/9e824ec/candle-core/src/tensor.rs:2337:21
     |
2337 |     pub fn apply<M: crate::Module>(&self, m: &M) -> Result<Self> {
     |                     ^^^^^^^^^^^^^ required by this bound in `Tensor::apply`

So it seems some APIs changed incompatibly, and the code that uses them needs to be updated. Thanks.
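One way to read the error above, as a minimal self-contained sketch rather than candle's actual source: the message talks about a `Fn` closure because, when a type does not implement `Module` directly, the only remaining candidate the compiler can name is a blanket impl of `Module` for closures. The sketch assumes (not confirmed in this thread) that `BatchNorm` at the newer commit implements only a train-aware trait and is applied through a train-aware method. `Tensor`, `Module`, `ModuleT`, `Scale`, `Shift` and `conv_then_norm` below are stand-ins defined here, not the real candle types.

```rust
// Stand-in types that only mimic the shape of the API in the error message.
// They are NOT the real candle crate; every name here is illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Tensor(pub Vec<f32>);

/// Inference-only layer interface.
pub trait Module {
    fn forward(&self, xs: &Tensor) -> Result<Tensor, String>;
}

/// Layer interface that also receives a train/eval flag.
pub trait ModuleT {
    fn forward_t(&self, xs: &Tensor, train: bool) -> Result<Tensor, String>;
}

// Blanket impl: any closure over &Tensor counts as a Module.
// A type with no direct Module impl is therefore checked against this bound,
// which is why the diagnostic says "expected a Fn closure".
impl<F: Fn(&Tensor) -> Result<Tensor, String>> Module for F {
    fn forward(&self, xs: &Tensor) -> Result<Tensor, String> {
        self(xs)
    }
}

impl Tensor {
    pub fn apply<M: Module>(&self, m: &M) -> Result<Tensor, String> {
        m.forward(self)
    }

    pub fn apply_t<M: ModuleT>(&self, m: &M, train: bool) -> Result<Tensor, String> {
        m.forward_t(self, train)
    }
}

/// Stand-in for a conv layer: implements Module.
pub struct Scale(pub f32);

impl Module for Scale {
    fn forward(&self, xs: &Tensor) -> Result<Tensor, String> {
        Ok(Tensor(xs.0.iter().map(|v| v * self.0).collect()))
    }
}

/// Stand-in for a batch-norm layer: implements only ModuleT.
pub struct Shift(pub f32);

impl ModuleT for Shift {
    fn forward_t(&self, xs: &Tensor, _train: bool) -> Result<Tensor, String> {
        Ok(Tensor(xs.0.iter().map(|v| v + self.0).collect()))
    }
}

/// `xs.apply(conv)?.apply(bn)` would not compile here, because Shift is
/// neither a Module nor a closure. The train-aware call does compile.
pub fn conv_then_norm(xs: &Tensor, conv: &Scale, bn: &Shift) -> Result<Tensor, String> {
    xs.apply(conv)?.apply_t(bn, false)
}
```

In this sketch the fix is a one-word change at the call site (`apply` to `apply_t` with an explicit `train` flag); whether the real fix had the same shape is not stated in this thread.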

EricLBuehler commented 7 months ago

Thank you for notifying me. My CI infra does not run periodically, so it did not catch this. Please see huggingface/candle#1647, and feel free to add anything.

ivanbaldo commented 7 months ago

I have now tried running cargo update -p candle-flash-attn, but it fails in the same way as cargo update. Thanks for reporting the issue to candle, I will monitor it! I guess we need to wait for a fix there first and then see whether everything works correctly with the updates.

EricLBuehler commented 7 months ago

Yes, the main problem is that I cannot find the trait bound causing the issue.

EricLBuehler commented 7 months ago

candle-lora is used once, below, so I will look into removing it as a dependency. https://github.com/EricLBuehler/candle-vllm/blob/9b0d89f1354cd52495162c65293fba10eff717c9/src/openai/pipelines/llama.rs#L25

ivanbaldo commented 7 months ago

Meanwhile I am trying these changes, just in case:

diff --git a/Cargo.toml b/Cargo.toml
index f159df4..95b9e90 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -8,17 +8,17 @@ edition = "2021"
 [dependencies]
 actix-web = "4.4.0"
 anyhow = "1.0.75"
-candle-core = { git = "https://github.com/huggingface/candle.git", version = "0.3.0" }
-candle-examples = { git = "https://github.com/huggingface/candle.git", version = "0.3.0" }
+candle-core = "0.3.3"
+candle-examples = "0.3.2"
 candle-lora = { git = "https://github.com/EricLBuehler/candle-lora.git", version = "0.2.0" }
 candle-lora-macro = { git = "https://github.com/EricLBuehler/candle-lora.git", version = "0.2.0" }
 candle-lora-transformers = { git = "https://github.com/EricLBuehler/candle-lora.git", version = "0.2.0" }
-candle-nn = { git = "https://github.com/huggingface/candle.git", version = "0.3.0" }
+candle-nn = "0.3.3"
 dyn-fmt = "0.4.0"
 serde = { version = "1.0.190", features = ["serde_derive"] }
 tokenizers = "0.15.0"
 uuid = { version = "1.5.0", features = ["v4"] }
-candle-transformers = { git = "https://github.com/huggingface/candle.git", version = "0.3.0" }
+candle-transformers = "0.3.3"
 hf-hub = "0.3.2"
 serde_json = "1.0.108"
 derive_more = "0.99.17"
@@ -26,7 +26,7 @@ accelerate-src = { version = "0.3.2", optional = true }
 intel-mkl-src = { version = "0.8.1", features = ["mkl-static-lp64-iomp"], optional = true }
 cudarc = { version = "0.9.14", features = ["f16"], optional = true }
 half = { version = "2.3.1", features = ["num-traits", "use-intrinsics", "rand_distr"] }
-candle-flash-attn = { git = "https://github.com/huggingface/candle.git", version = "0.3.0", optional = true }
+candle-flash-attn = { version = "0.3.3", optional = true }
 clap = { version = "4.4.7", features = ["derive"] }
 candle-sampling = { git = "https://github.com/EricLBuehler/candle-sampling.git", version = "0.2.0" }
 futures = "0.3.29"

EricLBuehler commented 7 months ago

I fixed that bug - it was on the candle-lora side. Could you try again with the original Cargo.toml, after running cargo update?

EricLBuehler commented 7 months ago

Looks like there is a dependency bug - I just pushed a fix.

ivanbaldo commented 7 months ago

Of course, will try again and let you know, this will take about 50 minutes or so. Thanks so much!!!

EricLBuehler commented 7 months ago

Ok, sounds good!

ivanbaldo commented 7 months ago

I rebuilt the container from scratch (new git clone etc.) but unfortunately it failed again with the same linking error:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "-m64" "/tmp/rustcN3D4xH/symbols.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.0.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.1.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.10.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.11.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.12.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.13.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.14.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.15.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.2.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.3.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.4.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.5.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.6.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.7.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.8.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.9.rcgu.o" 
"/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.4psql9m0o7iw6sqs.rcgu.o" "-Wl,--as-needed" "-L" "/candle-vllm/target/release/deps" "-L" "/candle-vllm/target/release/build/zstd-sys-51991617680764ab/out" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib" "-L" "/usr/lib64" "-L" "/candle-vllm/target/release/build/bzip2-sys-f7fb57a3f4e98cc1/out/lib" "-L" "/candle-vllm/target/release/build/ring-a59330cc6e943984/out" "-L" "/candle-vllm/target/release/build/lz4-sys-c90b3b6e3d6da391/out" "-L" "/candle-vllm/target/release/build/esaxx-rs-83f1f68488f360a8/out" "-L" "/candle-vllm/target/release/build/onig_sys-d0c2f3461f43020d/out" "-L" "/candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/candle-vllm/target/release/deps/libenv_logger-0f0fa188a1404846.rlib" "/candle-vllm/target/release/deps/libtermcolor-c53cf66b9b32e10f.rlib" "/candle-vllm/target/release/deps/libis_terminal-cdf9c5266fcbba03.rlib" "/candle-vllm/target/release/deps/librustix-a629012946c99e6d.rlib" "/candle-vllm/target/release/deps/liblinux_raw_sys-15bed2ca91cf42a8.rlib" "/candle-vllm/target/release/deps/libhumantime-1dc284c82c7f0559.rlib" "/candle-vllm/target/release/deps/libcandle_vllm-ce9b07d51787770c.rlib" "/candle-vllm/target/release/deps/libchrono-ae2c4cf3aacef826.rlib" "/candle-vllm/target/release/deps/libiana_time_zone-2bd86fbdc9e46a38.rlib" "/candle-vllm/target/release/deps/libhf_hub-b2415c762b503a90.rlib" "/candle-vllm/target/release/deps/libdirs-45aa89c180ae36f2.rlib" "/candle-vllm/target/release/deps/libdirs_sys-b0294348c2e4986c.rlib" "/candle-vllm/target/release/deps/liboption_ext-3db96de540040126.rlib" "/candle-vllm/target/release/deps/libureq-22a62ebb34562523.rlib" 
"/candle-vllm/target/release/deps/libnative_tls-addec962e00a97ff.rlib" "/candle-vllm/target/release/deps/libopenssl_probe-e135bf478bd9e62b.rlib" "/candle-vllm/target/release/deps/libopenssl-f7e740960c8b0b56.rlib" "/candle-vllm/target/release/deps/libforeign_types-434e4620cdd2963d.rlib" "/candle-vllm/target/release/deps/libforeign_types_shared-3cd91dddd8b3059a.rlib" "/candle-vllm/target/release/deps/libopenssl_sys-2724f2f05b6f6e71.rlib" "/candle-vllm/target/release/deps/libwebpki_roots-fb31dcc12f4e6db5.rlib" "/candle-vllm/target/release/deps/librustls-ca4a80b00d74d11d.rlib" "/candle-vllm/target/release/deps/libsct-d1a0a53864376724.rlib" "/candle-vllm/target/release/deps/libwebpki-8db93ee63982280a.rlib" "/candle-vllm/target/release/deps/libring-c45b21a3fb043429.rlib" "/candle-vllm/target/release/deps/libspin-a5bca8ced7fc453c.rlib" "/candle-vllm/target/release/deps/libuntrusted-766afbb3ef44c1d1.rlib" "/candle-vllm/target/release/deps/libcandle_lora_transformers-c49058ffb6d7068a.rlib" "/candle-vllm/target/release/deps/libtqdm-e47c7a840c2fc706.rlib" "/candle-vllm/target/release/deps/libcrossterm-f705860770d94db8.rlib" "/candle-vllm/target/release/deps/libsignal_hook_mio-ec3a5a299cc915e5.rlib" "/candle-vllm/target/release/deps/libsignal_hook-49df15a2181bf250.rlib" "/candle-vllm/target/release/deps/libanyhow-78648c12fa2eaee5.rlib" "/candle-vllm/target/release/deps/libcandle_lora-0543f2db3a02f6c2.rlib" "/candle-vllm/target/release/deps/libtrc-af4d2dc9e955d45c.rlib" "/candle-vllm/target/release/deps/libuuid-8e9abe15319c7747.rlib" "/candle-vllm/target/release/deps/libcandle_transformers-3a408f703fe757e5.rlib" "/candle-vllm/target/release/deps/libserde_plain-9edacf8e6b8b5e3b.rlib" "/candle-vllm/target/release/deps/libcandle_flash_attn-6ec38f8ed9aac30d.rlib" "/candle-vllm/target/release/deps/libdyn_fmt-ca01837b2f65b0b1.rlib" "/candle-vllm/target/release/deps/libfutures-813f484dc1c71e4c.rlib" "/candle-vllm/target/release/deps/libfutures_executor-cdd38bae408d4ce8.rlib" 
"/candle-vllm/target/release/deps/libcandle_sampling-07b86ed24f500345.rlib" "/candle-vllm/target/release/deps/libcandle_nn-3eaedbdadbe5fbb5.rlib" "/candle-vllm/target/release/deps/libtokenizers-61b7f12c56fed2c5.rlib" "/candle-vllm/target/release/deps/libesaxx_rs-c3b0fa8f52cc413c.rlib" "/candle-vllm/target/release/deps/libunicode_normalization_alignments-025da513407d9879.rlib" "/candle-vllm/target/release/deps/libspm_precompiled-8a5e3784a84b6fa0.rlib" "/candle-vllm/target/release/deps/libbase64-a00060132962802d.rlib" "/candle-vllm/target/release/deps/libunicode_segmentation-0609f6ce0b27032d.rlib" "/candle-vllm/target/release/deps/libnom-828591b7d6e9f08d.rlib" "/candle-vllm/target/release/deps/libunicode_categories-4b2d8309eb580595.rlib" "/candle-vllm/target/release/deps/libmonostate-121edb8fb43689e8.rlib" "/candle-vllm/target/release/deps/libmacro_rules_attribute-fbe2172e90fd6d9d.rlib" "/candle-vllm/target/release/deps/libindicatif-5ac26ff2181c3839.rlib" "/candle-vllm/target/release/deps/libportable_atomic-37fa7d733d3c2283.rlib" "/candle-vllm/target/release/deps/libnumber_prefix-fcbd61cd7f0fb674.rlib" "/candle-vllm/target/release/deps/libconsole-927989bf813852d8.rlib" "/candle-vllm/target/release/deps/libunicode_width-4a01194dbfae8c91.rlib" "/candle-vllm/target/release/deps/librayon_cond-ec5fdcb09b40065c.rlib" "/candle-vllm/target/release/deps/libitertools-87b264833edf6f52.rlib" "/candle-vllm/target/release/deps/libonig-40dabd6ed5124b91.rlib" "/candle-vllm/target/release/deps/libonig_sys-90597c1391bce008.rlib" "/candle-vllm/target/release/deps/libderive_builder-3471ddeab47c0b9a.rlib" "/candle-vllm/target/release/deps/liblazy_static-852800890c81fb22.rlib" "/candle-vllm/target/release/deps/libclap-23394ec333e54596.rlib" "/candle-vllm/target/release/deps/libclap_builder-41cde94296fdb820.rlib" "/candle-vllm/target/release/deps/libstrsim-bfb3799e9677cd4d.rlib" "/candle-vllm/target/release/deps/libanstream-d284661ab137b824.rlib" 
"/candle-vllm/target/release/deps/libanstyle_query-d08e7c102e46eb49.rlib" "/candle-vllm/target/release/deps/libcolorchoice-d9fe16d50a3dd803.rlib" "/candle-vllm/target/release/deps/libanstyle_parse-6ac7d6e179081361.rlib" "/candle-vllm/target/release/deps/libutf8parse-86e737e0d4678582.rlib" "/candle-vllm/target/release/deps/libclap_lex-3a6b7689365ae37a.rlib" "/candle-vllm/target/release/deps/libanstyle-9a261b265642b8a4.rlib" "/candle-vllm/target/release/deps/libcandle_core-d2f01b6e6a29d888.rlib" "/candle-vllm/target/release/deps/libmemmap2-4476da1f91fb3603.rlib" "/candle-vllm/target/release/deps/libzip-9bf92410c307c36c.rlib" "/candle-vllm/target/release/deps/libpbkdf2-bfe2a8675cfe3dd6.rlib" "/candle-vllm/target/release/deps/libsha2-7f594f901cd89567.rlib" "/candle-vllm/target/release/deps/libpassword_hash-2fa33ff8d4990779.rlib" "/candle-vllm/target/release/deps/libbase64ct-760f27bcfd4054ae.rlib" "/candle-vllm/target/release/deps/libzstd-bafef58bb20c82a7.rlib" "/candle-vllm/target/release/deps/libzstd_safe-2c41e8f78c52fdfc.rlib" "/candle-vllm/target/release/deps/libbzip2-b94c5c5e7c15f010.rlib" "/candle-vllm/target/release/deps/libbzip2_sys-a158ea0d0289b351.rlib" "/candle-vllm/target/release/deps/libaes-dc1bc8251226040a.rlib" "/candle-vllm/target/release/deps/libcipher-eeb8ea70098f4f7f.rlib" "/candle-vllm/target/release/deps/libinout-5e79d2c693701e41.rlib" "/candle-vllm/target/release/deps/libhmac-246f344022381f5d.rlib" "/candle-vllm/target/release/deps/libconstant_time_eq-742a8ca43fc4b3c6.rlib" "/candle-vllm/target/release/deps/libyoke-b5cb326284cb506c.rlib" "/candle-vllm/target/release/deps/libzerofrom-72df68927b68a064.rlib" "/candle-vllm/target/release/deps/libstable_deref_trait-76725faa25d9c59b.rlib" "/candle-vllm/target/release/deps/libthiserror-7cc4f2a96da73a94.rlib" "/candle-vllm/target/release/deps/libsafetensors-b94965e86f7ef122.rlib" "/candle-vllm/target/release/deps/libcudarc-bb4cc1d0d1d68ba3.rlib" 
"/candle-vllm/target/release/deps/libcandle_kernels-af06d5fd4a087af6.rlib" "/candle-vllm/target/release/deps/libgemm-9939fb772d1ff792.rlib" "/candle-vllm/target/release/deps/libgemm_c32-cba446e570d4386d.rlib" "/candle-vllm/target/release/deps/libgemm_c64-701b72db790c5491.rlib" "/candle-vllm/target/release/deps/libgemm_f64-132035f8fb79f58d.rlib" "/candle-vllm/target/release/deps/libgemm_f16-a17195123a2b5a97.rlib" "/candle-vllm/target/release/deps/libgemm_f32-43dd1a29089d0d80.rlib" "/candle-vllm/target/release/deps/libgemm_common-888ab4912d03277a.rlib" "/candle-vllm/target/release/deps/libpulp-c51f68967478b6aa.rlib" "/candle-vllm/target/release/deps/libnum_complex-9293d6ad98d7b1c3.rlib" "/candle-vllm/target/release/deps/libdyn_stack-e01f3657ea7d975f.rlib" "/candle-vllm/target/release/deps/libreborrow-77659d577c4b718c.rlib" "/candle-vllm/target/release/deps/libraw_cpuid-b9cfe85e371d3083.rlib" "/candle-vllm/target/release/deps/librayon-7e6c7f8c76536947.rlib" "/candle-vllm/target/release/deps/librayon_core-2fef7474b3331466.rlib" "/candle-vllm/target/release/deps/libcrossbeam_deque-f3876680669c2c7d.rlib" "/candle-vllm/target/release/deps/libcrossbeam_epoch-d5f20c1ae49163b7.rlib" "/candle-vllm/target/release/deps/libmemoffset-b4fab92a5d1a5e30.rlib" "/candle-vllm/target/release/deps/libcrossbeam_utils-1d67d2d362ef675e.rlib" "/candle-vllm/target/release/deps/libeither-c016b57e73ba30c1.rlib" "/candle-vllm/target/release/deps/libbyteorder-8bf78fc69cf5b0a1.rlib" "/candle-vllm/target/release/deps/libhalf-82866db1aa6c7f3e.rlib" "/candle-vllm/target/release/deps/librand_distr-b111214f51586c69.rlib" "/candle-vllm/target/release/deps/libnum_traits-28ee9b33f1e53f29.rlib" "/candle-vllm/target/release/deps/libbytemuck-7eee2fa1f516b4ce.rlib" "/candle-vllm/target/release/deps/libactix_web-0a08fb87679df924.rlib" "/candle-vllm/target/release/deps/liburl-1bbf839f22bd1732.rlib" "/candle-vllm/target/release/deps/libidna-fb425d18121613f1.rlib" 
"/candle-vllm/target/release/deps/libunicode_normalization-7972d0be1c38ac31.rlib" "/candle-vllm/target/release/deps/libtinyvec-61debd23e06e16bf.rlib" "/candle-vllm/target/release/deps/libtinyvec_macros-f326b6a6f0ca8a7b.rlib" "/candle-vllm/target/release/deps/libunicode_bidi-9dc6f963fdeb5a21.rlib" "/candle-vllm/target/release/deps/libserde_urlencoded-9f88ee3d21b5ec1b.rlib" "/candle-vllm/target/release/deps/libform_urlencoded-3e169fc285508f2a.rlib" "/candle-vllm/target/release/deps/libserde_json-2daaa0f082f50c3a.rlib" "/candle-vllm/target/release/deps/libryu-8b05c69dcf279a6f.rlib" "/candle-vllm/target/release/deps/libactix_server-e79c728840296968.rlib" "/candle-vllm/target/release/deps/libactix_router-48a733d95bd3dd5e.rlib" "/candle-vllm/target/release/deps/libregex-c78c6a0d40f8f119.rlib" "/candle-vllm/target/release/deps/libregex_automata-3822bb291a95f096.rlib" "/candle-vllm/target/release/deps/libaho_corasick-6f9c3d032c4f562f.rlib" "/candle-vllm/target/release/deps/libregex_syntax-3dd804a409b2c545.rlib" "/candle-vllm/target/release/deps/libserde-23513cb3b07422f8.rlib" "/candle-vllm/target/release/deps/libcookie-30bd32d9b0d08b83.rlib" "/candle-vllm/target/release/deps/libtime-bc85cd6997494558.rlib" "/candle-vllm/target/release/deps/libtime_core-531fb2a2b6009484.rlib" "/candle-vllm/target/release/deps/libderanged-5409594f6406082d.rlib" "/candle-vllm/target/release/deps/libpowerfmt-c4543fc1903272c6.rlib" "/candle-vllm/target/release/deps/libactix_http-f7b0baf59fd7bb10.rlib" "/candle-vllm/target/release/deps/librand-aa6ddb6627b48b96.rlib" "/candle-vllm/target/release/deps/librand_chacha-fa47a10cc5e59439.rlib" "/candle-vllm/target/release/deps/libppv_lite86-9a645f708eed4e1c.rlib" "/candle-vllm/target/release/deps/librand_core-479671a2b8263665.rlib" "/candle-vllm/target/release/deps/libhttparse-699e93ce2c2e7905.rlib" "/candle-vllm/target/release/deps/libbrotli-df4299509820f939.rlib" "/candle-vllm/target/release/deps/libbrotli_decompressor-0212e4cdb0da1245.rlib" 
"/candle-vllm/target/release/deps/liballoc_stdlib-fc777d5f3c59a235.rlib" "/candle-vllm/target/release/deps/liballoc_no_stdlib-f497a54db348ea9b.rlib" "/candle-vllm/target/release/deps/libhttpdate-5f8e81ac577420b0.rlib" "/candle-vllm/target/release/deps/libsha1-ad6469ba6b8b2240.rlib" "/candle-vllm/target/release/deps/libcpufeatures-dcef25221428931f.rlib" "/candle-vllm/target/release/deps/libdigest-f32a2ccccbd945ab.rlib" "/candle-vllm/target/release/deps/libsubtle-910e19b9d08b2799.rlib" "/candle-vllm/target/release/deps/libblock_buffer-2ad0dde06bca4c37.rlib" "/candle-vllm/target/release/deps/libcrypto_common-30c46997c474a2db.rlib" "/candle-vllm/target/release/deps/libgeneric_array-95ff38f8e6dc2014.rlib" "/candle-vllm/target/release/deps/libtypenum-ddf8574aa94ffabe.rlib" "/candle-vllm/target/release/deps/libbase64-daaf16d87f9b4835.rlib" "/candle-vllm/target/release/deps/liblocal_channel-5501da97fbe12c8a.rlib" "/candle-vllm/target/release/deps/libbytestring-4d1e0f611bab987e.rlib" "/candle-vllm/target/release/deps/libencoding_rs-c048082deb3a71c3.rlib" "/candle-vllm/target/release/deps/liblanguage_tags-e0dfc52f86f9b27a.rlib" "/candle-vllm/target/release/deps/libahash-a28674307e9664ad.rlib" "/candle-vllm/target/release/deps/libgetrandom-b24cab7002c3530b.rlib" "/candle-vllm/target/release/deps/libzerocopy-63825396d720b9a6.rlib" "/candle-vllm/target/release/deps/libmime-04e6f00618993e67.rlib" "/candle-vllm/target/release/deps/libpercent_encoding-d54414372a2980de.rlib" "/candle-vllm/target/release/deps/libh2-27cdaea5e3d2147c.rlib" "/candle-vllm/target/release/deps/libindexmap-fcdde0ade0e1bfe3.rlib" "/candle-vllm/target/release/deps/libequivalent-8a25e166243cfe94.rlib" "/candle-vllm/target/release/deps/libhashbrown-aee95c0614bccf63.rlib" "/candle-vllm/target/release/deps/libfutures_util-98b8b67b3d434750.rlib" "/candle-vllm/target/release/deps/libfutures_io-bbce8973c99e7ece.rlib" "/candle-vllm/target/release/deps/libslab-490ef311b9a84e0e.rlib" 
"/candle-vllm/target/release/deps/libfutures_channel-6d294bf595dec06a.rlib" "/candle-vllm/target/release/deps/libfutures_task-0a7c23a0933dbcaa.rlib" "/candle-vllm/target/release/deps/libpin_utils-185c55cbe9ca2fff.rlib" "/candle-vllm/target/release/deps/libbitflags-1029aec9c38cde73.rlib" "/candle-vllm/target/release/deps/libzstd-242538c7759a4fa6.rlib" "/candle-vllm/target/release/deps/libzstd_safe-d25e92a1d04503ec.rlib" "/candle-vllm/target/release/deps/libzstd_sys-a6ec9cf883e86b56.rlib" "/candle-vllm/target/release/deps/libflate2-b67596bfbb64de8d.rlib" "/candle-vllm/target/release/deps/libminiz_oxide-2b969af90226827f.rlib" "/candle-vllm/target/release/deps/libsimd_adler32-d1dbd8e6b06bf162.rlib" "/candle-vllm/target/release/deps/libcrc32fast-ceb628e76fc0bab0.rlib" "/candle-vllm/target/release/deps/libactix_service-dfc20131f5ba36d4.rlib" "/candle-vllm/target/release/deps/libactix_codec-f3cae536aed1196d.rlib" "/candle-vllm/target/release/deps/libtokio_util-88b2eabf4483c1ed.rlib" "/candle-vllm/target/release/deps/libtracing-9e7a6177765350ac.rlib" "/candle-vllm/target/release/deps/libtracing_core-c5e9157560beafe6.rlib" "/candle-vllm/target/release/deps/libonce_cell-4b31816a5aa6274f.rlib" "/candle-vllm/target/release/deps/libmemchr-38d4fc2a3522aa15.rlib" "/candle-vllm/target/release/deps/libfutures_sink-78114cacf22202c2.rlib" "/candle-vllm/target/release/deps/libbitflags-b9815c55ec510696.rlib" "/candle-vllm/target/release/deps/libactix_utils-ec862be5af373362.rlib" "/candle-vllm/target/release/deps/liblocal_waker-7857496d2dec9a57.rlib" "/candle-vllm/target/release/deps/libactix_rt-0ffc3a15823d1322.rlib" "/candle-vllm/target/release/deps/libtokio-b67279acab90ede3.rlib" "/candle-vllm/target/release/deps/libsignal_hook_registry-a773ced30481d3cb.rlib" "/candle-vllm/target/release/deps/libnum_cpus-fbaf57124b2a0166.rlib" "/candle-vllm/target/release/deps/libsocket2-8e37cfa1c7015c6b.rlib" "/candle-vllm/target/release/deps/libmio-81de974463968f98.rlib" 
"/candle-vllm/target/release/deps/liblog-35f97248cb2ec82c.rlib" "/candle-vllm/target/release/deps/libparking_lot-e183fcd4a13bd183.rlib" "/candle-vllm/target/release/deps/libparking_lot_core-5fbb54b30e35e540.rlib" "/candle-vllm/target/release/deps/liblibc-d38dc52f94735460.rlib" "/candle-vllm/target/release/deps/libcfg_if-88c619515d65e3f1.rlib" "/candle-vllm/target/release/deps/libsmallvec-e35ec471a6514672.rlib" "/candle-vllm/target/release/deps/liblock_api-920512de5989abb2.rlib" "/candle-vllm/target/release/deps/libscopeguard-6208b4062bcdc2b1.rlib" "/candle-vllm/target/release/deps/libpin_project_lite-42a553ee08f02ebb.rlib" "/candle-vllm/target/release/deps/libfutures_core-b87582f06d7f1343.rlib" "/candle-vllm/target/release/deps/libhttp-b738399ec4ab1c60.rlib" "/candle-vllm/target/release/deps/libitoa-dcbca83b54db3306.rlib" "/candle-vllm/target/release/deps/libbytes-8c2bf1b211f72910.rlib" "/candle-vllm/target/release/deps/libfnv-ffe196e20ea2a648.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-9c342d6596ca77d8.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-35e6faa0abf08dd1.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-6242b5524a2684de.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-94511439d510df36.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-1923a594ddedab24.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-5b476927cd520d76.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-6b4664d28b4dc07b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-4d7e14ee42b44abc.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-94e04d08d317eb2b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-7e3a1db27b23a8ee.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-0651af3c34a1e4b9.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-e5da8ecb95d2de36.rlib" 
"/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-052b86aa844a2857.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-bbd2a157557b773d.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f47279717d0e1831.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-d30e243a979711ec.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-18929aabe36e3f57.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-f9f41fbdedfbfafb.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-b26982894e484f03.rlib" "-Wl,-Bdynamic" "-lssl" "-lcrypto" "-lflashattention" "-lcudart" "-lstdc++" "-lstdc++" "-lcuda" "-lnccl" "-lnvrtc" "-lcurand" "-lcublas" "-lcublasLt" "-lcudnn" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs"
  = note: /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_api.o): relocation R_X86_64_32 against `.nvFatBinSegment' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim128_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi128ELi128ELi64ELi4ES2_EELb0ELb0ELb1ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
etc...

I will try with cargo update again but need to go so I will let you know tomorrow probably. Thanks!!!
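For context on the two ld notes above: they say that objects inside libflashattention.a were compiled as non-position-independent code, while the final link is done with -pie. One direction that is often suggested for this class of error, shown here only as an untested sketch and not as what candle-flash-attn's build script actually does, is to have nvcc forward -fPIC to the host compiler. The helper name, its parameters and the file names are hypothetical; only the nvcc and gcc flags themselves are real.

```rust
/// Hypothetical helper: builds the argument list for compiling one .cu file
/// into an object file with nvcc. Not taken from any real build script.
pub fn nvcc_object_args(compute_cap: u32, src: &str, out: &str, pic: bool) -> Vec<String> {
    let mut args = vec![
        // Same architecture flag style as seen in nvcc invocations, e.g. sm_86.
        format!("--gpu-architecture=sm_{}", compute_cap),
        "-c".to_string(),
        src.to_string(),
        "-o".to_string(),
        out.to_string(),
    ];
    if pic {
        // -Xcompiler forwards the next option to the host compiler, so the
        // generated host-side code is built as position-independent code.
        args.extend(["-Xcompiler".to_string(), "-fPIC".to_string()]);
    }
    args
}
```

A build script could pass such a list to `std::process::Command::new("nvcc").args(...)`. Whether this resolves the failure on Rocky Linux is an assumption that would need to be checked against the actual candle-flash-attn build.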

ivanbaldo commented 7 months ago

Strangely, with cargo update and the latest git from this repo, it failed again in the linking phase. Some extracts from the new Cargo.lock:

name = "candle-flash-attn"
version = "0.3.3"
source = "git+https://github.com/huggingface/candle.git#9e824ec810fbe490f21b7404058b6cb47d24c6cf"

name = "candle-lora"
version = "0.2.0"
source = "git+https://github.com/EricLBuehler/candle-lora.git#bb518c14dc15e322288f64fb2158e44f49cc3369"

ivanbaldo commented 7 months ago

Using FROM nvidia/cuda:12.3.1-devel-rockylinux8 (Rocky 8 instead of 9) also failed in the same way at link time.

EricLBuehler commented 7 months ago

Ok, this seems like a general linking problem. I'll try to reproduce it tonight, as I plan on working on the CUDA kernels.

ivanbaldo commented 7 months ago

Based on your VSCode build container Dockerfile in https://github.com/EricLBuehler/candle-vllm/blob/master/Dockerfile I ran exactly these commands and it worked:

docker run --rm -it pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel bash -i
apt-get update
apt-get install -y \
    build-essential \
    git \
    curl \
    openssl \
    libssl-dev \
    pkg-config \
    wget
curl https://sh.rustup.rs -sSf | bash -s -- -y && \
    echo 'source $HOME/.cargo/env' >> $HOME/.bashrc && \
    source $HOME/.bashrc
git clone https://github.com/EricLBuehler/candle-vllm
cd candle-vllm
export CUDA_COMPUTE_CAP=86
cargo build --release --features cuda,cudnn,flash-attn,nccl

That tag is old, based on Ubuntu 20.04.6 and the older PyTorch 2.1.2. I will try with pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel and see, thanks!

EricLBuehler commented 7 months ago

Apologies for the confusion; candle-vllm is not meant to be built with PyTorch, as we use Candle. It should not require a Dockerfile at all. Instead, please follow the README instructions.

ivanbaldo commented 7 months ago

I know, but I used the devcontainer in your repo as a base and it worked, so I guess I can try different (probably older) versions, etc. and make it work in a smaller container without PyTorch.

I tried to run it with:

HF_TOKEN=xxxx target/release/candle-vllm --hf-token HF_TOKEN --port 8080 llama7b --repeat-last-n 4096

but it failed with:

Error: APIError { data: "request error: https://huggingface.co/meta-llama/Llama-27b-chat-hf/resolve/main/tokenizer.json: status code 404" }

A possible alternative could be: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/raw/main/tokenizer.json

But maybe this is already fixed by a cargo update, which I forgot to run in this case. If it is already fixed, then maybe an updated Cargo.lock should be committed so that running cargo update is not needed? Thanks for all your help Eric!!!
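Comparing the two URLs above, the failing one uses the repo id meta-llama/Llama-27b-chat-hf, while the suggested one has a hyphen between "2" and "7b". A small sketch of how such a URL could be assembled so that the hyphen cannot be dropped; both helper names are hypothetical and this is not candle-vllm's actual code. The resolve/main path shape is copied from the error message, and whether it succeeds for this repo was not verified here.

```rust
/// Hypothetical helper: repo id for a Llama 2 chat model of the given size,
/// e.g. "7b" -> "meta-llama/Llama-2-7b-chat-hf".
pub fn llama2_chat_repo(size: &str) -> String {
    format!("meta-llama/Llama-2-{}-chat-hf", size)
}

/// Hypothetical helper mirroring the URL shape seen in the 404 error.
pub fn tokenizer_url(repo_id: &str) -> String {
    format!(
        "https://huggingface.co/{}/resolve/main/tokenizer.json",
        repo_id
    )
}
```

Keeping the size as a separate argument means the "Llama-2-" prefix and its trailing hyphen live in exactly one place.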

ivanbaldo commented 7 months ago

After adding dnf upgrade it fails differently. This is the current version:

FROM nvidia/cuda:12.3.1-devel-rockylinux9
ARG USERID=1000
ARG CUDA_COMPUTE_CAP
RUN dnf upgrade -y && dnf clean all && rm -rf /var/cache/dnf/*
RUN dnf install -y cargo libcudnn8-devel openssl-devel git && dnf clean all && \
    rm -rf /var/cache/dnf/*
RUN git clone https://github.com/EricLBuehler/candle-vllm
WORKDIR /candle-vllm
RUN cargo update
#RUN cargo build --release --features cuda,cudnn,flash-attn,nccl
RUN cargo install --path . --features cuda,cudnn,flash-attn,nccl
RUN adduser -u $USERID user
USER user
ENTRYPOINT ["/candle-vllm/target/release/candle-vllm"]
CMD ["--hf-token", "HF_TOKEN", "--port", "8080", "llama7b", "--repeat-last-n", "4096"]

It can be built with:

docker build --build-arg USERID=$(id -u) --build-arg \
   CUDA_COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv | tail -n1 | tr -d .) \
   -t local/candle-vllm .

So the current failure is this:

error[E0412]: cannot find type `Tensor` in this scope
  --> src/backend/mod.rs:61:38
   |
61 | fn dispatch_get_cuda_pointer(tensor: Tensor) -> u64 {
   |                                      ^^^^^^ not found in this scope
   |
help: consider importing this struct
   |
80 + use candle_core::Tensor;
   |

error[E0412]: cannot find type `bf16` in this scope
  --> src/backend/mod.rs:63:43
   |
63 |         DType::BF16 => get_cuda_pointer::<bf16>(tensor),
   |                                           ^^^^ not found in this scope
   |
help: consider importing this struct
   |
80 + use half::bf16;
   |

error[E0412]: cannot find type `f16` in this scope
  --> src/backend/mod.rs:64:42
   |
64 |         DType::F16 => get_cuda_pointer::<f16>(tensor),
   |                                          ^^^
   |
help: a builtin type with a similar name exists
   |
64 |         DType::F16 => get_cuda_pointer::<i16>(tensor),
   |                                          ~~~
help: consider importing this struct
   |
80 + use half::f16;
   |

error[E0405]: cannot find trait `CudaDType` in this scope
  --> src/backend/mod.rs:73:24
   |
73 | fn get_cuda_pointer<T: CudaDType>(tensor: Tensor) -> u64 {
   |                        ^^^^^^^^^ not found in this scope
   |
help: consider importing this trait
   |
80 + use candle_core::cuda_backend::CudaDType;
   |

error[E0412]: cannot find type `Tensor` in this scope
  --> src/backend/mod.rs:73:43
   |
73 | fn get_cuda_pointer<T: CudaDType>(tensor: Tensor) -> u64 {
   |                                           ^^^^^^ not found in this scope
   |
help: consider importing this struct
   |
80 + use candle_core::Tensor;
   |

error[E0433]: failed to resolve: use of undeclared type `Storage`
  --> src/backend/mod.rs:75:9
   |
75 |         Storage::Cuda(cuda_storage) => *cuda_storage.as_cuda_slice::<T>().unwrap().device_ptr(),
   |         ^^^^^^^ use of undeclared type `Storage`
   |
help: consider importing this enum
   |
80 + use candle_core::Storage;
   |

warning: unused imports: `bf16`, `f16`
  --> src/backend/cache.rs:10:12
   |
10 | use half::{bf16, f16};
   |            ^^^^  ^^^
   |
   = note: `#[warn(unused_imports)]` on by default

Some errors have detailed explanations: E0405, E0412, E0433.
For more information about an error, try `rustc --explain E0405`.
warning: `candle-vllm` (lib) generated 1 warning
error: could not compile `candle-vllm` (lib) due to 6 previous errors; 1 warning emitted
warning: build failed, waiting for other jobs to finish...
error: failed to compile `candle-vllm v0.1.0 (/candle-vllm)`, intermediate artifacts can be found at `/candle-vllm/target`

Instead of using rustc 1.71.1 from Rocky 9, I will try installing the latest version:

curl https://sh.rustup.rs -sSf | bash -s -- -y && \
    echo 'source $HOME/.cargo/env' >> $HOME/.bashrc && \
    source $HOME/.bashrc

Have a good weekend!

EricLBuehler commented 7 months ago

I just pushed a commit which should fix this. Could you try building again?

ivanbaldo commented 7 months ago

Hello Eric. Thanks for the updates and sorry for the delay. Unfortunately it still doesn't compile, this time because of a different error:

error: failed to run custom build command for `candle-vllm v0.1.0 (/candle-vllm)`

Caused by:
  process didn't exit successfully: `/candle-vllm/target/release/build/candle-vllm-2ea58adaa28146b1/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP
  cargo:rustc-env=CUDA_COMPUTE_CAP=86

  --- stderr
  kernels/rotary_embedding_kernel.cu(81): error: identifier "scalar_t" is undefined
      scalar_t* __restrict__ query,
      ^

  kernels/rotary_embedding_kernel.cu(82): error: identifier "scalar_t" is undefined
      scalar_t* __restrict__ key,
      ^

  kernels/rotary_embedding_kernel.cu(83): error: identifier "scalar_t" is undefined
      const scalar_t* __restrict__ cos_sin_cache,
            ^

  kernels/rotary_embedding_kernel.cu(90): error: identifier "scalar_t" is undefined
      rotary_embedding_kernel<scalar_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
                              ^

  kernels/rotary_embedding_kernel.cu(95): error: identifier "scalar_t" is undefined
      scalar_t* __restrict__ query,
      ^

  kernels/rotary_embedding_kernel.cu(96): error: identifier "scalar_t" is undefined
      scalar_t* __restrict__ key,
      ^

  kernels/rotary_embedding_kernel.cu(97): error: identifier "scalar_t" is undefined
      const scalar_t* __restrict__ cos_sin_cache,
            ^

  kernels/rotary_embedding_kernel.cu(104): error: identifier "scalar_t" is undefined
      rotary_embedding_kernel<scalar_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
                              ^

  8 errors detected in the compilation of "kernels/rotary_embedding_kernel.cu".
  thread 'main' panicked at '"nvcc" "--gpu-architecture=sm_86" "--ptx" "--use_fast_math" "-std=c++17" "-O" "2" "--default-stream" "per-thread" "--output-directory" "kernels/" "kernels/rotary_embedding_kernel.cu" failed with exit code exit status: 1', build.rs:65:13
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Let me know if further tests or variations could be useful, thanks!

EricLBuehler commented 7 months ago

This error occurs because we need to manually monomorphize the kernel function. I have since pushed the changes. Could you try it again?
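
As a rough illustration of what "manually monomorphize" means, here is a minimal Rust analogy. It is not the actual fix, which lives in the CUDA source. A generic routine is given one concrete, separately named entry point per element type. The names `rotate_pair`, `instantiate_rotate`, `rotate_pair_f32` and `rotate_pair_f64` are hypothetical and exist only for this sketch.

```rust
use std::ops::{Add, Mul, Sub};

// Generic routine: by itself it has no single concrete symbol that a
// loader could look up by name.
fn rotate_pair<T>(x: T, y: T, cos: T, sin: T) -> (T, T)
where
    T: Copy + Mul<Output = T> + Sub<Output = T> + Add<Output = T>,
{
    (x * cos - y * sin, y * cos + x * sin)
}

// Hypothetical macro: stamps out one non-generic wrapper per element type,
// which is the "manual monomorphization" step.
macro_rules! instantiate_rotate {
    ($name:ident, $ty:ty) => {
        fn $name(x: $ty, y: $ty, cos: $ty, sin: $ty) -> ($ty, $ty) {
            rotate_pair::<$ty>(x, y, cos, sin)
        }
    };
}

// One concrete entry point per supported type.
instantiate_rotate!(rotate_pair_f32, f32);
instantiate_rotate!(rotate_pair_f64, f64);
```

Each wrapper is an ordinary function with a fixed signature, so a caller can select it by name without knowing about the generic parameter.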

ivanbaldo commented 7 months ago

Thanks! I just did git pull and ran cargo install ... again, and it failed with these errors now:

error: failed to run custom build command for `candle-vllm v0.1.0 (/candle-vllm)`

Caused by:
  process didn't exit successfully: `/candle-vllm/target/release/build/candle-vllm-2ea58adaa28146b1/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP
  cargo:rustc-env=CUDA_COMPUTE_CAP=86

  --- stderr
  kernels/rotary_embedding_kernel.cu(91): error: a __global__ function call must be configured
      rotary_embedding_kernel<uint8_t, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(105): error: a __global__ function call must be configured
      rotary_embedding_kernel<uint32_t, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(119): error: a __global__ function call must be configured
      rotary_embedding_kernel<int64_t, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(133): error: a __global__ function call must be configured
      rotary_embedding_kernel<float, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(147): error: a __global__ function call must be configured
      rotary_embedding_kernel<double, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(161): error: a __global__ function call must be configured
      rotary_embedding_kernel<int16_t, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(175): error: a __global__ function call must be configured
      rotary_embedding_kernel<int16_t, false>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(190): error: a __global__ function call must be configured
      rotary_embedding_kernel<uint8_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(204): error: a __global__ function call must be configured
      rotary_embedding_kernel<uint32_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(218): error: a __global__ function call must be configured
      rotary_embedding_kernel<int64_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(232): error: a __global__ function call must be configured
      rotary_embedding_kernel<float, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(246): error: a __global__ function call must be configured
      rotary_embedding_kernel<double, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(260): error: a __global__ function call must be configured
      rotary_embedding_kernel<int16_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  kernels/rotary_embedding_kernel.cu(274): error: a __global__ function call must be configured
      rotary_embedding_kernel<int16_t, true>(positions, query, key, cos_sin_cache, rot_dim, query_stride, key_stride, num_heads, num_kv_heads, head_size);
      ^

  14 errors detected in the compilation of "kernels/rotary_embedding_kernel.cu".
  thread 'main' panicked at '"nvcc" "--gpu-architecture=sm_86" "--ptx" "--use_fast_math" "-std=c++17" "-O" "2" "--default-stream" "per-thread" "--output-directory" "kernels/" "kernels/rotary_embedding_kernel.cu" failed with exit code exit status: 1', build.rs:65:13

EricLBuehler commented 7 months ago

Ok, I just pushed a change to hopefully fix that. Could you try it again?

ivanbaldo commented 7 months ago

Thanks, we made progress! But later it failed with this:

   Compiling candle-vllm v0.1.0 (/candle-vllm)
error[E0433]: failed to resolve: use of undeclared type `Device`
  --> src/backend/layers.rs:20:9
   |
20 |     let Device::Cuda(dev) = positions_dev else {
   |         ^^^^^^ use of undeclared type `Device`
   |
help: consider importing this enum
   |
1  + use candle_core::Device;
   |

error[E0433]: failed to resolve: use of undeclared type `DType`
  --> src/backend/layers.rs:24:29
   |
24 |     if positions.dtype() != DType::I64 {
   |                             ^^^^^ use of undeclared type `DType`
   |
help: consider importing one of these items
   |
1  + use candle_core::DType;
   |
1  + use crate::backend::DType;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/backend/layers.rs:25:20
   |
25 |         return Err(APIError::new(format!(
   |                    ^^^^^^^^ use of undeclared type `APIError`
   |
help: consider importing this struct
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/backend/layers.rs:32:20
   |
32 |         return Err(APIError::new(format!(
   |                    ^^^^^^^^ use of undeclared type `APIError`
   |
help: consider importing this struct
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/backend/layers.rs:40:20
   |
40 |         return Err(APIError::new(format!(
   |                    ^^^^^^^^ use of undeclared type `APIError`
   |
help: consider importing this struct
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/backend/layers.rs:48:20
   |
48 |         return Err(APIError::new(format!(
   |                    ^^^^^^^^ use of undeclared type `APIError`
   |
help: consider importing this struct
   |
1  + use crate::openai::responses::APIError;
   |

error[E0422]: cannot find struct, variant or union type `LaunchConfig` in this scope
  --> src/backend/layers.rs:62:23
   |
62 |     let launch_conf = LaunchConfig {
   |                       ^^^^^^^^^^^^ not found in this scope
   |
help: consider importing this struct
   |
1  + use cudarc::driver::LaunchConfig;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/openai/responses.rs:38:28
   |
38 |                 return Err(APIError::from(e));
   |                            ^^^^^^^^ use of undeclared type `APIError`
   |
  ::: src/backend/layers.rs:77:18
   |
77 |     let stream = try_api!(dev.fork_default_stream());
   |                  ----------------------------------- in this macro invocation
   |
   = note: this error originates in the macro `try_api` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider importing this struct
  --> src/backend/layers.rs:1:1
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/openai/responses.rs:38:28
   |
38 |                   return Err(APIError::from(e));
   |                              ^^^^^^^^ use of undeclared type `APIError`
   |
  ::: src/backend/layers.rs:80:9
   |
80 | /         try_api!(get_or_load_func(
81 | |             ROTARY_EMBDEDDING_PTX,
82 | |             ROTARY_EMBDEDDING_KERNEL,
83 | |             query.dtype(),
84 | |             Some("_neox"),
85 | |             dev
86 | |         ))
   | |__________- in this macro invocation
   |
   = note: this error originates in the macro `try_api` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider importing this struct
  --> src/backend/layers.rs:1:1
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
  --> src/openai/responses.rs:38:28
   |
38 |                   return Err(APIError::from(e));
   |                              ^^^^^^^^ use of undeclared type `APIError`
   |
  ::: src/backend/layers.rs:88:9
   |
88 | /         try_api!(get_or_load_func(
89 | |             ROTARY_EMBDEDDING_PTX,
90 | |             ROTARY_EMBDEDDING_KERNEL,
91 | |             query.dtype(),
92 | |             None,
93 | |             dev
94 | |         ))
   | |__________- in this macro invocation
   |
   = note: this error originates in the macro `try_api` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider importing this struct
  --> src/backend/layers.rs:1:1
   |
1  + use crate::openai::responses::APIError;
   |

error[E0433]: failed to resolve: use of undeclared type `APIError`
   --> src/openai/responses.rs:38:28
    |
38  |                   return Err(APIError::from(e));
    |                              ^^^^^^^^ use of undeclared type `APIError`
    |
   ::: src/backend/layers.rs:97:5
    |
97  | /     try_api!(unsafe {
98  | |         kernel.launch_on_stream(
99  | |             &stream,
100 | |             launch_conf,
...   |
113 | |         )
114 | |     });
    | |______- in this macro invocation
    |
    = note: this error originates in the macro `try_api` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider importing this struct
   --> src/backend/layers.rs:1:1
    |
1   + use crate::openai::responses::APIError;
    |

warning: unused import: `either::Either`
  --> src/backend/cache.rs:10:5
   |
10 | use either::Either;
   |     ^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused imports: `bf16`, `f16`
  --> src/backend/cache.rs:11:12
   |
11 | use half::{bf16, f16};
   |            ^^^^  ^^^

warning: unused import: `either::Either`
 --> src/backend/layers.rs:2:5
  |
2 | use either::Either;
  |     ^^^^^^^^^^^^^^

error[E0308]: arguments to this function are incorrect
   --> src/backend/cache.rs:118:27
    |
118 |     let kernel = try_api!(get_or_load_func(
    |                           ^^^^^^^^^^^^^^^^
...
121 |         None,
    |         ---- expected `DType`, found `std::option::Option<_>`
122 |         key.dtype(),
    |         ----------- expected `std::option::Option<&str>`, found `DType`
    |
note: function defined here
   --> src/backend/mod.rs:17:8
    |
17  | pub fn get_or_load_func(
    |        ^^^^^^^^^^^^^^^^
18  |     ptx_file: &'static str,
    |     ----------------------
19  |     kernel_base: &str,
    |     -----------------
20  |     dtype: DType,
    |     ------------
21  |     suffix: Option<&str>,
    |     --------------------
22  |     device: &CudaDevice,
    |     -------------------
help: swap these arguments
    |
118 |     let kernel = try_api!(get_or_load_func(RESHAPE_AND_CACHE_PTX, RESHAPE_AND_CACHE_KERNEL, key.dtype(), None, dev));
    |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

error[E0308]: arguments to this function are incorrect
   --> src/backend/cache.rs:250:27
    |
250 |     let kernel = try_api!(get_or_load_func(
    |                           ^^^^^^^^^^^^^^^^
...
253 |         None,
    |         ---- expected `DType`, found `std::option::Option<_>`
254 |         key_caches.first().unwrap().dtype(),
    |         ----------------------------------- expected `std::option::Option<&str>`, found `DType`
    |
note: function defined here
   --> src/backend/mod.rs:17:8
    |
17  | pub fn get_or_load_func(
    |        ^^^^^^^^^^^^^^^^
18  |     ptx_file: &'static str,
    |     ----------------------
19  |     kernel_base: &str,
    |     -----------------
20  |     dtype: DType,
    |     ------------
21  |     suffix: Option<&str>,
    |     --------------------
22  |     device: &CudaDevice,
    |     -------------------
help: swap these arguments
    |
250 |     let kernel = try_api!(get_or_load_func(COPY_BLOCKS_PTX, COPY_BLOCKS_KERNEL, key_caches.first().unwrap().dtype(), None, dev));
    |                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

error[E0308]: mismatched types
  --> src/backend/layers.rs:73:45
   |
73 |     let key_ptr = dispatch_get_cuda_pointer(key);
   |                   ------------------------- ^^^ expected `Tensor`, found `&mut Tensor`
   |                   |
   |                   arguments to this function are incorrect
   |
note: function defined here
  --> src/backend/mod.rs:70:4
   |
70 | fn dispatch_get_cuda_pointer(tensor: Tensor) -> u64 {
   |    ^^^^^^^^^^^^^^^^^^^^^^^^^ --------------

error[E0308]: mismatched types
  --> src/backend/layers.rs:74:47
   |
74 |     let query_ptr = dispatch_get_cuda_pointer(query);
   |                     ------------------------- ^^^^^ expected `Tensor`, found `&mut Tensor`
   |                     |
   |                     arguments to this function are incorrect
   |
note: function defined here
  --> src/backend/mod.rs:70:4
   |
70 | fn dispatch_get_cuda_pointer(tensor: Tensor) -> u64 {
   |    ^^^^^^^^^^^^^^^^^^^^^^^^^ --------------

error[E0599]: no method named `launch_on_stream` found for struct `CudaFunction` in the current scope
   --> src/backend/layers.rs:98:16
    |
98  |         kernel.launch_on_stream(
    |         -------^^^^^^^^^^^^^^^^ method not found in `CudaFunction`
    |
   ::: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.10.0/src/driver/safe/launch.rs:202:15
    |
202 |     unsafe fn launch_on_stream(
    |               ---------------- the method is available for `CudaFunction` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
1   + use candle_core::cuda_backend::cudarc::driver::LaunchAsync;
    |

error[E0308]: mismatched types
  --> src/backend/paged_attention.rs:15:6
   |
4  | pub fn paged_attention_v1(
   |        ------------------ implicitly returns `()` as its body has no tail or `return` expression
...
15 | ) -> Tensor {
   |      ^^^^^^ expected `Tensor`, found `()`

error[E0308]: mismatched types
  --> src/backend/mod.rs:25:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
25 |         Either::Left(DType::U8) => "_u8",
   |         ^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:26:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
25 |         Either::Left(DType::U8) => "_u8",
26 |         Either::Left(DType::U32) => "_u32",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:27:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
27 |         Either::Left(DType::I64) => "_i64",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:28:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
28 |         Either::Left(DType::BF16) => "_bf16",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:29:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
29 |         Either::Left(DType::F16) => "_f16",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:30:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
30 |         Either::Left(DType::F32) => "_f32",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:31:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
31 |         Either::Left(DType::F64) => "_f64",
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0308]: mismatched types
  --> src/backend/mod.rs:32:9
   |
24 |     let mut spec = match dtype {
   |                          ----- this expression has type `DType`
...
32 |         Either::Right(data) => data,
   |         ^^^^^^^^^^^^^^^^^^^ expected `DType`, found `Either<_, _>`
   |
   = note: expected enum `DType`
              found enum `either::Either<_, _>`

error[E0369]: cannot add `&str` to `&str`
  --> src/backend/mod.rs:35:21
   |
35 |         spec = spec + suffix;
   |                ---- ^ ------ &str
   |                |    |
   |                |    `+` cannot be used to concatenate two `&str` strings
   |                &str
   |
   = note: string concatenation requires an owned `String` on the left
help: create an owned `String` from a string reference
   |
35 |         spec = spec.to_owned() + suffix;
   |                    +++++++++++

error[E0599]: no method named `device_ptr` found for reference `&CudaSlice<T>` in the current scope
  --> src/backend/mod.rs:84:84
   |
84 |         Storage::Cuda(cuda_storage) => *cuda_storage.as_cuda_slice::<T>().unwrap().device_ptr(),
   |                                                                                    ^^^^^^^^^^
   |
   = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
   |
1  + use candle_core::cuda_backend::cudarc::driver::DevicePtr;
   |
help: there is a method with a similar name
   |
84 |         Storage::Cuda(cuda_storage) => *cuda_storage.as_cuda_slice::<T>().unwrap().device(),
   |                                                                                    ~~~~~~

error[E0308]: mismatched types
  --> src/paged_attention/mod.rs:92:17
   |
88 |             paged_attention_v1(
   |             ------------------ arguments to this function are incorrect
...
92 |                 self.num_key_value_heads,
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^ expected `i32`, found `usize`
   |
note: function defined here
  --> src/backend/paged_attention.rs:4:8
   |
4  | pub fn paged_attention_v1(
   |        ^^^^^^^^^^^^^^^^^^
...
8  |     num_key_value_heads: i32, // [num_heads]
   |     ------------------------
help: you can convert a `usize` to an `i32` and panic if the converted value doesn't fit
   |
92 |                 self.num_key_value_heads.try_into().unwrap(),
   |                                         ++++++++++++++++++++

error[E0609]: no field `head_mapping` on type `&mut PagedAttention`
   --> src/paged_attention/mod.rs:117:22
    |
117 |                 self.head_mapping.clone(),
    |                      ^^^^^^^^^^^^ unknown field
    |
    = note: available fields are: `num_attention_heads`, `head_dim`, `num_key_value_heads`, `scale`, `sliding_window` ... and 2 others

warning: unused import: `CudaDType`
 --> src/backend/cache.rs:6:9
  |
6 |         CudaDType,
  |         ^^^^^^^^^

Some errors have detailed explanations: E0308, E0369, E0422, E0433, E0599, E0609.
For more information about an error, try `rustc --explain E0308`.
warning: `candle-vllm` (lib) generated 4 warnings
error: could not compile `candle-vllm` (lib) due to 29 previous errors; 4 warnings emitted
error: failed to compile `candle-vllm v0.1.0 (/candle-vllm)`, intermediate artifacts can be found at `/candle-vllm/target`

EricLBuehler commented 7 months ago

Ok, I just pushed (again) what are hopefully some fixes. Could you try it again?
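
For readers following the `get_or_load_func` errors in the log above (the `match` on `Either` and the E0369 `&str + &str` concatenation), here is a minimal sketch of one way to build a kernel name from a dtype. The `DType` enum below is a hypothetical stand-in for `candle_core::DType`, `kernel_name` is a made-up helper, and the order "base, dtype suffix, optional suffix" is an assumption. The code actually pushed may differ.

```rust
// Hypothetical stand-in for `candle_core::DType`, reduced to the variants
// that appear in the log above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DType {
    U8,
    U32,
    I64,
    BF16,
    F16,
    F32,
    F64,
}

// Hypothetical helper: picks the per-dtype kernel name.
fn kernel_name(kernel_base: &str, dtype: DType, suffix: Option<&str>) -> String {
    // Match on `DType` directly; no `Either` wrapper is involved.
    let spec = match dtype {
        DType::U8 => "_u8",
        DType::U32 => "_u32",
        DType::I64 => "_i64",
        DType::BF16 => "_bf16",
        DType::F16 => "_f16",
        DType::F32 => "_f32",
        DType::F64 => "_f64",
    };
    // `&str + &str` does not compile (E0369), so build an owned `String`.
    let mut name = format!("{}{}", kernel_base, spec);
    if let Some(suffix) = suffix {
        name.push_str(suffix);
    }
    name
}
```

Under these assumptions, `kernel_name("rotary_embedding_kernel", DType::F16, Some("_neox"))` yields `rotary_embedding_kernel_f16_neox`.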

ivanbaldo commented 7 months ago

Progress again! These are the errors now:

warning: unused import: `either::Either`
  --> src/backend/cache.rs:10:5
   |
10 | use either::Either;
   |     ^^^^^^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused import: `either::Either`
  --> src/backend/mod.rs:98:5
   |
98 | use either::Either;
   |     ^^^^^^^^^^^^^^

error[E0308]: mismatched types
  --> src/backend/layers.rs:28:16
   |
21 |   ) {
   |     - help: a return type might be missing here: `-> _`
...
28 |           return Err(APIError::new(format!(
   |  ________________^
29 | |             "`positions` has {:?} type, expected I64 type.",
30 | |             positions.dtype()
31 | |         )));
   | |___________^ expected `()`, found `Result<_, APIError>`
   |
   = note: expected unit type `()`
                   found enum `Result<_, APIError>`

error[E0277]: the trait bound `&usize: DeviceRepr` is not satisfied
   --> src/backend/layers.rs:104:13
    |
101 |           kernel.launch_on_stream(
    |                  ---------------- required by a bound introduced by this call
...
104 | /             (
105 | |                 positions_ptr,
106 | |                 query_ptr,
107 | |                 key_ptr,
...   |
114 | |                 head_size,
115 | |             ),
    | |_____________^ the trait `DeviceRepr` is not implemented for `&usize`
    |
    = help: the trait `DeviceRepr` is implemented for `usize`
    = note: required for `CudaFunction` to implement `LaunchAsync<(u64, u64, u64, u64, &usize, &usize, &usize, usize, usize, usize)>`

warning: unused import: `CudaDType`
 --> src/backend/cache.rs:6:9
  |
6 |         CudaDType,
  |         ^^^^^^^^^

Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.
warning: `candle-vllm` (lib) generated 3 warnings
error: could not compile `candle-vllm` (lib) due to 2 previous errors; 3 warnings emitted
error: failed to compile `candle-vllm v0.1.0 (/candle-vllm)`, intermediate artifacts can be found at `/candle-vllm/target`

EricLBuehler commented 7 months ago

Ok, the commit I just pushed should fix that. Could you try it again?

ivanbaldo commented 7 months ago

Down to one error now!!! Good job!!!

error[E0614]: type `usize` cannot be dereferenced
   --> src/backend/layers.rs:114:17
    |
114 |                 *head_size,
    |                 ^^^^^^^^^^

For more information about this error, try `rustc --explain E0614`.

EricLBuehler commented 7 months ago

Thanks, just pushed one more that should iron that out.

ivanbaldo commented 7 months ago

Thanks, new set of errors:

   Compiling candle-vllm v0.1.0 (/candle-vllm)
warning: unused variable: `src_dev`
   --> src/backend/cache.rs:313:23
    |
313 |         (Device::Cuda(src_dev), Device::Cpu) => {
    |                       ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_src_dev`
    |
    = note: `#[warn(unused_variables)]` on by default

error[E0505]: cannot move out of `positions` because it is borrowed
  --> src/backend/layers.rs:75:51
   |
15 |     positions: Tensor,
   |     --------- binding `positions` declared here
...
22 |     let positions_dev = positions.device();
   |                         ------------------ borrow of `positions` occurs here
...
75 |     let positions_ptr = dispatch_get_cuda_pointer(positions);
   |                                                   ^^^^^^^^^ move out of `positions` occurs here
...
80 |     let stream = try_api!(dev.fork_default_stream());
   |                           ------------------------- borrow later used here

error[E0505]: cannot move out of `cos_sin_cache` because it is borrowed
   --> src/backend/layers.rs:78:55
    |
19  |     cos_sin_cache: Tensor,
    |     ------------- binding `cos_sin_cache` declared here
...
59  |     let rot_dim = cos_sin_cache.shape().dims().get(1).unwrap();
    |                   --------------------- borrow of `cos_sin_cache` occurs here
...
78  |     let cos_sin_cache_ptr = dispatch_get_cuda_pointer(cos_sin_cache);
    |                                                       ^^^^^^^^^^^^^ move out of `cos_sin_cache` occurs here
...
109 |                 *rot_dim,
    |                 -------- borrow later used here

warning: variable does not need to be mutable
  --> src/backend/mod.rs:24:9
   |
24 |     let mut spec = match dtype {
   |         ----^^^^
   |         |
   |         help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default

For more information about this error, try `rustc --explain E0505`.
warning: `candle-vllm` (lib) generated 2 warnings
error: could not compile `candle-vllm` (lib) due to 2 previous errors; 2 warnings emitted

EricLBuehler commented 7 months ago

Ok, could you try it again?
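
The two E0505 errors above arise because a tensor is moved into `dispatch_get_cuda_pointer` while an earlier borrow of it is still in use. One common way around this, sketched here with a hypothetical stand-in `Tensor` type (not candle's) and made-up helper names, is to let the helper take `&Tensor`. This is not necessarily the change that was pushed.

```rust
// Hypothetical stand-in for a tensor; candle's real `Tensor` is not used here.
struct Tensor {
    dims: Vec<usize>,
    device_ptr: u64,
}

impl Tensor {
    fn dims(&self) -> &[usize] {
        &self.dims
    }
}

// Taking `&Tensor` instead of `Tensor` leaves ownership with the caller,
// so borrows created earlier remain valid after this call.
fn get_cuda_pointer(tensor: &Tensor) -> u64 {
    tensor.device_ptr
}

// Hypothetical caller mirroring the pattern in the log: borrow the shape,
// fetch the pointer, then use the shape again.
fn launch_args(cos_sin_cache: &Tensor) -> (u64, usize) {
    let dims = cos_sin_cache.dims(); // shared borrow
    let ptr = get_cuda_pointer(cos_sin_cache); // second shared borrow, allowed
    (ptr, dims[1]) // the first borrow is still usable here
}
```

With a by-value parameter, the same body would fail with E0505 because `dims` still borrows from the tensor when it is moved.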

ivanbaldo commented 7 months ago

Progress! Now these:

warning: unused variable: `src_dev`
   --> src/backend/cache.rs:313:23
    |
313 |         (Device::Cuda(src_dev), Device::Cpu) => {
    |                       ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_src_dev`
    |
    = note: `#[warn(unused_variables)]` on by default

error[E0716]: temporary value dropped while borrowed
  --> src/backend/layers.rs:59:19
   |
59 |     let rot_dim = cos_sin_cache.shape().clone().dims().get(1).unwrap();
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                       - temporary value is freed at the end of this statement
   |                   |
   |                   creates a temporary value which is freed while still in use
...
68 |             512.min((num_heads * rot_dim / 2).try_into().unwrap()),
   |                                  ------- borrow later used here
   |
help: consider using a `let` binding to create a longer lived value
   |
59 ~     let binding = cos_sin_cache.shape().clone();
60 ~     let rot_dim = binding.dims().get(1).unwrap();
   |

For more information about this error, try `rustc --explain E0716`.
warning: `candle-vllm` (lib) generated 1 warning
error: could not compile `candle-vllm` (lib) due to previous error; 1 warning emitted

EricLBuehler commented 7 months ago

Ok, could you try it again?
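
The E0716 error above comes from keeping a `&usize` that points into a temporary, which is dropped at the end of the statement. Besides the `let` binding the compiler suggests, another option is to copy the value out within the same statement. The sketch below uses hypothetical helpers (`cloned_shape`, `rot_dim`, `block_threads`) and made-up dimensions. It is an illustration, not the pushed fix.

```rust
// Hypothetical helper returning an owned shape, standing in for
// `cos_sin_cache.shape().clone()` from the log.
fn cloned_shape() -> Vec<usize> {
    vec![4096, 128]
}

fn rot_dim() -> usize {
    // `.get(1)` yields `Option<&usize>` borrowing from the temporary `Vec`.
    // `.copied()` converts it to `Option<usize>` before the statement ends,
    // so nothing refers to the temporary once it is dropped.
    let rot_dim = cloned_shape().get(1).copied().unwrap();
    rot_dim
}

// Mirrors the `512.min(num_heads * rot_dim / 2)` expression in the log,
// with an explicit `usize` literal.
fn block_threads(num_heads: usize) -> usize {
    512_usize.min(num_heads * rot_dim() / 2)
}
```

Because `rot_dim` is now a plain `usize`, it can also be passed to a kernel launch directly, without the `*rot_dim` dereference seen in the earlier logs.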

ivanbaldo commented 7 months ago

That part now compiles, albeit with some warnings. However, the linking issue with candle-flash-attn has reappeared. To recap: this is based on nvidia/cuda:12.3.1-devel-rockylinux9 with dnf upgrade and cargo update. These are the warnings:

   Compiling candle-vllm v0.1.0 (/candle-vllm)
warning: unused variable: `src_dev`
   --> src/backend/cache.rs:313:23
    |
313 |         (Device::Cuda(src_dev), Device::Cpu) => {
    |                       ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_src_dev`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused `Result` that must be used
   --> src/openai/models/llama.rs:196:9
    |
196 | /         rotary_embedding(
197 | |             positions,
198 | |             q,
199 | |             k,
...   |
202 | |             false,
203 | |         );
    | |_________^
    |
    = note: this `Result` may be an `Err` variant, which should be handled
    = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
    |
196 |         let _ = rotary_embedding(
    |         +++++++

warning: `candle-vllm` (lib) generated 2 warnings (run `cargo fix --lib -p candle-vllm` to apply 1 suggestion)

And the linking errors:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "-m64" "/tmp/rustc2owX73/symbols.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.0.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.1.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.10.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.11.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.12.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.13.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.14.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.15.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.2.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.3.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.4.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.5.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.6.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.7.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.8.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.candle_vllm.62ba636aee07a975-cgu.9.rcgu.o" 
"/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d.1kk2vorh9x18px06.rcgu.o" "-Wl,--as-needed" "-L" "/candle-vllm/target/release/deps" "-L" "/candle-vllm/target/release/build/zstd-sys-dbf9b1574083ed21/out" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib" "-L" "/usr/lib64" "-L" "/candle-vllm/target/release/build/ring-0546e065e7062d53/out" "-L" "/candle-vllm/target/release/build/esaxx-rs-be1982d2e341d29d/out" "-L" "/candle-vllm/target/release/build/onig_sys-4ce51a84b783f95f/out" "-L" "/candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/candle-vllm/target/release/deps/libenv_logger-2d0b3827bd002e31.rlib" "/candle-vllm/target/release/deps/libtermcolor-7fd0e5721205d087.rlib" "/candle-vllm/target/release/deps/libis_terminal-89ef26f9c10f9212.rlib" "/candle-vllm/target/release/deps/librustix-d43875b6cd8dec3a.rlib" "/candle-vllm/target/release/deps/liblinux_raw_sys-872daf025077fc20.rlib" "/candle-vllm/target/release/deps/libhumantime-1dc284c82c7f0559.rlib" "/candle-vllm/target/release/deps/libcandle_vllm-96f5dc2ed1fb7068.rlib" "/candle-vllm/target/release/deps/libchrono-3cac20379a81063e.rlib" "/candle-vllm/target/release/deps/libiana_time_zone-41f802048b72bf7b.rlib" "/candle-vllm/target/release/deps/libhf_hub-67a984a3bd095729.rlib" "/candle-vllm/target/release/deps/libdirs-9b2222ab35d3810d.rlib" "/candle-vllm/target/release/deps/libdirs_sys-093500c3463c7fa8.rlib" "/candle-vllm/target/release/deps/liboption_ext-3db96de540040126.rlib" 
"/candle-vllm/target/release/deps/libureq-a4b308eb4114e2ad.rlib" "/candle-vllm/target/release/deps/libwebpki_roots-ea5e9bb57328911f.rlib" "/candle-vllm/target/release/deps/librustls-bc255bbda3c5502d.rlib" "/candle-vllm/target/release/deps/libsubtle-910e19b9d08b2799.rlib" "/candle-vllm/target/release/deps/libwebpki-130c21a29ac4d059.rlib" "/candle-vllm/target/release/deps/libring-810800fbfa7b1a3d.rlib" "/candle-vllm/target/release/deps/libspin-a5bca8ced7fc453c.rlib" "/candle-vllm/target/release/deps/libuntrusted-766afbb3ef44c1d1.rlib" "/candle-vllm/target/release/deps/libzeroize-77a7da17bcae1046.rlib" "/candle-vllm/target/release/deps/librustls_pki_types-c6df937ba04d34cb.rlib" "/candle-vllm/target/release/deps/libhootbin-5825222cc042ea1c.rlib" "/candle-vllm/target/release/deps/libfastrand-fd92473d790916bb.rlib" "/candle-vllm/target/release/deps/libhoot-7a0d75c1644cd85e.rlib" "/candle-vllm/target/release/deps/libreqwest-418640b55678fdff.rlib" "/candle-vllm/target/release/deps/librustls_pemfile-eb847ff6658d990b.rlib" "/candle-vllm/target/release/deps/libhyper_tls-761834c86e0dfb0e.rlib" "/candle-vllm/target/release/deps/libipnet-ff268618cdf09b54.rlib" "/candle-vllm/target/release/deps/libtokio_native_tls-6a98809186b7d24f.rlib" "/candle-vllm/target/release/deps/libnative_tls-29f3830c65a1dd32.rlib" "/candle-vllm/target/release/deps/libopenssl_probe-e135bf478bd9e62b.rlib" "/candle-vllm/target/release/deps/libopenssl-b99171d686a87d4b.rlib" "/candle-vllm/target/release/deps/libforeign_types-434e4620cdd2963d.rlib" "/candle-vllm/target/release/deps/libforeign_types_shared-3cd91dddd8b3059a.rlib" "/candle-vllm/target/release/deps/libopenssl_sys-6b731b85090325a6.rlib" "/candle-vllm/target/release/deps/libhyper-47e527bc8bf78df1.rlib" "/candle-vllm/target/release/deps/libwant-199118bdcd481b61.rlib" "/candle-vllm/target/release/deps/libtry_lock-4868dbe5a9f104c8.rlib" "/candle-vllm/target/release/deps/libtower_service-7ef5e09e7ca0f321.rlib" 
"/candle-vllm/target/release/deps/libsync_wrapper-60d7b39caee1bf33.rlib" "/candle-vllm/target/release/deps/libhttp_body-56cdab78f4db29df.rlib" "/candle-vllm/target/release/deps/libcandle_lora_transformers-66ecfa37e9a12967.rlib" "/candle-vllm/target/release/deps/libtqdm-2d95da43c8637887.rlib" "/candle-vllm/target/release/deps/libcrossterm-2fca374d20f41b01.rlib" "/candle-vllm/target/release/deps/libsignal_hook_mio-e98d1a7ffed565b4.rlib" "/candle-vllm/target/release/deps/libsignal_hook-2e1b132684e7f9b1.rlib" "/candle-vllm/target/release/deps/libanyhow-82509780fe55d5b3.rlib" "/candle-vllm/target/release/deps/libcandle_lora-989120b3272dc83a.rlib" "/candle-vllm/target/release/deps/libtrc-af4d2dc9e955d45c.rlib" "/candle-vllm/target/release/deps/libuuid-4eaa2543f30057bf.rlib" "/candle-vllm/target/release/deps/libcandle_transformers-b19346912b1b1838.rlib" "/candle-vllm/target/release/deps/libserde_plain-1da78c8ca7ee3ec7.rlib" "/candle-vllm/target/release/deps/libcandle_flash_attn-2cb3752fec37905c.rlib" "/candle-vllm/target/release/deps/libdyn_fmt-ca01837b2f65b0b1.rlib" "/candle-vllm/target/release/deps/libfutures-1f136c83b52e06cd.rlib" "/candle-vllm/target/release/deps/libfutures_executor-a3d9383a156f9f10.rlib" "/candle-vllm/target/release/deps/libcandle_sampling-15eeca2267c8d9d8.rlib" "/candle-vllm/target/release/deps/libcandle_nn-31afb7102c013c81.rlib" "/candle-vllm/target/release/deps/libtokenizers-e4ef767b52132dd9.rlib" "/candle-vllm/target/release/deps/libesaxx_rs-734a5148ef5579eb.rlib" "/candle-vllm/target/release/deps/libunicode_normalization_alignments-db25bac7de079d3d.rlib" "/candle-vllm/target/release/deps/libspm_precompiled-987f4e1f2e6e5cbb.rlib" "/candle-vllm/target/release/deps/libbase64-a00060132962802d.rlib" "/candle-vllm/target/release/deps/libunicode_segmentation-0609f6ce0b27032d.rlib" "/candle-vllm/target/release/deps/libnom-3d1816c8c91e268a.rlib" "/candle-vllm/target/release/deps/libunicode_categories-4b2d8309eb580595.rlib" 
"/candle-vllm/target/release/deps/libmonostate-15263a542f4a6e1c.rlib" "/candle-vllm/target/release/deps/libmacro_rules_attribute-fbe2172e90fd6d9d.rlib" "/candle-vllm/target/release/deps/libindicatif-e51bcf09d533796f.rlib" "/candle-vllm/target/release/deps/libportable_atomic-37fa7d733d3c2283.rlib" "/candle-vllm/target/release/deps/libnumber_prefix-fcbd61cd7f0fb674.rlib" "/candle-vllm/target/release/deps/libconsole-d644b00c632b508a.rlib" "/candle-vllm/target/release/deps/libunicode_width-4a01194dbfae8c91.rlib" "/candle-vllm/target/release/deps/librayon_cond-84d2d6a9f989ac86.rlib" "/candle-vllm/target/release/deps/libitertools-87b264833edf6f52.rlib" "/candle-vllm/target/release/deps/libonig-9d7270d7c8fd85bc.rlib" "/candle-vllm/target/release/deps/libonig_sys-b2ea9dea10b0294e.rlib" "/candle-vllm/target/release/deps/libderive_builder-7eba651d6dc3f2d2.rlib" "/candle-vllm/target/release/deps/liblazy_static-852800890c81fb22.rlib" "/candle-vllm/target/release/deps/libclap-6cce81f5b0c21c4d.rlib" "/candle-vllm/target/release/deps/libclap_builder-c0b7d4c619b20786.rlib" "/candle-vllm/target/release/deps/libstrsim-bfb3799e9677cd4d.rlib" "/candle-vllm/target/release/deps/libanstream-2345d25369a0c766.rlib" "/candle-vllm/target/release/deps/libanstyle_query-d08e7c102e46eb49.rlib" "/candle-vllm/target/release/deps/libcolorchoice-d9fe16d50a3dd803.rlib" "/candle-vllm/target/release/deps/libanstyle_parse-6ac7d6e179081361.rlib" "/candle-vllm/target/release/deps/libutf8parse-86e737e0d4678582.rlib" "/candle-vllm/target/release/deps/libclap_lex-3a6b7689365ae37a.rlib" "/candle-vllm/target/release/deps/libanstyle-eb2ffb42ebf589fd.rlib" "/candle-vllm/target/release/deps/libcandle_core-bd61d77eb1719017.rlib" "/candle-vllm/target/release/deps/libmemmap2-212b860c29c3b1bd.rlib" "/candle-vllm/target/release/deps/libzip-9f9cf8564fd57087.rlib" "/candle-vllm/target/release/deps/libyoke-0edeeb516196a696.rlib" "/candle-vllm/target/release/deps/libzerofrom-635514bc19b31e05.rlib" 
"/candle-vllm/target/release/deps/libstable_deref_trait-76725faa25d9c59b.rlib" "/candle-vllm/target/release/deps/libthiserror-a7014e8beba5c405.rlib" "/candle-vllm/target/release/deps/libsafetensors-1dc2485f251fe6a8.rlib" "/candle-vllm/target/release/deps/libcudarc-15ad263f438ac593.rlib" "/candle-vllm/target/release/deps/libcandle_kernels-858c2d0e13ad4d14.rlib" "/candle-vllm/target/release/deps/libgemm-0df17278b2df1e9d.rlib" "/candle-vllm/target/release/deps/libgemm_c32-1a5f8b19e06a8c97.rlib" "/candle-vllm/target/release/deps/libgemm_c64-76bce200eaecdee4.rlib" "/candle-vllm/target/release/deps/libgemm_f64-3224a33d6d916809.rlib" "/candle-vllm/target/release/deps/libgemm_f16-d938d2b4b3fed4e5.rlib" "/candle-vllm/target/release/deps/libgemm_f32-5c254676f792b840.rlib" "/candle-vllm/target/release/deps/libgemm_common-d38692569c7f4e1a.rlib" "/candle-vllm/target/release/deps/libpulp-1ce4cd9fe89db9ff.rlib" "/candle-vllm/target/release/deps/libnum_complex-3fed0e2e4f9e5202.rlib" "/candle-vllm/target/release/deps/libdyn_stack-9b24260c69b5272e.rlib" "/candle-vllm/target/release/deps/libreborrow-77659d577c4b718c.rlib" "/candle-vllm/target/release/deps/libraw_cpuid-b9cfe85e371d3083.rlib" "/candle-vllm/target/release/deps/libbitflags-b9815c55ec510696.rlib" "/candle-vllm/target/release/deps/librayon-3c309fded7dea17d.rlib" "/candle-vllm/target/release/deps/librayon_core-1c9c2cb057344777.rlib" "/candle-vllm/target/release/deps/libcrossbeam_deque-61f81dc6e7e011b4.rlib" "/candle-vllm/target/release/deps/libcrossbeam_epoch-5d4631034dbce19f.rlib" "/candle-vllm/target/release/deps/libcrossbeam_utils-3c947cc337c38520.rlib" "/candle-vllm/target/release/deps/libeither-c016b57e73ba30c1.rlib" "/candle-vllm/target/release/deps/libbyteorder-8bf78fc69cf5b0a1.rlib" "/candle-vllm/target/release/deps/libhalf-b518e6f9c7338ab7.rlib" "/candle-vllm/target/release/deps/librand_distr-cccb1699d7cd40ff.rlib" "/candle-vllm/target/release/deps/libnum_traits-28ee9b33f1e53f29.rlib" 
"/candle-vllm/target/release/deps/libbytemuck-61cb00a9722bf6f9.rlib" "/candle-vllm/target/release/deps/libactix_web-d5c2e016c0a76be2.rlib" "/candle-vllm/target/release/deps/liburl-2815a4daf7f9adfd.rlib" "/candle-vllm/target/release/deps/libidna-c440fe7285f1b3e7.rlib" "/candle-vllm/target/release/deps/libunicode_normalization-8459a56260cd69f0.rlib" "/candle-vllm/target/release/deps/libtinyvec-61debd23e06e16bf.rlib" "/candle-vllm/target/release/deps/libtinyvec_macros-f326b6a6f0ca8a7b.rlib" "/candle-vllm/target/release/deps/libunicode_bidi-f32ebb17f2e11b02.rlib" "/candle-vllm/target/release/deps/libserde_urlencoded-487feade6ed66a8a.rlib" "/candle-vllm/target/release/deps/libform_urlencoded-3e169fc285508f2a.rlib" "/candle-vllm/target/release/deps/libserde_json-af7b0a1679d453f4.rlib" "/candle-vllm/target/release/deps/libryu-8b05c69dcf279a6f.rlib" "/candle-vllm/target/release/deps/libactix_server-094bd922ee0ce4c2.rlib" "/candle-vllm/target/release/deps/libactix_router-328861445c4e5b10.rlib" "/candle-vllm/target/release/deps/libregex-e23744b59681cf08.rlib" "/candle-vllm/target/release/deps/libregex_automata-467c1d200a330751.rlib" "/candle-vllm/target/release/deps/libaho_corasick-2ffc2abbd0b517eb.rlib" "/candle-vllm/target/release/deps/libregex_syntax-3dd804a409b2c545.rlib" "/candle-vllm/target/release/deps/libserde-9901539636462a7e.rlib" "/candle-vllm/target/release/deps/libcookie-7a03bc95754d078c.rlib" "/candle-vllm/target/release/deps/libtime-00420198b7d8f3d9.rlib" "/candle-vllm/target/release/deps/libtime_core-531fb2a2b6009484.rlib" "/candle-vllm/target/release/deps/libnum_conv-27cab79cc649b5eb.rlib" "/candle-vllm/target/release/deps/libderanged-93327753f562d3b3.rlib" "/candle-vllm/target/release/deps/libpowerfmt-c4543fc1903272c6.rlib" "/candle-vllm/target/release/deps/libactix_http-49df50c41746d7cb.rlib" "/candle-vllm/target/release/deps/librand-9bab0c1e1cd6b8a5.rlib" "/candle-vllm/target/release/deps/librand_chacha-85c9c49588ab6f36.rlib" 
"/candle-vllm/target/release/deps/libppv_lite86-9a645f708eed4e1c.rlib" "/candle-vllm/target/release/deps/librand_core-9fc4ad8dc509a141.rlib" "/candle-vllm/target/release/deps/libhttparse-699e93ce2c2e7905.rlib" "/candle-vllm/target/release/deps/libbrotli-df4299509820f939.rlib" "/candle-vllm/target/release/deps/libbrotli_decompressor-0212e4cdb0da1245.rlib" "/candle-vllm/target/release/deps/liballoc_stdlib-fc777d5f3c59a235.rlib" "/candle-vllm/target/release/deps/liballoc_no_stdlib-f497a54db348ea9b.rlib" "/candle-vllm/target/release/deps/libhttpdate-5f8e81ac577420b0.rlib" "/candle-vllm/target/release/deps/libsha1-15062a787796e890.rlib" "/candle-vllm/target/release/deps/libcpufeatures-331cc3717db65aac.rlib" "/candle-vllm/target/release/deps/libdigest-5f088c274186e012.rlib" "/candle-vllm/target/release/deps/libblock_buffer-2ad0dde06bca4c37.rlib" "/candle-vllm/target/release/deps/libcrypto_common-30c46997c474a2db.rlib" "/candle-vllm/target/release/deps/libgeneric_array-95ff38f8e6dc2014.rlib" "/candle-vllm/target/release/deps/libtypenum-ddf8574aa94ffabe.rlib" "/candle-vllm/target/release/deps/libbase64-ecd99bd23d0ff318.rlib" "/candle-vllm/target/release/deps/liblocal_channel-d8bac00ef5bd5826.rlib" "/candle-vllm/target/release/deps/libbytestring-4d1e0f611bab987e.rlib" "/candle-vllm/target/release/deps/libencoding_rs-c048082deb3a71c3.rlib" "/candle-vllm/target/release/deps/liblanguage_tags-e0dfc52f86f9b27a.rlib" "/candle-vllm/target/release/deps/libahash-a21dc39883e7ac23.rlib" "/candle-vllm/target/release/deps/libgetrandom-285e23a7efc98d13.rlib" "/candle-vllm/target/release/deps/libzerocopy-81a7c0f066e0c7c2.rlib" "/candle-vllm/target/release/deps/libmime-04e6f00618993e67.rlib" "/candle-vllm/target/release/deps/libpercent_encoding-d54414372a2980de.rlib" "/candle-vllm/target/release/deps/libh2-7f6e7d82bfd1eac1.rlib" "/candle-vllm/target/release/deps/libindexmap-3cfe35f40e644070.rlib" "/candle-vllm/target/release/deps/libequivalent-8a25e166243cfe94.rlib" 
"/candle-vllm/target/release/deps/libhashbrown-aee95c0614bccf63.rlib" "/candle-vllm/target/release/deps/libfutures_util-b7a602635d036bf0.rlib" "/candle-vllm/target/release/deps/libfutures_io-bdf6e194ea9577ee.rlib" "/candle-vllm/target/release/deps/libslab-490ef311b9a84e0e.rlib" "/candle-vllm/target/release/deps/libfutures_channel-5ea32066cac56ffd.rlib" "/candle-vllm/target/release/deps/libfutures_task-a2c77643b6b905dd.rlib" "/candle-vllm/target/release/deps/libpin_utils-185c55cbe9ca2fff.rlib" "/candle-vllm/target/release/deps/libzstd-36f5f9d9d2e3508b.rlib" "/candle-vllm/target/release/deps/libzstd_safe-aaa592be23ed88f7.rlib" "/candle-vllm/target/release/deps/libzstd_sys-4496f40e68c091c4.rlib" "/candle-vllm/target/release/deps/libflate2-5502f6fe44f51b4c.rlib" "/candle-vllm/target/release/deps/libminiz_oxide-de7ef7b63a4f7412.rlib" "/candle-vllm/target/release/deps/libsimd_adler32-d1dbd8e6b06bf162.rlib" "/candle-vllm/target/release/deps/libcrc32fast-ceb628e76fc0bab0.rlib" "/candle-vllm/target/release/deps/libactix_service-70cd0075f0fbfc66.rlib" "/candle-vllm/target/release/deps/libactix_codec-9e65a5e0b8e81eb5.rlib" "/candle-vllm/target/release/deps/libmemchr-842ac33dededf7d9.rlib" "/candle-vllm/target/release/deps/libbitflags-8cf9aca8dca9dec7.rlib" "/candle-vllm/target/release/deps/libtokio_util-b89e8c0cc220b5ed.rlib" "/candle-vllm/target/release/deps/libtracing-0c92736ef86fff4c.rlib" "/candle-vllm/target/release/deps/libtracing_core-8d095aff5d6b2dc5.rlib" "/candle-vllm/target/release/deps/libonce_cell-06dfa01c968b8e7e.rlib" "/candle-vllm/target/release/deps/libfutures_sink-8d94a6b44313bd03.rlib" "/candle-vllm/target/release/deps/libactix_utils-ec862be5af373362.rlib" "/candle-vllm/target/release/deps/liblocal_waker-7857496d2dec9a57.rlib" "/candle-vllm/target/release/deps/libactix_rt-7d0004af1d35aa47.rlib" "/candle-vllm/target/release/deps/libtokio-e1ade8da5b909c98.rlib" "/candle-vllm/target/release/deps/libsignal_hook_registry-dcf70e2b6c44755c.rlib" 
"/candle-vllm/target/release/deps/libnum_cpus-ed5feb6abe3397ba.rlib" "/candle-vllm/target/release/deps/libsocket2-1ff22b428921d589.rlib" "/candle-vllm/target/release/deps/libmio-afa56dff974d55e1.rlib" "/candle-vllm/target/release/deps/liblog-35f97248cb2ec82c.rlib" "/candle-vllm/target/release/deps/libparking_lot-1fe2bfc24acd589d.rlib" "/candle-vllm/target/release/deps/libparking_lot_core-563decbd4c1f2891.rlib" "/candle-vllm/target/release/deps/liblibc-b4a93e966581df64.rlib" "/candle-vllm/target/release/deps/libcfg_if-88c619515d65e3f1.rlib" "/candle-vllm/target/release/deps/libsmallvec-65a0bed430993cf2.rlib" "/candle-vllm/target/release/deps/liblock_api-920512de5989abb2.rlib" "/candle-vllm/target/release/deps/libscopeguard-6208b4062bcdc2b1.rlib" "/candle-vllm/target/release/deps/libpin_project_lite-42a553ee08f02ebb.rlib" "/candle-vllm/target/release/deps/libfutures_core-891f46f0aceca63c.rlib" "/candle-vllm/target/release/deps/libhttp-b738399ec4ab1c60.rlib" "/candle-vllm/target/release/deps/libitoa-dcbca83b54db3306.rlib" "/candle-vllm/target/release/deps/libbytes-8c2bf1b211f72910.rlib" "/candle-vllm/target/release/deps/libfnv-ffe196e20ea2a648.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-9c342d6596ca77d8.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-35e6faa0abf08dd1.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-6242b5524a2684de.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-94511439d510df36.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-1923a594ddedab24.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-5b476927cd520d76.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-6b4664d28b4dc07b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-4d7e14ee42b44abc.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-94e04d08d317eb2b.rlib" 
"/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-7e3a1db27b23a8ee.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-0651af3c34a1e4b9.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-e5da8ecb95d2de36.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-052b86aa844a2857.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-bbd2a157557b773d.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f47279717d0e1831.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-d30e243a979711ec.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-18929aabe36e3f57.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-f9f41fbdedfbfafb.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-b26982894e484f03.rlib" "-Wl,-Bdynamic" "-lssl" "-lcrypto" "-lflashattention" "-lcudart" "-lstdc++" "-lstdc++" "-lcuda" "-lnvrtc" "-lcurand" "-lcublas" "-lcublasLt" "-lcudnn" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/candle-vllm/target/release/deps/candle_vllm-001dc109ba8da34d" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs"
  = note: /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_api.o): relocation R_X86_64_32 against `.nvFatBinSegment' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim128_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi32ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi128ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim160_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi128ELi32ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi160ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim192_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim224_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim256_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi256ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim32_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim64_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim96_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim128_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi32ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi128ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim160_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi128ELi32ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi160ELi128ELi32ELi4ES2_EELb1ELb0ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim192_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim224_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim256_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi256ELi128ELi64ELi8ES2_EELb1ELb1ELb0ELb1ELb0ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim32_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim64_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-9fc2dbcb21177bee/out/libflashattention.a(flash_fwd_hdim96_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb1ELb0ELb0ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          collect2: error: ld returned 1 exit status

error: could not compile `candle-vllm` (bin "candle-vllm") due to previous error
error: failed to compile `candle-vllm v0.1.0 (/candle-vllm)`, intermediate artifacts can be found at `/candle-vllm/target`

I will restart from scratch but this time I will download the latest version of Rust instead of using the Rocky 9 one, to see if it succeeds that way.
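
A possible explanation for the `recompile with -fPIE` messages above, not confirmed anywhere in this thread: rustc asks the linker for a position-independent executable (note the "-pie" near the end of the cc command line), while the objects inside libflashattention.a appear to have been compiled without -fPIE. Whether that happens could depend on whether the distribution's host compiler enables PIE by default. A minimal sketch for checking this inside the build image, where `default_pie` is a hypothetical helper name and the commented-out workaround variables are untested assumptions:

```shell
#!/bin/sh
# Print "yes" if the given C compiler builds position-independent
# executables by default, "no" otherwise. GCC and Clang predefine the
# __PIE__ macro whenever -fPIE is in effect, including when it is the
# compiler's configured default.
default_pie() {
    if "$1" -dM -E - </dev/null 2>/dev/null | grep -q '__PIE__'; then
        echo yes
    else
        echo no
    fi
}

echo "host compiler defaults to PIE: $(default_pie "${CC:-cc}")"

# Untested workaround candidates if the answer is "no" (assumptions, not
# confirmed in this thread):
#   export NVCC_APPEND_FLAGS="-Xcompiler -fPIE"    # ask nvcc's host compiler for PIE code
#   export RUSTFLAGS="-C relocation-model=static"  # or link a non-PIE executable instead
```

Running it in both the Rocky Linux image and the Ubuntu image would show whether the two host compilers differ on this point.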

ivanbaldo commented 7 months ago

Using the latest Rust failed too at the linking phase :-/.

EricLBuehler commented 7 months ago

I was able to build successfully on a fresh instance, by executing:

git clone https://github.com/EricLBuehler/candle-vllm.git
cd candle-vllm
sudo apt install libssl-dev
sudo apt install pkg-config
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
cargo build

Of course, execution will not work yet, as the Paged Attention kernels are still being ported. However, cargo build works. Does this help?

ivanbaldo commented 7 months ago

Yep, it's good to know that it compiles in some cases. I am currently trying nvidia/cuda:12.2.2-cudnn8-devel-rockylinux9 as the base image instead of installing libcudnn8-devel, since from googling around it seems that for static linking the CUDA version that cuDNN was compiled against must match the installed one, and Rust may be linking statically. If this doesn't work, I will switch to an Ubuntu image and retry. Thanks for all your help!

ivanbaldo commented 7 months ago

Ouch, that didn't work, I will try with Ubuntu tomorrow.

ivanbaldo commented 7 months ago

Success!!! With nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04 as base it compiles correctly. So the linking error seems to be specific to Red Hat based distros. It may or may not happen with Ubuntu 24.04 when that gets released; it will be interesting to test it then. Maybe we should leave this issue open and see if it gets resolved in the future? For now I guess it's more important to continue adding features and polishing the project than to invest time in obscure linking problems.

EricLBuehler commented 7 months ago

I agree. It may have to do with the CUDA driver, though.

Ranganaths commented 6 months ago

Is this issue resolved? It looks like I am hitting the same issue on Ubuntu 23.10: errors detected in the compilation of "kernels/flash_fwd_hdim256_bf16_sm80.cu".

--error 0x1 --

thread '' panicked bindgen_cuda-0.1.4/src/lib.rs:262:21: nvcc error while executing compiling: "nvcc" "--gpu-architecture=sm_75" "-c" "-o" "/home/tantrajya/Documents/rustprojects/candle-vllm-master/target/release/build/candle-flash-attn-f1535da02b90d1cc/out/flash_fwd_hdim256_bf16_sm80-21dd0f7dd998e506.o" "--default-stream" "per-thread" "-std=c++17" "-O3" "-UCUDA_NO_HALF_OPERATORS" "-UCUDA_NO_HALF_CONVERSIONS" "-UCUDA_NO_HALF2_OPERATORS" "-UCUDA_NO_BFLOAT16_CONVERSIONS" "-Icutlass/include" "--expt-relaxed-constexpr" "--expt-extended-lambda" "--use_fast_math" "--verbose" "kernels/flash_fwd_hdim256_bf16_sm80.cu"
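
The nvcc command above targets `--gpu-architecture=sm_75` while compiling a kernel file whose name ends in `_sm80.cu`. Upstream FlashAttention-2 documents Ampere (compute capability 8.0) or newer GPUs as its requirement, so an older GPU may simply be unable to compile these kernels; the thread does not confirm this as the cause. A minimal sketch for checking the capability before enabling the flash-attn feature, reusing the nvidia-smi query from the Dockerfile comment at the top of this issue (`flash_attn_supported` and the threshold of 80 are assumptions for illustration):

```shell
#!/bin/sh
# Print "yes" if a compute capability written without the decimal point
# (e.g. 75, 80, 86) meets the sm_80 minimum assumed here, "no" otherwise.
flash_attn_supported() {
    if [ "${1:-0}" -ge 80 ] 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}

# Query the first GPU, if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1 | tr -d .)
    echo "compute capability ${cap}: flash-attn supported: $(flash_attn_supported "$cap")"
fi
```

If the answer is "no", one option to try is building without flash-attn in the --features list.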

EricLBuehler commented 6 months ago

From what I understand, this issue is not resolved, as it likely originates in Candle. Could you please open an issue on Candle? candle-vllm does not build the flash attention kernels itself; that build step is part of Candle's build.rs.

ivanbaldo commented 6 months ago

Hi! Sorry for the late reply. I tested it and you are right. Reported here: https://github.com/huggingface/candle/issues/1844 Thanks!!!

EricLBuehler commented 6 months ago

Thank you! Please see mistral.rs, the successor to this project, which supports flash attention, GGUF, and more.
