namhyung / uftrace

Function graph tracer for C/C++/Rust/Python
https://uftrace.github.io/slide/
GNU General Public License v2.0

weird warning when running uftrace record for chromadb python program #1904

Open honggyukim opened 6 months ago

honggyukim commented 6 months ago

Running uftrace record on a chromadb example shows a weird warning. Let's say there is an example as follows.

$ cat test-chromadb.py
#!/usr/bin/env python3

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"], # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)

It shows nothing when running without uftrace.

$ ./test-chromadb.py

But recording it shows a warning as follows.

$ uftrace record ./test-chromadb.py
2024-03-14 20:44:55.014837669 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.

This warning should not appear, just as it does not appear when test-chromadb.py is executed without uftrace.

namhyung commented 6 months ago

Hmm.. this would be hard to debug. Can you find some more hints?

honggyukim commented 6 months ago

I don't have more clues yet; I just posted how to reproduce the issue. Can you reproduce it in your environment?

yihong0618 commented 6 months ago

I did some searching...

reason:

honggyukim commented 6 months ago

Hi @yihong0618, thanks for your investigation. I can simply reproduce with a much simpler example as follows.

$ cat onnx-test.py
#!/usr/bin/env python3
import onnxruntime

$ uftrace record ./onnx-test.py
2024-03-18 19:26:39.843928466 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.

yihong0618 commented 6 months ago

Yes, but I did not find the root cause on the uftrace side.

honggyukim commented 6 months ago

It's okay. I'm just leaving a note for later investigation. I will have a look when I have more time for this.

Thanks very much for your help again!

honggyukim commented 6 months ago

the reason is we can not find libonnxruntime_providers_shared.so for some reason in uftrace side

@yihong0618's investigation looks correct. The warning message is printed at https://github.com/microsoft/onnxruntime/blob/v1.17.1/onnxruntime/python/onnxruntime_pybind_state.cc#L2101-L2104.

honggyukim commented 6 months ago

It looks like the dlopen at https://github.com/microsoft/onnxruntime/blob/v1.17.1/onnxruntime/core/platform/posix/env.cc#L535-L544 fails.

The call sequence is as follows: InitProvidersSharedLibrary() -> ProviderSharedLibrary.Ensure() -> LoadDynamicLibrary() -> dlopen().
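To see the loader's own error text for a failing dlopen(), a small ctypes probe may help. This is only a sketch: `try_dlopen` is a hypothetical helper name, and because the call comes from Python's ctypes rather than from onnxruntime_pybind11_state, it does not reproduce any RUNPATH-based lookup. It only shows whether the bare library name resolves through LD_LIBRARY_PATH and the default search paths.

```python
import ctypes


def try_dlopen(path):
    """Try to dlopen() a shared library; return (ok, error_message)."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # ctypes raises OSError when the underlying dlopen() call fails
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Bare library name: resolved by the dynamic loader's normal search order
    ok, err = try_dlopen("libonnxruntime_providers_shared.so")
    print("loaded" if ok else f"dlopen failed: {err}")
```

Running this script both directly and under uftrace record, with and without LD_LIBRARY_PATH set, could show whether the two environments resolve the name differently.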

honggyukim commented 6 months ago

The libonnxruntime_providers_shared.so is found at /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi as follows.

$ LD_DEBUG=all ./import-onnxruntime.py
        ...
     53575:      search path=/home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi         (RUNPATH from file /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-310-x86_64-linux-gnu.so)
     53575:       trying file=/home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_shared.so
$ file /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_shared.so
/home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_shared.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=1cc5be76d21480162602a755bcaaaf284b0a0c12, stripped

For some reason, it looks like uftrace blocks the library search in /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi.
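To compare the loader's search with and without uftrace, the "trying file=" lines can be pulled out of saved LD_DEBUG output and diffed. A minimal sketch, assuming the output has the same shape as the excerpt above; `tried_files` is a hypothetical helper, not part of uftrace or glibc.

```python
import re

# Matches lines such as "53575:  trying file=/path/to/lib.so" in LD_DEBUG output
_TRYING = re.compile(r"^\s*\d+:\s+trying file=(\S+)")


def tried_files(ld_debug_output):
    """Return the candidate paths the loader tried, in order of appearance."""
    paths = []
    for line in ld_debug_output.splitlines():
        match = _TRYING.match(line)
        if match:
            paths.append(match.group(1))
    return paths


# Sample taken from the LD_DEBUG excerpt in this thread
sample = """\
        ...
     53575:       trying file=/home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_shared.so
"""
print(tried_files(sample))
```

LD_DEBUG writes to stderr by default, so the text to feed in would come from something like `LD_DEBUG=libs ./onnx-test.py 2> plain.log` and the same command prefixed with uftrace record.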

honggyukim commented 6 months ago

This part says /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_shared.so is searched for via RUNPATH.

search path=/home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi
  RUNPATH from file /home/honggyu/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-310-x86_64-linux-gnu.so

Maybe we should check whether uftrace interferes with searching for a library via RUNPATH.
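To confirm which RUNPATH or RPATH entries an ELF file carries, the output of `readelf -d` can be parsed. This is a rough sketch that assumes binutils' readelf is installed; `parse_runpath` and `runpath_of` are hypothetical helper names, and the sample line is made up for illustration (the actual RUNPATH value of the onnxruntime module is not shown in this thread).

```python
import re
import subprocess

# readelf -d prints e.g. " 0x...1d (RUNPATH)   Library runpath: [$ORIGIN]"
_RUNPATH = re.compile(r"\((RUNPATH|RPATH)\)\s+Library (?:runpath|rpath): \[(.*)\]")


def parse_runpath(readelf_output):
    """Return a list of (tag, [dirs]) entries found in `readelf -d` output."""
    entries = []
    for line in readelf_output.splitlines():
        match = _RUNPATH.search(line)
        if match:
            entries.append((match.group(1), match.group(2).split(":")))
    return entries


def runpath_of(path):
    """Run `readelf -d` on an ELF file and parse its RUNPATH/RPATH entries."""
    result = subprocess.run(
        ["readelf", "-d", path], capture_output=True, text=True, check=True
    )
    return parse_runpath(result.stdout)


# Illustrative line only; not taken from the real onnxruntime module
sample = " 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN]"
print(parse_runpath(sample))
```

Calling `runpath_of()` on onnxruntime_pybind11_state.cpython-310-x86_64-linux-gnu.so would show the entry that the LD_DEBUG output attributes the search path to.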

yihong0618 commented 6 months ago

I also found this, but I am stuck here...

yihong0618 commented 5 months ago

@honggyukim Yes, we can run it with no warning like this:

LD_LIBRARY_PATH=/home/hyi/.local/lib/python3.9/site-packages/onnxruntime/capi uftrace a.py

yihong0618 commented 4 months ago

This seems to be the same problem.
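The LD_LIBRARY_PATH workaround above hardcodes a per-user site-packages path. As a sketch, the directory could instead be located without importing onnxruntime, assuming it is installed as a regular package with a capi subdirectory as seen in this thread; `package_subdir` is a hypothetical helper name.

```python
import importlib.util
import os


def package_subdir(package, subdir):
    """Return <package dir>/<subdir> without importing the package, or None."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    # The path is built, not verified; the caller checks that it exists
    return os.path.join(list(spec.submodule_search_locations)[0], subdir)


if __name__ == "__main__":
    capi = package_subdir("onnxruntime", "capi")
    if capi and os.path.isdir(capi):
        print(f"LD_LIBRARY_PATH={capi}")
    else:
        print("onnxruntime/capi not found")
```

The printed value can then be placed in front of the uftrace command in the same way as in the workaround.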