philippe2803 / contentmap

Build a RAG dataset for your domain in just a few lines of code, using your XML sitemap
https://philippeoger.com/pages/can-we-rag-the-whole-web

AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension' #7

Open medoror opened 1 month ago

medoror commented 1 month ago

When the SQLite database connection attempts to load extensions, I see the following failure:

Traceback (most recent call last):
  File "/Users/michaeledoror/workspace/web-scraping/.venv/lib/python3.12/site-packages/contentmap/sitemap.py", line 31, in build
    cm = ContentMapCreator(contents, include_vss=self.include_vss)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaeledoror/workspace/web-scraping/.venv/lib/python3.12/site-packages/contentmap/core.py", line 22, in __init__
    self.connection.enable_load_extension(True)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'

I think this is because the sqlite3 module that comes with my Python installation was built without support for loadable extensions.
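A quick way to confirm that diagnosis is to check whether the connection object exposes the method at all. The snippet below is a minimal sketch that uses only the standard library; the helper name `supports_extension_loading` is made up for illustration and is not part of contentmap.

```python
import sqlite3


def supports_extension_loading() -> bool:
    """Return True if this interpreter's sqlite3 module can load extensions.

    CPython only defines Connection.enable_load_extension when it was
    compiled with loadable-extension support, so a plain attribute check
    is enough to tell the two kinds of build apart.
    """
    conn = sqlite3.connect(":memory:")
    try:
        return hasattr(conn, "enable_load_extension")
    finally:
        conn.close()


print("SQLite library version:", sqlite3.sqlite_version)
print("Extension loading available:", supports_extension_loading())
```

If this reports that extension loading is unavailable, the fix has to happen at the interpreter level (a Python build compiled with `--enable-loadable-sqlite-extensions`), not inside the library that calls `enable_load_extension`.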

I attempted to fix this by using this suggestion, but that has not worked for me either.

Any thoughts on what I should be doing to get this working out of the box?

philippe2803 commented 1 month ago

Yes, this is a known limitation of sqlite-vss. You can try using the Dockerfile in the repo to run things in a Linux container, where the extension loads easily. That's how I avoid this error.
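For anyone hitting the same traceback outside a container, a guard like the one below turns the bare AttributeError into a more descriptive message. This is only a sketch of one possible approach, not contentmap's actual code; the function name `open_with_extensions` and the wording of the error are assumptions made for illustration.

```python
import sqlite3


def open_with_extensions(path: str = ":memory:") -> sqlite3.Connection:
    """Open a connection with extension loading enabled, or fail clearly.

    Hypothetical helper: it checks for enable_load_extension before
    calling it, so an unsupported Python build raises a readable error.
    """
    conn = sqlite3.connect(path)
    enable = getattr(conn, "enable_load_extension", None)
    if enable is None:
        conn.close()
        raise RuntimeError(
            "This Python build's sqlite3 module cannot load extensions. "
            "Use a Python compiled with --enable-loadable-sqlite-extensions "
            "or run inside the project's Docker container."
        )
    enable(True)  # allow extensions to be loaded on this connection
    return conn
```

On a supported build the function returns a ready-to-use connection; on an unsupported one the caller gets an actionable RuntimeError instead of a missing-attribute failure.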

Alternatively, you can wait a few days until sqlite-vec (a new alternative to sqlite-vss) is added to LangChain (the PR has already been created). After that, it should be a quick change in the contentmap lib, which should prevent this error from happening again.

medoror commented 1 month ago

Interesting. When I built the Docker image, I unfortunately got the build error below. So yeah, hopefully sqlite-vec fixes all these problems 😅

Output from running "docker build -t test-sitemap .":

13.56   RuntimeError
13.56 
13.56   Unable to find installation candidates for sqlite-vss (0.1.2)
13.56 
13.56   at /usr/local/lib/python3.10/dist-packages/poetry/installation/chooser.py:74 in choose_for
13.57        70│ 
13.57        71│             links.append(link)
13.57        72│ 
13.57        73│         if not links:
13.57     →  74│             raise RuntimeError(f"Unable to find installation candidates for {package}")
13.57        75│ 
13.57        76│         # Get the best link
13.57        77│         chosen = max(links, key=lambda link: self._sort_key(package, link))
13.57        78│ 
13.57 
13.57 Cannot install sqlite-vss.
13.57 
------
Dockerfile:14
--------------------
  12 |     RUN pip install poetry
  13 |     RUN poetry config virtualenvs.create false
  14 | >>> RUN poetry install
  15 |     
  16 |     RUN python3 -c 'from sentence_transformers import SentenceTransformer; embedder = SentenceTransformer("all-MiniLM-L6-v2")'
--------------------
ERROR: failed to solve: process "/bin/sh -c poetry install" did not complete successfully: exit code: 1