Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

[BUG]: LanceDBError Append with different schema when embedding #779

Closed: calmtortoise closed this issue 8 months ago

calmtortoise commented 8 months ago

How are you running AnythingLLM?

Docker (local)

What happened?

When saving/embedding a document to the workspace, I get the following error:

LanceDBError: Append with different schema.

This only seems to occur with certain documents. I have successfully uploaded and embedded multiple documents, and it will embed those into any workspace, but not specific ones. I have tried a variety of file types.

Are there known steps to reproduce?

  1. Drag and drop a file and wait until uploaded.
  2. Select and add to list of documents to embed.
  3. Click on Save/Embed.

calmtortoise commented 8 months ago

Update: I was able to fix the problem by deleting the old workspace and creating a new one. No idea why that worked.

timothycarambat commented 8 months ago

When you had this issue, what kind of files did you have embedded into the workspace?

calmtortoise commented 8 months ago

Prior to getting the error there were two PDF files embedded.

timothycarambat commented 8 months ago

Hm, currently unable to replicate. Deleting and re-creating the workspace is a suitable workaround until we have a known way to reproduce this elusive bug.

novadiem commented 8 months ago

Failed to add. LanceDBError: Append with different schema

I had embedded a collection of webpages and 4 PDF files. On trying to embed another 2 PDF files, I got the above error. I removed all HTML embeds and tried to embed a single new PDF, and still got the same error.

Then I removed all docs from the workspace and it still won't let me embed anything new. It won't even let me re-add the previously embedded files without giving the above error.

Can confirm a new workspace does allow for re-attachment and use of new embeds, so it seems workspace-specific.

(Side note: I would love to have the ability to scrape an entire site instead of a single URL at a time!)

timothycarambat commented 8 months ago

@novadiem Are you on Docker or Desktop?

novadiem commented 8 months ago

Desktop - W11 Pro

timothycarambat commented 8 months ago

We have found the issue causing this on the desktop app. It will be patched in the next update, which is to be released tomorrow. The error comes from the LanceDB provider library, which just needed to be updated to the most recent version.

timothycarambat commented 8 months ago

@calmtortoise The issue here says Docker, but this bug should not occur on Docker. Was your original issue on a Docker instance and not the desktop app?

timothycarambat commented 8 months ago

Resolved in the latest desktop version (1.2.0). It was a bug in the LanceDB package.
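
For readers who land here wondering what an "append with different schema" error means in general, here is a minimal sketch in plain Python. It does not use LanceDB and is not how LanceDB or AnythingLLM is implemented. `ToyTable`, `infer_schema`, and `SchemaMismatchError` are hypothetical names invented for illustration. The thread does not say what the actual schema difference was, so the differing vector length below is only an assumed example of a mismatch.

```python
from dataclasses import dataclass, field


class SchemaMismatchError(Exception):
    """Raised when appended rows do not match the table's schema (hypothetical)."""


def infer_schema(row: dict) -> dict:
    """Map each column name to a simple type label (made-up format for this sketch)."""
    schema = {}
    for name, value in row.items():
        if isinstance(value, list):
            # Treat lists as fixed-size vectors; the length is part of the type.
            schema[name] = f"vector[{len(value)}]"
        else:
            schema[name] = type(value).__name__
    return schema


@dataclass
class ToyTable:
    """In-memory stand-in for a vector table; NOT a real LanceDB table."""
    schema: dict = field(default_factory=dict)
    rows: list = field(default_factory=list)

    def append(self, new_rows: list) -> None:
        if not new_rows:
            return
        # The first successful write fixes the schema for the table's lifetime.
        expected = self.schema or infer_schema(new_rows[0])
        # Validate the whole batch before storing anything.
        for row in new_rows:
            got = infer_schema(row)
            if got != expected:
                raise SchemaMismatchError(
                    f"Append with different schema: expected {expected}, got {got}"
                )
        self.schema = expected
        self.rows.extend(new_rows)


if __name__ == "__main__":
    table = ToyTable()
    table.append([{"id": "doc-1", "text": "hello", "vector": [0.1, 0.2, 0.3]}])
    try:
        # Assumed mismatch: a vector of a different length than the first write.
        table.append([{"id": "doc-2", "text": "world", "vector": [0.1, 0.2]}])
    except SchemaMismatchError as err:
        print(err)

    # A brand-new table has no fixed schema yet, so the same row is accepted.
    fresh = ToyTable()
    fresh.append([{"id": "doc-2", "text": "world", "vector": [0.1, 0.2]}])
```

In this toy model a fresh table accepts rows that the old one rejects, which resembles the delete-and-recreate workaround reported above. Whether the real bug behaved this way is not stated in the thread; the maintainer only attributes it to the LanceDB package.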