[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.
[X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
import os
import tempfile

import whisper
from pytube import YouTube
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_community.document_loaders import TextLoader

# YOUTUBE_VIDEO (a YouTube URL string) is assumed to be defined earlier in the notebook.

# Let's do this only if we haven't created the transcription file yet.
if not os.path.exists("transcription.txt"):
    youtube = YouTube(YOUTUBE_VIDEO)
    audio = youtube.streams.filter(only_audio=True).first()

    # Let's load the base model. This is not the most accurate model but it's fast.
    whisper_model = whisper.load_model("base")

    with tempfile.TemporaryDirectory() as tmpdir:
        file = audio.download(output_path=tmpdir)
        transcription = whisper_model.transcribe(file, fp16=False)["text"].strip()

        with open("transcription.txt", "w") as file:
            file.write(transcription)

# Load the transcription, split it into chunks, and index the chunks in memory.
documents = TextLoader("transcription.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()

db = DocArrayInMemorySearch.from_documents(docs, embeddings)
Error Message and Stack Trace (if applicable)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[27], line 11
      7 docs = text_splitter.split_documents(documents)
      9 embeddings = OpenAIEmbeddings()
---> 11 db = DocArrayInMemorySearch.from_documents(docs, embeddings)

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\langchain_core\vectorstores.py:550, in VectorStore.from_documents(cls, documents, embedding, **kwargs)
    548 texts = [d.page_content for d in documents]
    549 metadatas = [d.metadata for d in documents]
--> 550 return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\langchain_community\vectorstores\docarray\in_memory.py:68, in DocArrayInMemorySearch.from_texts(cls, texts, embedding, metadatas, **kwargs)
     46 @classmethod
     47 def from_texts(
     48     cls,
   (...)
     52     **kwargs: Any,
     53 ) -> DocArrayInMemorySearch:
     54     """Create an DocArrayInMemorySearch store and insert data.
     55
     56     Args:
   (...)
     66         DocArrayInMemorySearch Vector Store
     67     """
---> 68     store = cls.from_params(embedding, **kwargs)
     69     store.add_texts(texts=texts, metadatas=metadatas)
     70     return store

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\langchain_community\vectorstores\docarray\in_memory.py:39, in DocArrayInMemorySearch.from_params(cls, embedding, metric, **kwargs)
     21 @classmethod
     22 def from_params(
     23     cls,
   (...)
     28     **kwargs: Any,
     29 ) -> DocArrayInMemorySearch:
     30     """Initialize DocArrayInMemorySearch store.
     31
     32     Args:
   (...)
     37         **kwargs: Other keyword arguments to be passed to the get_doc_cls method.
     38     """
---> 39     _check_docarray_import()
     40     from docarray.index import InMemoryExactNNIndex
     42     doc_cls = cls._get_doc_cls(space=metric, **kwargs)

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\langchain_community\vectorstores\docarray\base.py:19, in _check_docarray_import()
     17 def _check_docarray_import() -> None:
     18     try:
---> 19         import docarray
     21         da_version = docarray.__version__.split(".")
     22         if int(da_version[0]) == 0 and int(da_version[1]) <= 31:

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\docarray\__init__.py:5
      1 __version__ = '0.32.1'
      3 import logging
----> 5 from docarray.array import DocList, DocVec
      6 from docarray.base_doc.doc import BaseDoc
      7 from docarray.utils._internal.misc import _get_path_from_docarray_root_level

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\docarray\array\__init__.py:2
      1 from docarray.array.any_array import AnyDocArray
----> 2 from docarray.array.doc_list.doc_list import DocList
      3 from docarray.array.doc_vec.doc_vec import DocVec
      5 __all__ = ['DocList', 'DocVec', 'AnyDocArray']

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\docarray\array\doc_list\doc_list.py:44
     36 T = TypeVar('T', bound='DocList')
     37 T_doc = TypeVar('T_doc', bound=BaseDoc)
     40 class DocList(
     41     ListAdvancedIndexing[T_doc],
     42     PushPullMixin,
     43     IOMixinArray,
---> 44     AnyDocArray[T_doc],
     45 ):
     46     """
     47     DocList is a container of Documents.
     48
   (...)
    114
    115     """
    117     doc_type: Type[BaseDoc] = AnyDoc

File c:\Users\astec\OneDrive\Documents\RAG_PROJECT\.venv\Lib\site-packages\docarray\array\any_array.py:46, in AnyDocArray.__class_getitem__(cls, item)
     43 @classmethod
     44 def __class_getitem__(cls, item: Union[Type[BaseDoc], TypeVar, str]):
     45     if not isinstance(item, type):
---> 46         return Generic.__class_getitem__.__func__(cls, item)  # type: ignore
     47         # this do nothing that checking that item is valid type var or str
     48     if not issubclass(item, BaseDoc):

AttributeError: 'builtin_function_or_method' object has no attribute '__func__'
Description
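I'm trying to use LangChain's DocArrayInMemorySearch to create a vector database from my transcription text file. I wrote the code exactly as shown in the LangChain documentation, but the call to DocArrayInMemorySearch.from_documents fails with the AttributeError shown in the stack trace, which is raised while importing docarray.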
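The last frame of the stack trace fails on `Generic.__class_getitem__.__func__` inside docarray 0.32.1, so the error may depend on how the running interpreter implements `typing.Generic` rather than on the example code itself. The check below is a minimal diagnostic sketch, not part of LangChain or docarray: it only uses the standard library, and the helper names `generic_getitem_has_func` and `describe_environment` are made up for illustration. It reports whether `Generic.__class_getitem__` exposes a `__func__` attribute on the current interpreter.

```python
import sys
from typing import Generic


def generic_getitem_has_func() -> bool:
    """Return True if Generic.__class_getitem__ exposes a __func__ attribute.

    The failing docarray line accesses exactly this attribute, so False here
    means that line would raise AttributeError on this interpreter.
    """
    return hasattr(Generic.__class_getitem__, "__func__")


def describe_environment() -> dict:
    """Collect the facts relevant to the AttributeError (hypothetical helper)."""
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "generic_getitem_type": type(Generic.__class_getitem__).__name__,
        "has_func": generic_getitem_has_func(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `has_func` is reported as `False`, the failure would be reproducible with a bare `import docarray` in the same virtual environment, without any of the LangChain code above.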
System Info
System Information
OS: Windows
OS Version: 10.0.22631
Python Version: 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)]
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: