h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

Embedding model BAAI/bge-large-en #726

Closed · llmwesee closed this issue 1 year ago

llmwesee commented 1 year ago

I tried to create embeddings for a new document using "BAAI/bge-large-en" instead of "hkunlp/instructor-large", and I used the following CLI command to run it:

```
python generate.py --base_model=meta-llama/Llama-2-13b-chat-hf --score_model=None --langchain_mode='UserData' --user_path=user_path --use_auth_token=True --hf_embedding_model=BAAI/bge-large-en
```

But the main issue is that when I started asking queries about the document, it gave responses from the base model (like the LLM mode in the Collections tab), did not generate any response related to the documents, and did not show any sources.

Second, how can I verify that the "BAAI/bge-large-en" embedding model was actually used to create the vector store?

In short, I want to know how we can use different embedding models for document Q/A through LangChain.
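One quick sanity check, as a minimal sketch in plain LangChain rather than anything h2ogpt-specific: the two models differ in output dimensionality, so embedding a probe string reveals which family is in play. BAAI/bge-large-en emits 1024-dim vectors, while hkunlp/instructor-large emits 768-dim ones.

```python
# Sketch only: compare embedding dimensionality to tell the models apart.
from langchain.embeddings import HuggingFaceEmbeddings

emb = HuggingFaceEmbeddings(model_name="BAAI/bge-large-en")
print(len(emb.embed_query("probe")))  # 1024 for bge-large-en; instructor-large gives 768
```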

pseudotensor commented 1 year ago

https://discord.com/channels/1097462770674438174/1139470419078950984/1139492011418853446

However, your issue is that your choice is a new embedding model: the value of --cut_distance is tuned for the default Mini and instructor-large embeddings. You can pass --cut_distance=100000 to avoid any filtering.
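To illustrate what --cut_distance does (a rough sketch of the general idea, not h2ogpt's actual code): retrieved chunks whose distance to the query exceeds the cutoff are dropped, so a cutoff tuned for one embedding's distance scale can silently discard every chunk under a different embedding, leaving the LLM to answer without document context.

```python
# Sketch of distance-cutoff filtering over (doc, distance) pairs such as
# those returned by Chroma's similarity_search_with_score. Illustrative only.
def filter_by_distance(docs_with_score, cut_distance):
    # Keep only chunks whose distance is below the cutoff.
    return [(doc, d) for doc, d in docs_with_score if d <= cut_distance]

hits = [("chunk A", 0.31), ("chunk B", 1.72)]
print(filter_by_distance(hits, cut_distance=0.5))     # drops chunk B
print(filter_by_distance(hits, cut_distance=100000))  # keeps everything
```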

llmwesee commented 1 year ago

(1) I created the embeddings using

```
python3 src/make_db.py --user_path="/home/hemant/Documents/dbtest1" --collection_name=UserData4 --hf_embedding_model=BAAI/bge-large-en
```

and it completed successfully, printing "(<langchain.vectorstores.chroma.Chroma object at 0x7f42054f2140>, 'UserData4')". But when I run

```
python generate.py --base_model=meta-llama/Llama-2-13b-chat-hf --score_model=None --langchain_mode=UserData4 --user_path=user_path --use_auth_token=True --hf_embedding_model=BAAI/bge-large-en --cut_distance=1000000 --max_seq_len=4096
```

it throws "AssertionError: Invalid langchain_mode UserData4". Why does this happen?

(2) When I instead used --langchain_mode=UserData in the above command, it ran successfully, but once I start querying the documents it throws "chromadb.errors.InvalidDimensionException: Dimensionality of (1024) does not match index dimensionality (768)". Why does this happen and how do I resolve it?

For your reference, I attached screenshots: Screenshot from 2023-08-23 09-37-10, Screenshot from 2023-08-23 09-39-53, Screenshot from 2023-08-23 10-18-54.

pseudotensor commented 1 year ago

For the first issue, you should pass --langchain_modes=['UserData4', 'LLM'] or other modes there. I could make it easier so that specifying a single mode just uses that mode plus LLM or something, but nominally you should pass the list.

For the second issue, can you give a longer stack trace? If you switch embeddings, it's supposed to read the embedding info stored on disk for that db and stick with that embedding. Here it seems to have switched embeddings for an existing db, which is not the default behavior and should only be attempted when --migrate_embedding_model=True, and even then it shouldn't fail.
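As a sketch of that idea (hypothetical helper names; h2ogpt stores equivalent info on disk, but not necessarily in this form): record the embedding model alongside the db, and prefer the recorded value over the CLI flag unless migration is explicitly requested.

```python
import json
import os

# Hypothetical helpers illustrating the described behavior; not h2ogpt's code.
def save_embedding_info(db_dir: str, model_name: str) -> None:
    # Record which embedding model built this db.
    with open(os.path.join(db_dir, "embed_info.json"), "w") as f:
        json.dump({"hf_embedding_model": model_name}, f)

def pick_embedding_model(db_dir: str, requested: str, migrate: bool = False) -> str:
    # Stick with the recorded model unless --migrate_embedding_model=True.
    path = os.path.join(db_dir, "embed_info.json")
    if os.path.exists(path) and not migrate:
        with open(path) as f:
            return json.load(f)["hf_embedding_model"]
    return requested
```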

llmwesee commented 1 year ago

Longer stack trace (the same warnings and traceback repeat several times in the log; one instance of each follows):

```
The model 'OptimizedModule' is not supported for . Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:2 for open-end generation.
Traceback (most recent call last):
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1123, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/utils.py", line 349, in async_iteration
    return await iterator.__anext__()
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/utils.py", line 342, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/utils.py", line 325, in run_sync_iterator_async
    return next(iterator)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/gradio/utils.py", line 694, in gen_wrapper
    yield from f(*args, **kwargs)
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gradio_runner.py", line 2370, in bot
    for res in get_response(fun1, history):
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gradio_runner.py", line 2319, in get_response
    for output_fun in fun1():
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gen.py", line 1954, in evaluate
    for r in run_qa_db(
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gpt_langchain.py", line 2558, in _run_qa_db
    docs, chain, scores, use_docs_planned, have_any_docs = get_chain(**sim_kwargs)
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gpt_langchain.py", line 2926, in get_chain
    docs_with_score = get_docs_with_score(query, k_db, filter_kwargs, db, db_type, verbose=verbose)[
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gpt_langchain.py", line 2662, in get_docs_with_score
    docs_with_score = db.similarity_search_with_score(query, k=k_db, **filter_kwargs)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 323, in similarity_search_with_score
    results = self.__query_collection(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/langchain/utils/utils.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 156, in __query_collection
    return self._collection.query(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/chromadb/api/models/Collection.py", line 230, in query
    return self._client._query(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/chromadb/api/local.py", line 439, in _query
    uuids, distances = self._db.get_nearest_neighbors(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/chromadb/db/clickhouse.py", line 591, in get_nearest_neighbors
    uuids, distances = index.get_nearest_neighbors(embeddings, n_results, ids)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py", line 272, in get_nearest_neighbors
    self._check_dimensionality(query)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py", line 130, in _check_dimensionality
    raise InvalidDimensionException(
chromadb.errors.InvalidDimensionException: Dimensionality of (1024) does not match index dimensionality (768)
```
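The exception itself is generic Chroma behavior: the HNSW index takes its dimensionality from the first vectors added, and any query vector of a different length is rejected. A standalone reproduction, using plain chromadb and independent of h2ogpt (illustrative only):

```python
import chromadb

# An index built from 768-dim vectors (as with instructor-large) rejects
# 1024-dim query vectors (as produced by bge-large-en).
client = chromadb.Client()  # in-memory instance
col = client.create_collection("demo")
col.add(ids=["doc1"], embeddings=[[0.0] * 768], documents=["some chunk"])

try:
    col.query(query_embeddings=[[0.1] * 1024], n_results=1)
except Exception as e:  # chromadb raises InvalidDimensionException here
    print(type(e).__name__, e)
```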

llmwesee commented 1 year ago

> For the first issue, you should pass --langchain_modes=['UserData4', 'LLM'] or other modes there. I could make it easier so that specifying a single mode just uses that mode plus LLM or something, but nominally you should pass the list.
>
> For the second issue, can you give a longer stack trace? If you switch embeddings, it's supposed to read the embedding info stored on disk for that db and stick with that embedding. Here it seems to have switched embeddings for an existing db, which is not the default behavior and should only be attempted when --migrate_embedding_model=True, and even then it shouldn't fail.

OK, but when I run the updated command

```
python generate.py --base_model=meta-llama/Llama-2-13b-chat-hf --score_model=None --user_path=user_path --use_auth_token=True --hf_embedding_model=BAAI/bge-large-en --cut_distance=1000000 --max_seq_len=4096 --migrate_embedding_model=True --langchain_modes=['UserData4', 'LLM']
```

it now throws the following error:

```
Traceback (most recent call last):
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/generate.py", line 16, in <module>
    entrypoint_main()
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/generate.py", line 12, in entrypoint_main
    H2O_Fire(main)
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/utils.py", line 57, in H2O_Fire
    fire.Fire(component=component, command=args)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/hemant/Documents/hemant/h2ogpt_login/.venv/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/hemant/Documents/hemant/weseeanjalidemo/h2ogpt/src/gen.py", line 616, in main
    langchain_modes = ast.literal_eval(os.environ.get("langchain_modes", str(langchain_modes)))
  File "/usr/lib/python3.10/ast.py", line 64, in literal_eval
    node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
  File "/usr/lib/python3.10/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    [UserData4,
    ^
SyntaxError: '[' was never closed
```

pseudotensor commented 1 year ago

Quoting issue: please use

```
--langchain_modes="['UserData4', 'LLM']"
```

or remove the space between the items.
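To see why the quotes matter (a standalone illustration): without them the shell splits the argument at the space, so only the fragment ['UserData4', reaches the program, and ast.literal_eval fails exactly as in the traceback above.

```python
import ast

# Unquoted, the shell splits on the space; only the first fragment arrives:
try:
    ast.literal_eval("['UserData4',")
except SyntaxError as e:
    print(e)  # '[' was never closed

# Quoted (or with the space removed), the whole list parses cleanly:
print(ast.literal_eval("['UserData4', 'LLM']"))  # ['UserData4', 'LLM']
```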

llmwesee commented 1 year ago

Thanks! Now it works. But when querying the documents with the default prompt parameter in the Expert tab, it generates responses like the following most of the time and also does not calculate the score: Screenshot from 2023-08-23 14-14-48

Is this because of the new embedding model? If so, how can I resolve it?