datacrud8 opened 1 year ago
I hit exactly the same issue as @datacrud8.
Has anyone solved it? Thanks in advance.
```
$ chainlit run main.py -w
2023-10-25 19:38:13 - Loaded .env file
2023-10-25 19:38:22 - Your app is available at http://localhost:8000
2023-10-25 19:38:51 - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
2023-10-25 19:38:54 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-10-25 19:39:06 - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
2023-10-25 19:39:07 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
Batches: 100%|██████████| 1/1 [00:02<00:00, 2.15s/it]
2023-10-25 19:39:20 - 4 changes detected
Batches: 100%|██████████| 1/1 [00:00<00:00, 6.11it/s]
2023-10-25 19:41:53 - Number of tokens (513) exceeded maximum context length (512).
2023-10-25 19:41:53 - Number of tokens (514) exceeded maximum context length (512).
2023-10-25 19:41:54 - Number of tokens (515) exceeded maximum context length (512).
2023-10-25 19:41:54 - Number of tokens (516) exceeded maximum context length (512).
2023-10-25 19:41:55 - Number of tokens (517) exceeded maximum context length (512).
2023-10-25 19:41:55 - Number of tokens (518) exceeded maximum context length (512).
```
Hello, can you try a different embeddings model in the ingest.py file, for example hkunlp/instructor-large?
@sudarshan-koirala First of all, thanks for the answer to my question. I changed `model_name="sentence-transformers/all-MiniLM-L6-v2"` (result: case [A]) to the models listed below, including the one you recommended, referring to the multilingual models list at https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models:

- sentence-transformers/multi-qa-MiniLM-L6-cos-v1 ... result is case [A]

But no luck. Did I miss something?
```python
huggingface_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",  # <-- [A] this causes "exceeded maximum context length (512)"
    model_kwargs={"device": "cpu"},
)

huggingface_embeddings = HuggingFaceEmbeddings(
    model_name="hkunlp/instructor-large",  # <-- [B] changed to this, throws the error below
    model_kwargs={"device": "cpu"},
)
```
```
$ time python ingest.py
Downloading (…)c7233/.gitattributes: 100%|██████████| 1.48k/1.48k [00:00<00:00, 3.60MB/s]
Downloading (…)_Pooling/config.json: 100%|██████████| 270/270 [00:00<00:00, 716kB/s]
Downloading (…)/2_Dense/config.json: 100%|██████████| 116/116 [00:00<00:00, 291kB/s]
Downloading pytorch_model.bin: 100%|██████████| 3.15M/3.15M [00:00<00:00, 11.1MB/s]
Downloading (…)9fb15c7233/README.md: 100%|██████████| 66.3k/66.3k [00:00<00:00, 338kB/s]
Downloading (…)b15c7233/config.json: 100%|██████████| 1.53k/1.53k [00:00<00:00, 4.31MB/s]
Downloading (…)ce_transformers.json: 100%|██████████| 122/122 [00:00<00:00, 358kB/s]
Downloading pytorch_model.bin: 100%|██████████| 1.34G/1.34G [01:56<00:00, 11.5MB/s]
Downloading (…)nce_bert_config.json: 100%|██████████| 53.0/53.0 [00:00<00:00, 157kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 2.20k/2.20k [00:00<00:00, 6.51MB/s]
Downloading spiece.model: 100%|██████████| 792k/792k [00:00<00:00, 11.9MB/s]
Downloading (…)c7233/tokenizer.json: 100%|██████████| 2.42M/2.42M [00:01<00:00, 2.36MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 2.41k/2.41k [00:00<00:00, 7.06MB/s]
Downloading (…)15c7233/modules.json: 100%|██████████| 461/461 [00:00<00:00, 1.37MB/s]
Traceback (most recent call last):
  File "/home/ctxwing/docker-ctx/lancer/basic/llama2-chat-with-documents/ingest.py", line 82, in

real    2m12.113s
user    0m14.215s
sys     0m9.627s
```
As per the new updates, define it like this:

```python
llm = CTransformers(
    model=model_path,
    model_type=model_type,
    config={
        'max_new_tokens': 1024,
        'temperature': 0.7,
        'context_length': 4096,
    },
)
```
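For context: the "Number of tokens (N) exceeded maximum context length (512)" warning appears when the tokenized prompt is longer than the model's context window (512 by default in ctransformers), so raising `context_length` is the fix only when the prompt then actually fits. A minimal sketch of the budget arithmetic — the whitespace-based `count_tokens` and the sample numbers are illustrative assumptions, not the real tokenizer:

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: ~1 token per whitespace-separated word.
    return len(text.split())

def fits_context(prompt: str, context_length: int, max_new_tokens: int) -> bool:
    # The prompt tokens plus the tokens to be generated must both fit in the window.
    return count_tokens(prompt) + max_new_tokens <= context_length

# A long retrieved context overflows the default 512-token window:
long_prompt = "word " * 600
print(fits_context(long_prompt, context_length=512, max_new_tokens=256))   # False
# With context_length=4096 the same prompt fits:
print(fits_context(long_prompt, context_length=4096, max_new_tokens=256))  # True
```

Note that only the window size matters for the warning; increasing `max_new_tokens` alone makes the overflow worse, not better, since generated tokens also consume the window.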
Hi, I'm trying to build this app locally and used the same model, llama-2-7b-chat.ggmlv3.q8_0.bin. When I run the app, the UI shows a random message like the one you showed, but the console prints:

```
Number of tokens (755) exceeded maximum context length (512).
Number of tokens (756) exceeded maximum context length (512).
Number of tokens (757) exceeded maximum context length (512).
```

So I increased max_new_tokens to 2048, increased n_ctx, and added truncate=True, but none of them fixed the issue. I changed the model as well; still the same problem.
Does anyone know a solution for this issue?
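Besides raising the LLM's context window, another lever worth checking is the chunk size used at ingest time: the retrieved chunks get stuffed into the prompt, so smaller chunks keep the total prompt under the window. A rough character-based splitter sketch to illustrate the idea — the sizes, overlap, and function name are illustrative assumptions, not the repo's actual settings (which would normally live in a LangChain text-splitter configuration):

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Slide a fixed-size window over the text; the overlap preserves context
    # across chunk boundaries (same idea as LangChain's character splitters).
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 1200
chunks = split_text(doc, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 3 [500, 500, 300]
```

With smaller `chunk_size` (and fewer retrieved chunks, e.g. a lower `k` in the retriever), the combined prompt is far less likely to blow past a 512-token window even without changing the model config.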