Open abhishekrai43 opened 1 year ago
```
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: db
CUDA extension not installed.
The safetensors archive passed at C:\Users\Administrator/.cache\huggingface\hub\models--TheBloke--Llama-2-7b-Chat-GPTQ\snapshots\b7ee6c20ac0bba85a310dc699d6bb4c845811608\gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata.
skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
```
My config:

```yaml
llm: gptq

gptq:
  model: TheBloke/Llama-2-7b-Chat-GPTQ
  model_file: C:\Users\Administrator\Downloads\gptq_model-4bit-128g.safetensors
  device: 0
  pipeline_kwargs:
    max_new_tokens: 256

download: false

host: localhost
port: 5000
auth: false

chroma:
  persist_directory: db
  chroma_db_impl: duckdb+parquet
  anonymized_telemetry: false

retriever:
  search_kwargs:
    k: 4
```

Could you please guide me a little on how to get this "lightning fast" performance? I am on an AWS EC2 instance with a Tesla T4. If there is no option other than moving to Linux, I will do that as well, but it would increase my workload manyfold, so please tell me if there is a way to get this done on the same machine.
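The "CUDA extension not installed." and "No module named 'triton'" lines in the log suggest the GPTQ kernels are falling back to a slow path. As a first diagnostic (a minimal sketch using only the standard library; the package names are the ones mentioned in the log, and note that triton ships no official Windows wheels), you can check which of them are importable in the same venv that runs chatdocs:

```python
import importlib.util

def installed(name: str) -> bool:
    """Return True if a package can be imported in this environment."""
    return importlib.util.find_spec(name) is not None

# Packages mentioned in the log above; 'triton' has no official
# Windows wheels, so it is expected to be missing on Windows.
for name in ("torch", "triton", "auto_gptq"):
    print(f"{name}: {'installed' if installed(name) else 'missing'}")
```

If `torch` is present but the CUDA extension warning persists, the auto-gptq build itself likely lacks the compiled CUDA kernels for your platform.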
I got the same error:

```
File "/media/vmirea/NTFS_8TB/projects/chatdocs/venv/lib/python3.11/site-packages/chromadb/db/index/hnswlib.py", line 240, in get_nearest_neighbors
    raise NoIndexException(
chromadb.errors.NoIndexException: Index not found, please create an instance before querying
```

```
(venv) (base) vmirea@vmirea-Z390-GAMING-SLI:/media/vmirea/NTFS_8TB/projects/chatdocs$ chatdocs ui
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: db
```
Not sure what is wrong here. The `db` and `index` folders are in their default locations. I even copied them to every folder possible, but it still gives the same error.
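For what it's worth, `NoIndexException` from Chroma's duckdb+parquet backend usually means the persisted store exists but no index has been built in it, e.g. because no documents were ingested before the UI was launched, or because the UI was started from a different working directory. A quick sketch to check (the `db`/`index` layout assumed here is the default from the config above; treat the exact paths as assumptions):

```python
import os

def has_persisted_index(persist_dir: str = "db") -> bool:
    """Check whether a Chroma duckdb+parquet store has a built index.

    Assumes the default layout where HNSW index files live under
    <persist_dir>/index; a missing or empty folder means nothing
    was ingested yet.
    """
    index_dir = os.path.join(persist_dir, "index")
    return os.path.isdir(index_dir) and any(os.scandir(index_dir))

print(has_persisted_index())
```

If this prints `False`, re-run the ingestion step (e.g. `chatdocs add <directory>`) from the same working directory before starting `chatdocs ui`, so the index is created in the same `db` folder the UI reads from.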