zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks
https://privategpt.dev
Apache License 2.0

Deleting 7.5 MB .txt file takes 7 hours #1826

Open Kanishk-Kumar opened 5 months ago

Kanishk-Kumar commented 5 months ago

System specs: CPU: Intel(R) Core(TM) i9-14900K; GPU: GeForce RTX 4070 Ti; RAM: 128 GB

Lookup and inference speed is similar to ChatGPT, i.e. quite fast. But deleting a 7.5 MB .txt file takes 7 hours, while ingesting the same file takes ~28 minutes. I tried to reproduce this in the log below using a 958 KB .csv file that has a single column with clean text (web articles) in each row:

14:33:31.676 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default', 'ollama']
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
14:33:33.455 [INFO    ] private_gpt.components.llm.llm_component - Initializing the LLM in mode=ollama
14:33:33.800 [INFO    ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=ollama
14:33:33.810 [INFO    ] llama_index.core.indices.loading - Loading all indices.
14:33:33.810 [INFO    ] private_gpt.components.ingest.ingest_component - Creating a new vector store index
Parsing nodes: 0it [00:00, ?it/s]
Generating embeddings: 0it [00:00, ?it/s]
14:33:33.890 [INFO    ]         private_gpt.ui.ui - Mounting the gradio UI, at path=/
14:33:33.975 [INFO    ]             uvicorn.error - Started server process [3957544]
14:33:33.975 [INFO    ]             uvicorn.error - Waiting for application startup.
14:33:33.975 [INFO    ]             uvicorn.error - Application startup complete.
14:33:33.975 [INFO    ]             uvicorn.error - Uvicorn running on http://0.0.0.0:5008 (Press CTRL+C to quit)
14:33:35.218 [INFO    ]            uvicorn.access - 192.168.1.2:34448 - "GET / HTTP/1.0" 200
14:33:35.315 [INFO    ]            uvicorn.access - 192.168.1.2:34450 - "GET /info HTTP/1.0" 200
14:33:35.376 [INFO    ]            uvicorn.access - 192.168.1.2:34464 - "GET /theme.css HTTP/1.0" 200
14:33:35.550 [INFO    ]            uvicorn.access - 192.168.1.2:34466 - "POST /run/predict HTTP/1.0" 200
14:33:35.612 [INFO    ]            uvicorn.access - 192.168.1.2:34476 - "POST /queue/join HTTP/1.0" 200
14:33:35.674 [INFO    ]            uvicorn.access - 192.168.1.2:34480 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
14:34:04.653 [INFO    ]            uvicorn.access - 192.168.1.2:53264 - "POST /upload HTTP/1.0" 200
14:34:04.711 [INFO    ]            uvicorn.access - 192.168.1.2:53272 - "POST /queue/join HTTP/1.0" 200
14:34:04.713 [INFO    ] private_gpt.server.ingest.ingest_service - Ingesting file_names=['cleaned_50_rows.csv']
Parsing nodes: 100%|██████████| 1/1 [00:00<00:00,  8.40it/s]
14:34:04.847 [INFO    ]            uvicorn.access - 192.168.1.2:53282 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
Generating embeddings: 100%|██████████| 5259/5259 [00:22<00:00, 236.97it/s]
Generating embeddings: 0it [00:00, ?it/s]
Generating embeddings: 0it [00:00, ?it/s]
Generating embeddings: 0it [00:00, ?it/s]
14:34:55.493 [INFO    ] private_gpt.server.ingest.ingest_service - Finished ingestion file_name=['cleaned_50_rows.csv']
14:34:55.587 [INFO    ]            uvicorn.access - 192.168.1.2:52254 - "POST /queue/join HTTP/1.0" 200
14:34:55.644 [INFO    ]            uvicorn.access - 192.168.1.2:52262 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
14:34:55.706 [INFO    ]            uvicorn.access - 192.168.1.2:52266 - "POST /queue/join HTTP/1.0" 200
14:34:55.762 [INFO    ]            uvicorn.access - 192.168.1.2:52272 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
14:35:23.746 [INFO    ]            uvicorn.access - 192.168.1.2:53876 - "POST /queue/join HTTP/1.0" 200
14:35:23.807 [INFO    ]            uvicorn.access - 192.168.1.2:53884 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
14:35:29.932 [INFO    ]            uvicorn.access - 192.168.1.2:34876 - "POST /queue/join HTTP/1.0" 200
14:35:29.980 [INFO    ] private_gpt.server.ingest.ingest_service - Deleting the ingested document=65b3bfc9-aef9-4ddd-9c78-e0468f9dc063 in the doc and index store
14:35:30.001 [INFO    ]            uvicorn.access - 192.168.1.2:34888 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
14:36:42.276 [INFO    ]            uvicorn.access - 192.168.1.2:45396 - "POST /queue/join HTTP/1.0" 200
14:36:42.338 [INFO    ]            uvicorn.access - 192.168.1.2:45410 - "GET /queue/data?session_hash=5w34ifupiey HTTP/1.0" 200
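
For reference, the wall-clock durations in this log can be read off the timestamps. The sketch below is only an illustration: the helper name `elapsed_seconds` is made up here, and it assumes the request logged at 14:36:42.276 marks the end of the deletion, since the log has no explicit "finished deleting" line.

```python
from datetime import datetime

# Hypothetical helper (not part of PrivateGPT): difference between two
# log timestamps of the form HH:MM:SS.mmm taken from the same day.
def elapsed_seconds(start: str, end: str) -> float:
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

# "Ingesting file_names=..." -> "Finished ingestion ..."
ingest = elapsed_seconds("14:34:04.713", "14:34:55.493")

# "Deleting the ingested document=..." -> next UI request in the log.
# Assumption: that request is issued once the deletion has returned.
delete = elapsed_seconds("14:35:29.980", "14:36:42.276")

print(f"ingestion: {ingest:.1f} s, deletion (approx.): {delete:.1f} s")
```

So for this 958 KB file, deletion appears to take longer than ingestion, even though the "Generating embeddings" bar itself reports only 22 seconds.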

Is it supposed to take this long? I see full GPU usage during the first "Generating embeddings" pass; during the subsequent "Generating embeddings" passes, both GPU and CPU usage stay below 3%. VRAM usage is 350 MB at most. The settings I'm using (settings-ollama.yaml):

server:
  env_name: ${APP_ENV:ollama}
  port: 5008

llm:
  mode: ollama
  max_new_tokens: 32768
  context_window: 32768
  temperature: 0.1

rag:
  similarity_top_k: 20

embedding:
  mode: ollama
  embed_dim: 768
  ingest_mode: simple

ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
  request_timeout: 300.0

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant

Also tried:

embedding:
  mode: ollama
  embed_dim: 768
  ingest_mode: pipeline
  count_workers: 32

And:

server:
  env_name: ${APP_ENV:ollama}
  port: 5008

llm:
  mode: ollama
  max_new_tokens: 32768
  context_window: 32768
  temperature: 0.1

rag:
  similarity_top_k: 20

embedding:
  mode: huggingface
  ingest_mode: pipeline
  embed_dim: 384
  count_workers: 32

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5

ollama:
  llm_model: mistral
  api_base: http://localhost:11434
  request_timeout: 300.0

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant

Same wait time in all cases. (Using a model with fewer dimensions is fast, but the subsequent "Generating embeddings" passes are still slow.)
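
A rough back-of-the-envelope comparison of the two data points in this report, as a sketch only. It assumes the deletion of the 958 KB .csv took about 72.3 s (the gap in the log after the "Deleting the ingested document" line) and treats 7.5 MB as 7500 KB. The two files have different formats, so their node counts may not scale with file size, and two points cannot establish a trend.

```python
import math

# Figures taken from the report above; the 72.3 s value is an estimate
# read from the log timestamps, not a logged duration.
small_size_kb = 958.0
small_delete_s = 72.3
large_size_kb = 7.5 * 1000      # 7.5 MB .txt file
large_delete_s = 7 * 3600       # "7 hours"

size_ratio = large_size_kb / small_size_kb
time_ratio = large_delete_s / small_delete_s

# If deletion time scaled as size**k, this would be the implied k.
implied_exponent = math.log(time_ratio) / math.log(size_ratio)

print(f"size ratio: {size_ratio:.1f}x")
print(f"time ratio: {time_ratio:.0f}x")
print(f"implied exponent: {implied_exponent:.2f}")
```

An implied exponent well above 1 would suggest that deletion time grows faster than linearly with the amount of ingested data, which may help narrow down where the time is going.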

Kanishk-Kumar commented 5 months ago

Update: I get about the same wait time for ingestion, and exactly the same for deletion, when both the llm and embedding modes are set to "mock". I also tried setting only the llm mode to mock, which didn't help. I have also tried a different database: poetry install --extras "llms-ollama ui vector-stores-postgres embeddings-ollama storage-nodestore-postgres" and faced the same issues there.

17:28:14.186 [INFO    ] private_gpt.server.ingest.ingest_service - Ingesting file_names=['cleaned_50_rows.csv']
Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]
17:28:14.237 [INFO    ]            uvicorn.access - 192.168.1.2:51586 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
Parsing nodes: 100%|██████████| 1/1 [00:00<00:00,  8.14it/s]
Generating embeddings: 100%|██████████| 5259/5259 [00:00<00:00, 303480.11it/s]
17:28:15.617 [INFO    ] private_gpt.components.ingest.ingest_component - Saving 1 files (1 documents / 5259 nodes)
17:28:48.822 [INFO    ] private_gpt.server.ingest.ingest_service - Finished ingestion file_name=['cleaned_50_rows.csv']
17:28:48.903 [INFO    ]            uvicorn.access - 192.168.1.2:34634 - "POST /queue/join HTTP/1.0" 200
17:28:48.959 [INFO    ]            uvicorn.access - 192.168.1.2:34646 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
17:28:49.022 [INFO    ]            uvicorn.access - 192.168.1.2:34650 - "POST /queue/join HTTP/1.0" 200
17:28:49.077 [INFO    ]            uvicorn.access - 192.168.1.2:34652 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
17:29:09.919 [INFO    ]            uvicorn.access - 192.168.1.2:46896 - "POST /queue/join HTTP/1.0" 200
17:29:09.974 [INFO    ]            uvicorn.access - 192.168.1.2:46900 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
17:29:11.599 [INFO    ]            uvicorn.access - 192.168.1.2:46908 - "POST /queue/join HTTP/1.0" 200
17:29:11.615 [INFO    ] private_gpt.server.ingest.ingest_service - Deleting the ingested document=3d7949bf-84d4-42c8-8ad9-b3f48c8d10e1 in the doc and index store
17:29:11.658 [INFO    ]            uvicorn.access - 192.168.1.2:46922 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
17:30:31.862 [INFO    ]            uvicorn.access - 192.168.1.2:38976 - "POST /queue/join HTTP/1.0" 200
17:30:31.925 [INFO    ]            uvicorn.access - 192.168.1.2:38992 - "GET /queue/data?session_hash=skizaephyc HTTP/1.0" 200
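
To make the point about this run explicit, here is a small sketch that breaks the log above into phases. The regex and helper names are illustrative only, the excerpt is copied from the log lines above, and the end of the deletion is assumed to be the first request logged after it (17:30:31.862), since no completion line is logged.

```python
import re
from datetime import datetime

# Excerpt copied from the log above (only the lines that mark phase changes).
LOG = """\
17:28:14.186 [INFO    ] private_gpt.server.ingest.ingest_service - Ingesting file_names=['cleaned_50_rows.csv']
17:28:15.617 [INFO    ] private_gpt.components.ingest.ingest_component - Saving 1 files (1 documents / 5259 nodes)
17:28:48.822 [INFO    ] private_gpt.server.ingest.ingest_service - Finished ingestion file_name=['cleaned_50_rows.csv']
17:29:11.615 [INFO    ] private_gpt.server.ingest.ingest_service - Deleting the ingested document=3d7949bf-84d4-42c8-8ad9-b3f48c8d10e1 in the doc and index store
17:30:31.862 [INFO    ]            uvicorn.access - 192.168.1.2:38976 - "POST /queue/join HTTP/1.0" 200
"""

# timestamp, [LEVEL], logger name, " - ", message
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[\w+\s*\]\s+\S+ - (.*)$")

def parse(log: str) -> list[tuple[datetime, str]]:
    """Return (timestamp, message) pairs for every timestamped log line."""
    events = []
    for line in log.splitlines():
        m = LINE.match(line)
        if m:
            events.append((datetime.strptime(m.group(1), "%H:%M:%S.%f"), m.group(2)))
    return events

def first_time(events, prefix: str) -> datetime:
    """Timestamp of the first message starting with the given prefix."""
    return next(t for t, msg in events if msg.startswith(prefix))

events = parse(LOG)
t_ingest = first_time(events, "Ingesting")
t_saving = first_time(events, "Saving")
t_done = first_time(events, "Finished ingestion")
t_delete = first_time(events, "Deleting")
t_after = events[-1][0]  # assumed end of deletion (next logged request)

before_saving = (t_saving - t_ingest).total_seconds()
saving = (t_done - t_saving).total_seconds()
deleting = (t_after - t_delete).total_seconds()

print(f"parse + embed: {before_saving:.1f} s")
print(f"saving:        {saving:.1f} s")
print(f"deleting:      {deleting:.1f} s")
```

Read this way, parsing and embedding account for only a small part of the ingestion time in this run; most of it falls after the "Saving 1 files" line, and deletion takes longer still. That is consistent with the wait being the same regardless of embedding mode.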