IntelLabs / RAGFoundry

Framework for enhancing LLMs for RAG tasks using fine-tuning.
https://intellabs.github.io/RAGFoundry/
Apache License 2.0

No Data Found in Qdrant: Troubleshooting Empty Index #6

Closed: yoyoyo2025 closed this issue 1 week ago

yoyoyo2025 commented 3 weeks ago

I wrote a new YAML config for retrieval:

# Retrieval Augmentation Configuration
name: retrieval_augmentation_experiment
cache: false
output_path: .

steps:

However, when I open http://127.0.0.1:6333/collections/train, it shows:

{
  "result": {
    "status": "green",
    "optimizer_status": "ok",
    "indexed_vectors_count": 0,
    "points_count": 0,
    "segments_count": 8,
    "config": {
      "params": {
        "vectors": { "size": 768, "distance": "Dot", "on_disk": false },
        "shard_number": 1,
        "replication_factor": 1,
        "write_consistency_factor": 1,
        "on_disk_payload": true
      },
      "hnsw_config": {
        "m": 16,
        "ef_construct": 100,
        "full_scan_threshold": 10000,
        "max_indexing_threads": 0,
        "on_disk": false
      },
      "optimizer_config": {
        "deleted_threshold": 0.2,
        "vacuum_min_vector_number": 1000,
        "default_segment_number": 0,
        "max_segment_size": null,
        "memmap_threshold": null,
        "indexing_threshold": 20000,
        "flush_interval_sec": 5,
        "max_optimization_threads": null
      },
      "wal_config": { "wal_capacity_mb": 32, "wal_segments_ahead": 0 },
      "quantization_config": null
    },
    "payload_schema": {}
  },
  "status": "ok",
  "time": 0.000126005
}

I've noticed that there appears to be no data stored in my Qdrant instance, and I'm unsure of the cause. Could you provide some guidance on how to troubleshoot this issue? Thank you very much.
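For anyone hitting the same symptom, a quick way to read a collection-info response like the one above is to pull out the few fields that matter. This is only a minimal sketch: `summarize_collection` is a hypothetical helper, not part of RAGFoundry or Qdrant, and the JSON here is a trimmed copy of the response posted in this issue.

```python
import json


def summarize_collection(info: dict) -> dict:
    """Extract the fields relevant to an 'is my collection empty?' check."""
    result = info["result"]
    vectors = result["config"]["params"]["vectors"]
    return {
        "points_count": result["points_count"],
        "indexed_vectors_count": result["indexed_vectors_count"],
        "vector_size": vectors["size"],
        "distance": vectors["distance"],
        # A collection with zero points holds no data at all.
        "is_empty": result["points_count"] == 0,
    }


# Trimmed copy of the response from the issue (only the fields used here).
response_text = """
{
  "result": {
    "status": "green",
    "indexed_vectors_count": 0,
    "points_count": 0,
    "config": {
      "params": {
        "vectors": {"size": 768, "distance": "Dot", "on_disk": false}
      }
    }
  },
  "status": "ok"
}
"""

summary = summarize_collection(json.loads(response_text))
print(summary)
```

Here `points_count` is 0, so the collection exists (status is "green") but nothing has been uploaded to it. `points_count` is the field to watch, since `indexed_vectors_count` can legitimately stay low for small collections.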

danielfleischer commented 3 weeks ago

Hi, you need to put some data into Qdrant first; it's a DB, so the collection stays empty until you index something into it. See my comment regarding indexing a corpus.

If you want a simpler example, without the use of a corpus, see the Pubmed Tutorial.
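To make "put some data in Qdrant" concrete, here is a rough sketch of what an upsert against Qdrant's REST points endpoint looks like. It is not the indexing path RAGFoundry itself uses, and several things are assumptions for illustration: the helper name `build_upsert_request`, the payload key `content`, and the all-zero placeholder vector (real vectors would come from an embedding model). The collection name `train`, the port, and the vector size 768 are taken from the issue above. The request is only built, not sent.

```python
import json
import urllib.request


def build_upsert_request(base_url, collection, points, expected_size=768):
    """Build (but do not send) a PUT request that upserts points into a collection.

    Hypothetical helper for illustration; expected_size mirrors the
    collection's configured vector size.
    """
    for point in points:
        if len(point["vector"]) != expected_size:
            raise ValueError(
                f"point {point['id']} has {len(point['vector'])} dims, "
                f"expected {expected_size}"
            )
    body = json.dumps({"points": points}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/collections/{collection}/points?wait=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder point: the zero vector and the "content" payload key are
# assumptions, standing in for a real embedding and a real passage.
points = [
    {"id": 1, "vector": [0.0] * 768, "payload": {"content": "example passage"}}
]

req = build_upsert_request("http://127.0.0.1:6333", "train", points)
print(req.get_method(), req.full_url)

# To actually send it against a running Qdrant instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

After a successful upsert, reopening http://127.0.0.1:6333/collections/train should show a non-zero `points_count`.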