AI-Northstar-Tech / vector-io

The only Vector tooling you'll need. Star the repo and look out for an email to try out a brand new Vector Data Exploration demo! Use the universal VDF format for vector datasets to easily export and import data from all vector databases, and re-embed it using any model
https://tryvector.io
Apache License 2.0

Qdrant import / collection not working #93

Open tkreuder opened 4 months ago

tkreuder commented 4 months ago

Details

curl -L -X POST 'http://localhost:6333/collections/my_imported_collection/points/search' -H 'Content-Type: application/json' \
  --data-raw '{
  "vector": [0.69,0.69,0.59,0.74,0.18,0.44,0.91,0.76,0.35,0.8,0.31,0.61,0.62,0.49,0.85,0.41,0.26,0.43,0.09,0.34,0.41,0.07,0.16,0.75,0.24,0.87,0.89,0.29,0.43,0.78,0.55,0.78,0.97,0.28,0.68,0.44,0.01,0.41,0.63,0.64,0.41,0.69,0.36,0.92,0.29,0.52,0.37,0.49,0.83,0.53,0.24,0.04,0.78,0.7,0.04,0.48,0.81,
0.33,0.39,0.24,0.26,0.68,0.26,0.02,0.69,0.38,0.76,0.67,0.65,0.83,0.28,0.98,0.59,0.49,0.67,0.42,0.4,0.11,0.1,0.94,0.89,0.45,0.73,0.87,0.76,0.7,0.06,0.45,0.95,0.27,0.3,0.84,0.11,0.06,0.6,0.94,0.56,0.68,0.99,0.33,0.22,0.71,0.49,0.6,0.84,0.63,0.71,0.1,0.96,0.59,0.9,0.92,0.5,0.03,0.65,0.39,0.96,0.72,0.87,0.02,0.38,0.61,0.91,0.34,0.95,0.47,0.82,0.81,0.47,0.31,0.72,0.27,0.88,0.47,0
.51,0.04,0.5,0.99,0.09,0.77,0.13,0.5,0.78,0.33,0.26,0.28,0.1,0.27,0.27,0.92,0.15,0.49,0.88,0.55,0.4,0.2,0.61,0.6,0.88,0.29,0.48,0.1,0.41,0.21,0.41,0.27,0.73,0.25,0.52,0.89,0.94,0.28,0.16,0.17,0.03,0.34,0.85,0.52,0.83,0.38,0.17,0.57,0.28,0.29,0.94,0.89,0.46,0.81,0.93,0.88,0.07,0.62,0.6,0.75,0.63,0.63,0.37,0.2,0.41,0.15,0.34,0.39,0.96,0.63,0.02,0.53,0.23,0.2,0.11,0.09,0.4,0.42
,0.64,0.68,0.78,0.13,0.45,0.47,0.32,0.95,0.66,0.41,0.02,0.22,0.16,0.1,0.51,0.31,0.3,0.41,0.85,0.36,0.85,0.27,0.26,0.96,0.73,0.62,0.99,0.96,0.47,0.3,0.78,0.53,0.46,0.24,0.39,0.58,0.64,0.4,0.68,0.02,0.86,0.04,0.38,0.11,0.69,0.9,0.4,0.88,0.14,0.96,0.65,0.74,0.32,0.59,0.83,0.22,0.91,0.55,0.7,0.87,0.23,0.19,0.94,0.98,0.16,0.86,0.76,0.18,0.43,0.18,0.69,0.07,0.31,0.52,0.93,0.91,0.9
7,0.32,0.15,0.98,0.23,0.36,0.48,0.18,0.56,0.77,0.21,0.87,0.65,0.1,0.3,0.52,0.57,0.9,0.65,0.62,0.94,0.96,0.33,0.24,0.89,0.02,0.75,0.77,0.1,0.75,0.84,0.49,0.15,0.19,0.37,0.12,0.2,0.56,0.99,0.44,0.74,0.08,0.52,0.36,0.07,0.07,0.78,0.8,0.39,0.79,0.58,0.16,0,0.46,0.38,0.05,0.26,0.18,0.27,0.21,0.57,0.07,0.86,0.54,0.31,0.25,0.14,0.56,0.14,0.98,0.06,0.14,0.76,0.6,0.93,0.58,0.9,0.18,0
.46,0.33,0.27,0.34,0.89,0.27,0.69,0.89,0.41,0.05,0.07,0.28,0.28,0.17,0.88,0.62,0.81,0.92,0.05,0.13,0.11,0.82,0.23,0.96,0.88,0.07,0.71,0.12,0.38,0.11,0.1,0.94,0.63,0.38,0.25,0.54,0.85,0.93,0.65,0.33,0.52,0.6,0.99,0.24,0.47,0.09,0.94,0.65,0.44,0.52,0.35,0.24,0.66,0.59,0.59,0.68,0.37,0.3,0.22,0.28,0.25,0.4,0.6,0.98,0.94,0.88,0.33,0.94,0.59,0.2,0.48,0.96,0.52,0.56,0.13,0.1,0.05,
0.14,0.97,0.14,0.35,0.67,0.36,0.22,0.58,0.29,0.85,0.07,0.18,0.77,0.5,0.13,0.51,0.11,0.92,0.53,0.34,0.85,0.63,0.7,0.07,0.31,0.12,0.64,0.47,0.56,0.17,0.54,0.68,0.95,0.2,0.3,0.12,0.42,0.83,0.42,0.23,0.34,0.7,0.66,0.77,0.15,0.97,0.22,0.26,0.6,0.99,0.67,0.1,0.82,0.03,0.72,0.8,0.88,0.59,0.71,0.77,0.65,0.88,0.59,0.6,0.24,0.22,0.08,0.17,0.69,0.34,0.63,0.24,0.94,0.17,0.12,0.08,0.12,0
.18,0.44,0.16,0.25,0.91,0.41,0.72,0.66,0.63,0.86,0.94,0.1,0.91,0.52,0.17,0.13,0.23,0.97,0.29,0.12,1,0.95,0.7,0.31,0.09,0.68,0.18,0.82,0.88,0.11,0.43,0.75,0.51,0.77,0.41,0.87,0.3,0.39,0.99,0.27,0.44,0.42,0.84,0.86,0.77,0.28,0.26,0.62,0.99,0.32,0.41,0.45,0.95,0.85,0.13,0.28,0.18,0.1,0.54,0.37,0.86,0.86,0.79,0.36,0.64,0.98,0.79,0.07,1,0.87,0.4,0.04,0.24,0.78,0.91,0.41,0.98,0.63
,0.73,0.49,0.35,0.55,0.87,0.24,0.26,0.14,0.67,0.1,0.97,0.09,0.44,0.04,0.87,0.24,0.4,0.49,0.32,0.5,0.44,0.91,0.57,0.67,0.39,0.22,0.81,0.28,0.14,0.73,0.93,0.27,0.63,0.67,0.06,0.89,0.8,0.82,0.09,0.81,0.32,0.16,0.19,0.71,0.03,0.84,0.19,0.89,0.83,0.52,0.61,0.79,0.2,0.54,0.64,0.13,0.46,0.24,0.15,0.84,0.31,0.43,0.52,0.4,0.49,0.32,0.63,0.05,0.69,0.67,0.48,0.94,0.89,0.02,0.28,0.4,0.3
4,0.34,0.03,0.02,0.65,0.85,0.78,0.72,0.47,0.67,0.32,0.03,0.11,0.27,0.8,0.59,0.58,0.83,0.92,0.87,0.95,0.61,0.06,0.72,0.13,0.63,0.47,0.59,0.1,0.95,0.75,0.37,0.63,0.99,0.58,0.1,0.37,0.8,0.36,0.05,0.92,0.76,0.41,0.99,0.03,0.06,0.09,0.28,0.86,0.99,0.69,0.52,0.62,0.79,0.33,0.12,0.7,0.58,0.56,0.02,0.46,0.77,0.88,0.09,0.37,0.51,0.87,0.45,0.5,0.45,0.42,0.43,0.64,0.66,0.13,0.94,0.35,0
.35,0.62,0.61,0.64,0.8,0.18,0.59,0.77,0.02,0.54,0.94,0.56,0.26,0.97,0.55,0.71,0.08,0.89,0.82,0.26,0.24,0.18,0.27,0.77,0.79,0.18,0.03,0.68,0.56,0.63,0.08,0.15,0.76,0.91,0.41,0.94,0.44,0.78,0.95,0.52,0.68,0.45,0.16,0.29,0.08,0.86,0.26,0.65,0.21,0.39,0.77,0.45,0.19,0.81,0.48,0.9,0.55,0.42,0.03,0.37,0.72,0.99,0.4,0.01,0.86,0.96,0.69,0.67,0.12,0.85,0.79,0.95,0.19,0.46,0.57,0.76,0
.82,0.63,0.51,0.86,0.29,0.05,0.24,0.42,0.99,0.13,0.55,0.73,0.63,0.24,0.94,0.67,0.34,0.79,0.47,0.89,0.98,0.98,0.99,0.32,0.93,0.42,0.95,0.74,0.64,0.33,0.98,0.64,0.35,0.91,0.4,0.9,0.97,0.22,0.82,0.65,0.47,0.61,0.94,0.24,0.42,0.64,0.24,0.63,0.17,0.39,0.42,0.31,0.84,0.34,0.41,0.78,0.13,0.02,0.63,0.87,0.58,0.37,0.62,0.31,0.9,0.02,0.47,0.12,0.39,0.84,0.96,0.55,0.24,0.21,0,0.96,0.56
,0.31,0.85,0.28,0.61,0.35,0.76,0.93,0.37,0.73,0.07,0.08,0.36,0.21,0.77,0.75,0.53,0.45,0.21,0.79,0.57,0.67,0.74,0.39,0.97,0.46,0.41,0.24,0.85,0,0.51,0.19,0.4,0.41,0.48,0.64,0.86,0.75,0.79,0.65,0.4,0.59,0.69,0.44,0.77,0.85,0.95,0.06,0.94,0.61,0.2,0.19,0.21,0.32,0.38,0.99,0.58,0.38,0.16,0.9,0.2,0.18,0.16,0.78,0.41,0.22,0.57,0.88,0.78,0.56,0.91,0.12,0.14,0.74,0.97,0.68,0.32,0.8,
0.56,0.93,0.15,0.4,0.93,0.33,0.28,0.67,0.5,0.93,0.89,0.54,0.79,0.87,0.74,0.25,0.66,0.42,0.03,0.24,0.06,0.36,0.54,0.41,0.05,0.05,0.39,0.24,0.15,0.89,0.67,0.84,0.47,0.3,0.9,0.79,0.58,0.1,0.95,0.49,0.78,0.75,0.93,0.42,0.53,0.17,0.12,0.01,0.34,0.65,0.46,0.32,0.04,0.15,0.03,0.6,0.29,0.07,0.87,0.29,0.6,0.6,0.15,0.14,0.29,0.52,0.34,0.58,0.44,0.85,0.23,0.08,0.26,0.21,0.02,0.4,0.89,0
.8,0.66,0.65,0.28,0.56,0.14,0.36,0.61,0.22,0.85,0.32,0.83,0.44,0.44,0.72,0.85,0.01,0.23,0.51,0.65,0.22,0.58,0.99,0.31,0.34,0.85,0.45,0.78,0.49,0.94,0.36,0.95,0.15,0.86,0.96,0.01,0.72,0.08,0.92,0.22,0.57,0.39,0.31,0.47,1,0.34,0.29,0.82,0.31,0.44,0.07,0.5,0.25,0.25,0.77,0.03,0.66,0.96,0.46,0.52,0.1,0.52,0.92,0.61,0.07,0.24,0.81,0.73,0.81,0,0.12,0.92,0.9,0.37,0.49,0.4,0.98,0.58
,0.35,0.59,0.11,0.05,0.17,0.61,0.85,0.07,0.33,0.5,0.97,0.87,0.22,0.63,1,0.1,0.28,0.15,0.49,0.5,0.25,0.68,0.8,0.24,0,0.05,0.59,0.54,0.88,0.49,0.17,0.36,0.03,0.46,0.03,0.31,0.46,0.39,0.55,0.05,0.92,0.42,0.37,0.87,0.49,0.52,0.21,0.2,0.86,0.49,0.75,0.97,0.65,0.57,0.74,0.72,0.77,0.51,0.04,0.02,0.08,0.52,0.5,0.66,0.39,0.85,0.63,0.11,0.37,0.51,0.85,0.03,0.19,0.65,0.65,0.33,0.28,0.7
1,0.68,0.88,0.31,0.99,0.52,0.08,0.83,0.59,0.22,0.31,0.86,0.44,0.59,0.4,0.61,0.33,0.3,0.23,0.83,0.02,0.8,0.98,0.11,0.35,0.48,0.45,0.58,0.23,0.18,0.23,0.36,0.16,0.35,0.88,0.25,0.39,0.9,0.03,0.81,0.69,0.2,0.36,0.61,0.85,0.17,0.43,0.5,0.57,0.95,0.94,0.67,0.27,0.42,0.76,0.63,0.41,0.46,0.99,0.37,0.56,0.21,0.32,0.98,0.7,0.7,0.32,0.93,0.34,0.84,0.34,0.94,0.18,0.31,0.07,0.59,0.98,0.9
1,0.3,0.06,0.17,0.73,0.72,0.93,0.5,0.04,0.22,0.54,0.06,0.21,0.09,0.77,0.67,0.94,0.39,0.66,0.93,0.03,0.11,0.92,0.47,0.81,0.38,0.03,0.06,0.84,0.66,0.01,0.82,0.68,0.86,0.97,0.15,0.99,0.29,0.56,0.41,0.4,0.11,0.65,0.34,0.47,0.1,0.22,0.56,0.32,0.74,0.18,0.15,0.42,0.69,0.18,0.17,0.07,0.98,0,0.86,0.92,0.56,0.53,0.12,0.01,0.58,0.72,0.28,0.59,0.17,0.95,0.21,0.29,0.78,0.45,0.19,0.26,0.
8,0.29,0.33,0.48,0.53,0.73,0.44,0.95,0.9,0.49,0.32,0,0.37,0.72,0.29,0.3,0.09,0.91,0.25,0.51,0.23,0.27,0.86,0.73,0.4,0.63,0.7,0.5,0.03,0.46,0.66,0.13,0.28,0.22,0.77,0.24,0.19,0.59,0.32,0.12,0.28,0.83,0.45,0.96,0.14,0.45,0.93,0.28,0.46,0.97,0.4,0.94,0.57,0.87,0.57,0.22,0.35,0.9,0.34,0.41,0.7,0.35,0.38,0.04,0.27,0.25,0.69,0.02,0.91,0.35,0.76,0.62,0.46,0.49,0.46,0.45,0.9,0.1,0.3
2,0.09,0.91,0.13,0.87,0.83,0.06,0.84,0.1,0.97,0.11,0.31,0.18,0.02,0.76,0.51,0.5,0.17,0.61,0.12,0.25,0.51,0.65,0.01,0.93,0.59,0.27,0.35,0.22,0.43,0.02,0.7,0.55,0.9,0.37,0.92,0.41,0.32,0.21,0.57,0.49,0.64,0.54,0.85,0.98,0.87,0.14,0.43,0.15,0.04,0.71,0.01,0.43,0.1,0.72,0.32,0.96,0.34,0.83,0.72,0.96,0.82,0.07,0.95,0,0.51,0.15,0.43,0.8,0.57,0.11,0.27,0.14,0.56,0.01,0.03,0.04,0.99
,0.92,0.49,0.39,0.64,0.13,0.82,0.66,0.1,0.94,0.47,0.61,0.3,0.3], "top": 3 }'
{"status":{"error":"Wrong input: Vector params for  are not specified in config"},"time":0.001418866}

The config.json of the imported collection looks strange to me:


{
    "params": {
        "vectors": {
            "vector": {
                "size": 1536,
                "distance": "Cosine"
            }
        },
        "shard_number": 1,
        "replication_factor": 1,
        "write_consistency_factor": 1,
        "on_disk_payload": true
    },
    "hnsw_config": {
        "m": 16,
        "ef_construct": 100,
        "full_scan_threshold": 10000,
        "max_indexing_threads": 0,
        "on_disk": false
    },
    "optimizer_config": {
        "deleted_threshold": 0.2,
        "vacuum_min_vector_number": 1000,
        "default_segment_number": 0,
        "max_segment_size": null,
        "memmap_threshold": null,
        "indexing_threshold": 20000,
        "flush_interval_sec": 5,
        "max_optimization_threads": null
    },
    "wal_config": {
        "wal_capacity_mb": 32,
        "wal_segments_ahead": 0
    },
    "quantization_config": null
}

The cause is probably the nested "vectors": { "vector": { ... form, where I expected:

{
    "params": {
        "vectors": {
            "size": 1536,
            "distance": "Cosine"
        },
...}
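
To make the difference between the two shapes concrete, here is a minimal sketch that reads a collection config and reports which vector names it declares. The helper name `vector_names` is hypothetical and not part of vector-io or the Qdrant client. It assumes that an unnamed vector is addressed by the empty string, which would also explain the blank between "for" and "are" in the error message above.

```python
import json


def vector_names(collection_config: dict) -> list[str]:
    """Return the vector names declared in a Qdrant collection config.

    A single unnamed vector is reported as the empty string
    (assumption: that is how the default vector is addressed).
    """
    vectors = collection_config["params"]["vectors"]
    # Flat form: {"size": ..., "distance": ...} describes one unnamed vector.
    if "size" in vectors and not isinstance(vectors["size"], dict):
        return [""]
    # Nested form: every key is a vector name mapping to its own params.
    return list(vectors.keys())


# The two shapes from this issue, reduced to the relevant keys.
named = json.loads(
    '{"params": {"vectors": {"vector": {"size": 1536, "distance": "Cosine"}}}}'
)
flat = json.loads('{"params": {"vectors": {"size": 1536, "distance": "Cosine"}}}')

print(vector_names(named))  # ['vector']
print(vector_names(flat))   # ['']
```

If the imported collection reports `['vector']`, a search that passes a bare array is asking for the unnamed vector, which that collection does not have.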

Branch

No response

Checklist
- [X] Modify `src/vdf_io/import_vdf/qdrant_import.py` ✓ https://github.com/AI-Northstar-Tech/vector-io/commit/01405c0d8695fcdffbab2a783bfa29199c2868c5

sweep-ai[bot] commented 4 months ago

🚀 Here's the PR! #94

Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant, in decreasing order of relevance. If a file is missing from here, you can mention its path in the ticket description.
https://github.com/AI-Northstar-Tech/vector-io/blob/ad971da78fbc1e01f4977f5d6c2d8c7eaf0149ae/src/vdf_io/import_vdf/qdrant_import.py#L1-L469

Step 2: ⌨️ Coding

Before:

vectors_config = {
    vector_column_name: VectorParams(
        size=dims,
        distance=distance,
        on_disk=on_disk,
    )
    for vector_column_name in vector_column_names
}

After:

vectors_config = {
    vector_column_name: VectorParams(
        size=dims,
        distance=distance,
    )
    for vector_column_name in vector_column_names
}

Remove the nested "vector" key and specify the vector configuration directly under the "vectors" key, with the vector column name as the key and the VectorParams object as the value. Also remove the on_disk parameter as it is not part of the VectorParams configuration.
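
For comparison, the two collection layouts under discussion can be sketched as plain request bodies for Qdrant's create-collection endpoint (`PUT /collections/{collection_name}`). This is an illustration only: `create_collection_body` is a hypothetical helper, not code from `qdrant_import.py`, and it builds dictionaries rather than calling a server.

```python
def create_collection_body(size, distance, vector_names=None):
    """Build a create-collection body in either the flat or the named form.

    With no names, the result describes one unnamed vector.
    With names, each name maps to its own params, which mirrors the
    dict comprehension over vector_column_names shown above.
    """
    params = {"size": size, "distance": distance}
    if not vector_names:
        return {"vectors": params}
    # Copy params per name so the entries can later be changed independently.
    return {"vectors": {name: dict(params) for name in vector_names}}


flat_body = create_collection_body(1536, "Cosine")
named_body = create_collection_body(1536, "Cosine", ["vector"])

print(flat_body)   # {'vectors': {'size': 1536, 'distance': 'Cosine'}}
print(named_body)  # {'vectors': {'vector': {'size': 1536, 'distance': 'Cosine'}}}
```

Note that both the before and after versions of `vectors_config` above are keyed by column name, so both correspond to the named form.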


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/qdrant_import_collection_not_working.


This is an automated message generated by Sweep AI.

greptile-apps[bot] commented 4 months ago

The issue seems to stem from the collection configuration format used during the import process. Specifically, the vectors_config setup in the upsert_data method of qdrant_import.py expects a dictionary with keys corresponding to vector column names and their configurations. However, your collection's config.json indicates a mismatch in expected structure, particularly under the params -> vectors section. To resolve this, ensure the collection configuration passed to self.client.create_collection within upsert_data matches Qdrant's expected format. This involves adjusting the vectors_config dictionary construction to align with your collection's actual vector dimension and distance metric, ensuring it accurately reflects the structure shown in your issue description.

References

/src/vdf_io/import_vdf/qdrant_import.py


dhruv-anand-aintech commented 4 months ago

This config:

"params": {
    "vectors": {
        "vector": {
            "size": 1536,
            "distance": "Cosine"
        }
    },
    ...

matches the format for named vectors.

I want to better understand the sequence of your operations. Did you first import a VDF dataset into your Qdrant instance, and then try to run a search via the REST API?
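
If that is the sequence, one thing worth trying is to address the vector by name in the search request. As a rough sketch, assuming the collection was imported with a single named vector called "vector" as in the config above, the request body for `/points/search` would wrap the query like this. `search_body` is a hypothetical helper; the `top` key is kept from the original curl call, while current Qdrant docs use `limit`.

```python
import json


def search_body(query, top=3, vector_name=None):
    """Build a /points/search body for an unnamed or a named vector."""
    if vector_name is None:
        # Bare array: targets the collection's single unnamed vector.
        return {"vector": query, "top": top}
    # Named form: says which of the collection's vectors to search.
    return {"vector": {"name": vector_name, "vector": query}, "top": top}


# A short placeholder query; the real one needs 1536 values.
query = [0.69, 0.69, 0.59]

print(json.dumps(search_body(query, vector_name="vector")))
```

The printed JSON can be passed to curl via `--data-raw` in place of the body from the issue description.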