microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License
21.36k stars 3.14k forks source link

The data type of the `Id` member in the `ScoredPoint` class may need to be changed to `int`. #794

Closed AwesomeYuer closed 1 year ago

AwesomeYuer commented 1 year ago

Describe the bug The data type of the Id member in the ScoredPoint class may need to be changed to int.

https://github.com/microsoft/semantic-kernel/blob/562ab5bc63ec4043fe6c797145e326934ce61639/dotnet/src/Connectors/Connectors.Memory.Qdrant/Http/ApiSchema/SearchVectorsResponse.cs#L14

To Reproduce


    public async Task qdrant_SK_Http_HNSW_index_Cosine_50_ProcessAsync()
    {
        QdrantMemoryStore memoryStore =
                new QdrantMemoryStore
                            (
                                GlobalManager.SelfHostQdrantHttpConnectionString
                                , 6333
                                , vectorSize: 4
                                //, ConsoleLogger.Log
                            );
        IKernel kernel = Kernel.Builder
            //.WithLogger(ConsoleLogger.Log)
            .Configure
            (
                c =>
                {
                    c.AddOpenAITextCompletionService("text-davinci-003", "123");
                    c.AddOpenAITextEmbeddingGenerationService("text-embedding-ada-002", "123");
                }
            )
            .WithMemoryStorage(memoryStore)
            .Build();

        var MemoryCollectionName = "small";
       //bug here!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        var searchResults = kernel
                                .Memory
                                .SearchAsync
                                        (
                                            MemoryCollectionName
                                            , "My favorite color is orange"
                                            , limit: 20
                                            , minRelevanceScore: 0.8
                                            , withEmbeddings: true
                                        );

        await foreach (var item in searchResults)
        {
            //Console.WriteLine(item.Metadata.Text + " : " + item.Relevance);
            _ = item.Relevance;
        }
    }

Expected behavior A clear and concise description of what you expected to happen.

Screenshots If applicable, add screenshots to help explain your problem. image

image

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

tawalke commented 1 year ago

@AwesomeYuer Qdrant definition for Points object allows for both 64-bit integers and UUID strings

image

image

You can look at the ExtendedPointId object here: https://github.com/qdrant/qdrant/blob/master/src/common/points.rs image

AwesomeYuer commented 1 year ago

@tawalke Thanks for your feedback in advance. I think it would be better to support both int id and string uuid when deserializing as a string Id property from JSON.

Here is some sample code for reference, I have test it passed and succeed when invoke QdrantVectorDbClient.FindNearestInCollectionAsync :

SearchVectorsResponse.cs

// Copyright (c) Microsoft. All rights reserved.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace Microsoft.SemanticKernel.Connectors.Memory.Qdrant.Http.ApiSchema;

#pragma warning disable CA1812 // Avoid uninstantiated internal classes: Used for Json Deserialization
internal sealed class SearchVectorsResponse : QdrantResponse
{
    internal sealed class ScoredPoint
    {
        [JsonPropertyName("id")]
        // add by AwesomeYuer@Microshaoft @ 2023-05-04
        [JsonConverter(typeof(NumberToStringConverter))]
        public string Id { get; }

        [JsonPropertyName("version")]
        public int Version { get; set; }

        [JsonPropertyName("score")]
        public double? Score { get; }

        [JsonPropertyName("payload")]
        public Dictionary<string, object> Payload { get; set; }

        [JsonPropertyName("vector")]
        public IEnumerable<float>? Vector { get; }

        [JsonConstructor]
        public ScoredPoint(string id, double? score, Dictionary<string, object> payload, IEnumerable<float> vector)
        {
            this.Id = id;
            this.Score = score;
            this.Payload = payload;
            this.Vector = vector;
        }
    }

    [JsonPropertyName("result")]
    public IEnumerable<ScoredPoint> Results { get; set; }

    [JsonConstructor]
    public SearchVectorsResponse(IEnumerable<ScoredPoint> results)
    {
        this.Results = results;
    }

    #region private ================================================================================

    private SearchVectorsResponse()
    {
        this.Results = new List<ScoredPoint>();
    }

    #endregion
}
#pragma warning restore CA1812 // Avoid uninstantiated internal classes

// add by AwesomeYuer@Microshaoft @ 2023-05-04
public class NumberToStringConverter : JsonConverter<string>
{
    public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (typeToConvert != typeof(string))
        {
            throw new NotSupportedException($"{nameof(typeToConvert)}: {typeToConvert}");
        }
        var result = string.Empty;
        var tokenType = reader.TokenType;
        if
            (
                tokenType == JsonTokenType.String
            )
        {
            result = reader.GetString();
        }
        else if
            (
                tokenType == JsonTokenType.Number
            )
        {
            reader.TryGetInt32(out var valueInt);
            result = valueInt.ToString("D", NumberFormatInfo.InvariantInfo);
        }
        else
        {
            throw new NotSupportedException($"{nameof(JsonTokenType)}.{tokenType}");
        }
        return result!;
    }

    public override void Write(Utf8JsonWriter writer, string @value, JsonSerializerOptions options)
    {
        writer.WriteStringValue(@value);
    }
}
AwesomeYuer commented 1 year ago

good

craigomatic commented 1 year ago

Another simple option we can consider:

[JsonPropertyName("id")]
[JsonNumberHandling(JsonNumberHandling.WriteAsString)]
public string Id { get; }

https://learn.microsoft.com/en-us/dotnet/api/system.text.json.serialization.jsonnumberhandlingattribute

craigomatic commented 1 year ago

Closing as the PR has been merged. Please reopen if this issue persists!

AwesomeYuer commented 1 year ago

glad to hear that