tmc / langchaingo

LangChain for Go, the easiest way to write LLM-based programs in Go
https://tmc.github.io/langchaingo/
MIT License
4.85k stars, 633 forks

question: why not support add Fields of a vector database #817

Open xiaolibuzai-ovo opened 6 months ago

xiaolibuzai-ovo commented 6 months ago

As the title says, I'd like to know why the fields are fixed when a vector database collection is created, and why users cannot add their own fields when building a knowledge base on top of a vector database. Is there a design reason for this? For example, if I want to store vectors from different contexts in one collection, I need an extra field that identifies which context each vector belongs to (such as conversationId).

For example, in the Milvus store:

func (s *Store) createCollection(ctx context.Context, dim int) error {
    if dim == 0 || s.collectionExists {
        return nil
    }
    s.schema = &entity.Schema{
        CollectionName: s.collectionName,
        AutoID:         true,
        Fields: []*entity.Field{
            {
                Name:       s.primaryField,
                DataType:   entity.FieldTypeInt64,
                AutoID:     true,
                PrimaryKey: true,
            },
            {
                Name:     s.textField,
                DataType: entity.FieldTypeVarChar,
                TypeParams: map[string]string{
                    entity.TypeParamMaxLength: strconv.Itoa(s.maxTextLength),
                },
            },
            {
                Name:     s.metaField,
                DataType: entity.FieldTypeVarChar,
                TypeParams: map[string]string{
                    entity.TypeParamMaxLength: strconv.Itoa(s.maxTextLength),
                },
            },
            {
                Name:     s.vectorField,
                DataType: entity.FieldTypeFloatVector,
                TypeParams: map[string]string{
                    entity.TypeParamDim: strconv.Itoa(dim),
                },
            },
        },
    }

    err := s.client.CreateCollection(ctx, s.schema, s.shardNum, client.WithMetricsType(s.metricType))
    if err != nil {
        return err
    }
    s.collectionExists = true
    return nil
}

Thanks for your answer!

CrazyWr commented 6 months ago

I think this is a bug. The store should create a separate field for each metadata entry: if you put conversationId into the metadata, it should create a conversationId field automatically.

Python version implementation: [image]
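
To make the "one field per metadata entry" idea concrete, here is a minimal sketch. It does not use the Milvus SDK or langchaingo: `Field`, `FieldType`, and `fieldsFromMetadata` are hypothetical stand-ins invented for illustration, and the mapping from Go types to column types is an assumption, not what either library does.

```go
package main

import (
	"fmt"
	"sort"
)

// FieldType is a hypothetical stand-in for a vector database's scalar column types.
type FieldType string

const (
	TypeVarChar FieldType = "VarChar"
	TypeInt64   FieldType = "Int64"
	TypeDouble  FieldType = "Double"
	TypeBool    FieldType = "Bool"
)

// Field is a hypothetical, simplified column definition (not the SDK's entity.Field).
type Field struct {
	Name string
	Type FieldType
}

// fieldsFromMetadata returns one Field per metadata key, sorted by name so the
// resulting schema is deterministic. Unsupported value types produce an error.
func fieldsFromMetadata(meta map[string]any) ([]Field, error) {
	keys := make([]string, 0, len(meta))
	for k := range meta {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	fields := make([]Field, 0, len(keys))
	for _, k := range keys {
		var t FieldType
		switch meta[k].(type) {
		case string:
			t = TypeVarChar
		case int, int32, int64:
			t = TypeInt64
		case float32, float64:
			t = TypeDouble
		case bool:
			t = TypeBool
		default:
			return nil, fmt.Errorf("metadata key %q: unsupported type %T", k, meta[k])
		}
		fields = append(fields, Field{Name: k, Type: t})
	}
	return fields, nil
}

func main() {
	fields, err := fieldsFromMetadata(map[string]any{
		"conversationId": "conv-42",
		"page":           3,
	})
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f.Name, f.Type)
	}
}
```

In a real store these derived fields would be appended to the fixed primary, text, and vector fields before the collection is created. Because the schema is fixed at creation time, the metadata keys would need to be known up front.
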
xiaolibuzai-ovo commented 6 months ago

> I think this is a bug. The store should create a separate field for each metadata entry: if you put conversationId into the metadata, it should create a conversationId field automatically.
>
> Python version implementation: [image]

I see, I roughly understand now. Does that mean there is also a bug when querying? All fields in the metadata should be supported there as well.
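
As an illustration of the querying side, here is a minimal sketch of building an equality filter on a metadata field. `equalsFilter` is a hypothetical helper, not part of langchaingo or the Milvus SDK. It assumes conversationId exists as its own scalar field in the collection and that the store accepts boolean expressions of the form `field == "value"`. The field name is assumed to be trusted; only the value is escaped.

```go
package main

import (
	"fmt"
	"strconv"
)

// equalsFilter builds a boolean expression of the form `field == "value"`.
// strconv.Quote wraps the value in double quotes and escapes embedded
// quotes and backslashes.
func equalsFilter(field, value string) string {
	return fmt.Sprintf("%s == %s", field, strconv.Quote(value))
}

func main() {
	fmt.Println(equalsFilter("conversationId", "conv-42"))
}
```

With the schema shown in the original post, where all metadata is stored in a single VarChar field, such an expression has no conversationId field to match against, which is why the per-field schema matters for filtering.
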