TheoKanning / openai-java

OpenAI Api Client in Java
MIT License

Add support for passing the dimension size when calling OpenAI's create embeddings API #469

Open prabhupant opened 4 months ago

prabhupant commented 4 months ago

Currently, EmbeddingRequest provides no way to specify the dimension size.

package com.theokanning.openai.embedding;

import lombok.*;

import java.util.List;

/**
 * Creates an embedding vector representing the input text.
 *
 * https://beta.openai.com/docs/api-reference/embeddings/create
 */
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Data
public class EmbeddingRequest {

    /**
     * The name of the model to use.
     * Required if using the new v1/embeddings endpoint.
     */
    String model;

    /**
     * Input text to get embeddings for, encoded as a string or array of tokens.
     * To get embeddings for multiple inputs in a single request, pass an array of strings or array of token arrays.
     * Each input must not exceed 2048 tokens in length.
     * <p>
     * Unless you are embedding code, we suggest replacing newlines (\n) in your input with a single space,
     * as we have observed inferior results when newlines are present.
     */
    @NonNull
    List<String> input;

    /**
     * A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse.
     */
    String user;
}

As a result, the embedding we get back always has the model's default size (1536 for text-embedding-ada-002 and text-embedding-3-small). Generating embeddings with a reduced size is currently not possible.
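For illustration, the requested change could look roughly like the sketch below. It is not the library's actual code: the class name EmbeddingRequestSketch and the toBody() helper are hypothetical, and it is written without Lombok so that it compiles on its own. The field name dimensions follows the optional parameter of OpenAI's embeddings endpoint, which is only supported by text-embedding-3 and later models. The sketch assumes that unset optional fields should be left out of the request body.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/**
 * Hypothetical sketch, not the library's class: an embedding request
 * that carries an optional "dimensions" value.
 */
final class EmbeddingRequestSketch {
    private final String model;
    private final List<String> input;
    private final String user;        // optional
    private final Integer dimensions; // optional; null means "use the model's default size"

    EmbeddingRequestSketch(String model, List<String> input, String user, Integer dimensions) {
        this.model = model;
        // Mirrors the @NonNull constraint on input in the original class.
        this.input = new ArrayList<>(Objects.requireNonNull(input, "input"));
        if (dimensions != null && dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.user = user;
        this.dimensions = dimensions;
    }

    Integer getDimensions() {
        return dimensions;
    }

    /** Builds the request body as a map, leaving out optional fields that are unset. */
    Map<String, Object> toBody() {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        body.put("input", input);
        if (user != null) {
            body.put("user", user);
        }
        if (dimensions != null) {
            body.put("dimensions", dimensions);
        }
        return body;
    }
}
```

With this shape, new EmbeddingRequestSketch("text-embedding-3-small", inputs, null, 256) would ask for 256-dimensional vectors, while passing null for the last argument keeps today's behaviour. In the real EmbeddingRequest, the equivalent change would presumably be a single extra Integer field picked up by the existing Lombok annotations.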

StefanBratanov commented 4 months ago

A shameless plug, but I regularly update another Java OpenAI library: https://github.com/StefanBratanov/jvm-openai. It lets you specify the dimensions for embedding requests.

prabhupant commented 4 months ago

If no one has taken up this issue yet, I am working on it and will submit a pull request shortly.