unbody-io / ts-client

Typescript client for Unbody's API
https://unbody.io/docs/libraries/typescript-client

TypeScript Client Implementation for Unbody's GraphQL API (v1) #2

Open miladazhdehnia opened 1 year ago

miladazhdehnia commented 1 year ago

Description:

We need to develop a TypeScript client implementation for Unbody's GraphQL API version 1. This implementation will allow us to communicate effectively with the API, leveraging TypeScript's type-checking and autocompletion features.

Goal:

The goal of this issue is to create a robust TypeScript client library that provides a convenient and intuitive interface for developers to interact with Unbody's GraphQL API. This library should support all operations offered by the API.

Implementation Steps:

  1. Set up the project structure and dependencies.
  2. Define the GraphQL schema and types for the API.
  3. Implement the necessary HTTP requests (e.g., GET, POST) to fetch data from the API endpoints.
  4. Create functions or classes that encapsulate the API operations, making it easy for developers to consume and interact with the API.
  5. Ensure proper error handling and error messages for different scenarios.
  6. Write comprehensive unit tests to verify the functionality and stability of the client library.
  7. Document the client library usage, including installation instructions, API documentation, and usage examples.
  8. Anything else?

An example of expected result:

const unbody = new Unbody(UNBODY_API_KEY, UNBODY_LPE_PROJECT_ID);
const response = await unbody.get.googleDoc
    .where({title: 'Bitcoin'})
    .nearText(['cryptocurrency'])
    .select(['title'])
    .exec()
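
A minimal sketch of how a chained interface like the one above could collect state and render a query. Everything here is an assumption for illustration: the `GetQueryBuilder` class, the `toGraphQL()` method (used instead of `exec()` so no network call is needed), and the flat `where` handling, which only covers string equality.

```typescript
// Hypothetical sketch: none of these names come from an existing Unbody library.
type FlatWhere = Record<string, string>;

class GetQueryBuilder {
  private readonly className: string;
  private readonly args: string[] = [];
  private readonly fields: string[] = [];

  constructor(className: string) {
    this.className = className;
  }

  // Flat equality filter only; multiple keys are combined with And.
  where(filter: FlatWhere): this {
    const operands = Object.entries(filter).map(
      ([field, value]) =>
        `{ path: [${JSON.stringify(field)}], operator: Equal, valueString: ${JSON.stringify(value)} }`
    );
    if (operands.length === 1) {
      this.args.push(`where: ${operands[0]}`);
    } else if (operands.length > 1) {
      this.args.push(`where: { operator: And, operands: [${operands.join(", ")}] }`);
    }
    return this;
  }

  nearText(concepts: string[]): this {
    this.args.push(`nearText: { concepts: ${JSON.stringify(concepts)} }`);
    return this;
  }

  select(fields: string[]): this {
    this.fields.push(...fields);
    return this;
  }

  // Renders the collected arguments and fields as a GraphQL Get query.
  toGraphQL(): string {
    const args = this.args.length ? `(${this.args.join(", ")})` : "";
    return `{ Get { ${this.className}${args} { ${this.fields.join(" ")} } } }`;
  }
}

const query = new GetQueryBuilder("GoogleDoc")
  .where({ title: "Bitcoin" })
  .nearText(["cryptocurrency"])
  .select(["title"])
  .toGraphQL();
```

Each method returns `this`, which is what makes the chaining work; a real `exec()` would send the rendered string to the API.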

miladazhdehnia commented 1 year ago

@amirhouieh Since we have Weaviate as our source of operations, we should clarify the operations we need. It would be helpful to provide a list of operation names if possible.

jeangovil commented 1 year ago

@miladazhdehnia, regarding the proposed interface, it's not clear how nested where operators could be constructed with chained methods. Could you please provide an example that would yield the following GraphQL query?

query {
  Get {
    TextBlock(
      where: {
        operator: And
        operands: [
          {
            operator: Or
            operands: [
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "Podcasts"
                operator: Equal
              }
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "Articles"
                operator: Equal
              }
            ]
          }
          {
            operator: Or
            operands: [
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "published"
                operator: Equal
              }
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "highlighted"
                operator: Equal
              }
            ]
          }
        ]
      }
    ) {
      text
    }
  }
}

  1. To answer your question regarding operators: the client should support all of Weaviate's operations for both Get and Aggregate queries on all Unbody objects (e.g., GoogleDoc, TextBlock). You can find the list of operations in Unbody's GraphQL schema. For example, the following methods should be supported for the Get query:

    where()
    ask()
    bm25()
    hybrid()
    nearText()
    nearVector()
    nearObject()
    group()
    groupBy()
    sort()
    offset()
    limit()

  2. Some methods, such as ask, require an _additional field in the GraphQL query to get the answer. To make this easier for developers, the client should add a default _additional field to the query. However, we should also provide an .additional method in case developers need to override the default behavior.

Example:

This call:

unbody.googleDoc.get.ask("What is Unbody's core feature?", "text")

should yield the following GraphQL query:

query {
   Get {
     GoogleDoc(
       ask:{
         question: "What is Unbody's core feature?"
         properties: "text"
       }
     ){
       _additional{
         answer{
           result
         }
       }
     }
   }
}

  3. We should aim to simplify our methods wherever possible. For instance, methods such as ask should accept their arguments in more than one form.

Example:

Developers should be able to call ask in two different ways:

unbody.gdoc.get.ask("What is Unbody's core feature?", "text")

// or
unbody.gdoc.get.ask({
    question: "What is Unbody's core feature?",
    properties: "text"
})

  4. The select method should support nested objects. Here is the suggested syntax:

.select(
    GDOCProps.title,
    GDOCProps.blocks.textBlock.text,
    GDOCProps.blocks.textBlock.tagName,
)

// or
.select("title", "blocks.textBlock.text", "blocks.textBlock.tagName")
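
The two suggestions above (flexible `ask` arguments and dotted `select` paths) could be sketched roughly as follows. The helper names `normalizeAsk`, `buildSelection` and `renderSelection` are hypothetical, and the rendering ignores the `... on TextBlock` fragments that cross-references would need in a real query.

```typescript
// Hypothetical helpers; the names below are illustrative, not an existing API.
interface AskArgs {
  question: string;
  properties?: string | string[];
}

// Accepts either ask(question, properties?) or ask({ question, properties }).
function normalizeAsk(question: string, properties?: string | string[]): AskArgs;
function normalizeAsk(args: AskArgs): AskArgs;
function normalizeAsk(first: string | AskArgs, properties?: string | string[]): AskArgs {
  if (typeof first !== "string") return first;
  return properties === undefined ? { question: first } : { question: first, properties };
}

// Nested selection tree: a leaf is an empty object.
interface SelectionTree {
  [field: string]: SelectionTree;
}

// Turns dotted paths such as "blocks.textBlock.text" into a nested tree,
// merging paths that share a prefix.
function buildSelection(paths: string[]): SelectionTree {
  const tree: SelectionTree = {};
  for (const path of paths) {
    let node = tree;
    for (const segment of path.split(".")) {
      const next: SelectionTree = node[segment] ?? {};
      node[segment] = next;
      node = next;
    }
  }
  return tree;
}

// Renders the tree as the body of a GraphQL selection set.
function renderSelection(tree: SelectionTree): string {
  return Object.entries(tree)
    .map(([field, children]) =>
      Object.keys(children).length > 0 ? `${field} { ${renderSelection(children)} }` : field
    )
    .join(" ");
}

const selection = renderSelection(
  buildSelection(["title", "blocks.textBlock.text", "blocks.textBlock.tagName"])
);
```

Function overloads keep both call styles type-checked, so autocompletion still works for either form.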


Weaviate also has a few limitations you should be aware of:

1. Weaviate doesn't fully support JSON variables and fragments, so they should not be used in the client.

2. Currently, Weaviate doesn't support JSON data, so we store it as serialized strings. Fields holding this type of data are returned by the Unbody API as-is, without being parsed.

Here's an example of how we addressed this with the Apollo client:
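```ts
import { ApolloClient, InMemoryCache } from '@apollo/client'

// Local resolvers add parsed copies of the serialized JSON fields.
new ApolloClient({
  uri: 'https://graphql.unbody.io',
  cache: new InMemoryCache(), // ApolloClient requires a cache
  resolvers: {
    GoogleDoc: {
      mentionsObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.mentions || '[]')
      },
      tocObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.toc || '[]')
      },
    },
    TextBlock: {
      footnotesObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.footnotes || '[]')
      },
    },
  },
})
```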

We’re looking for a way to transform JSON data by default, while also allowing developers to implement their own logic of transformation. The example with Apollo client keeps the original data (e.g., mentions) and adds additional fields (e.g., mentionsObj), which is not exactly what we’re looking for. We’d like to hear your ideas on how to address this issue.

miladazhdehnia commented 1 year ago

@jeangovil, I noticed that the proposed interface for the project does not include unlimited nested "where" clauses. As a result, it may be difficult to create a clean and readable syntax without importing operators. However, I believe that using callbacks inside a "where" clause can help us achieve maximum flexibility. Here's an example:

unbody.googleDoc.get
    .where(({AND, OR, IN, BETWEEN /* , etc. */}) => {
        return AND(OR('podcasts', {path: 'path', value: 'Articles'}), OR('published', 'highlighted'))
    })

This syntax allows us to have unlimited nested queries while keeping it readable and easy to parse without any faults. Additionally, this approach empowers us with predefined types and can be shipped to clients with fully supported auto-completion features.

And for the parsing issue:

unbody.googleDoc({
    toJSON: (data) => {/* user defined json serializer */}
})

jeangovil commented 1 year ago

@miladazhdehnia, sorry for the late reply. I like the idea of injecting operators, but I don't quite understand how the client could interpret OR('published', 'highlighted'). I assume the complete query would look like this?

unbody.googleDoc.get.where(({ AND, OR }) =>
  AND(
    OR(
      { path: "path", value: "articles" },
      { path: "path", value: "podcasts" }
    ),
    OR(
      { path: "path", value: "published" },
      { path: "path", value: "highlighted" }
    )
  )
);
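
For illustration, here is one way the injected operators could be implemented so that the callback form above produces a filter in the shape of the GraphQL query quoted earlier. This is only a sketch: the `Condition` and `Filter` types, the standalone `where` function and `renderFilter` are assumptions, and plain `{ path, value }` conditions are treated as string equality.

```typescript
// Hypothetical types; operator and value names follow the GraphQL query quoted earlier.
type Condition = { path: string | string[]; value: string };
type Filter =
  | { operator: "And" | "Or"; operands: Filter[] }
  | { operator: "Equal"; path: string[]; valueString: string };

const isFilter = (x: Condition | Filter): x is Filter => "operator" in x;

// Plain { path, value } conditions are treated as string equality.
const toFilter = (x: Condition | Filter): Filter =>
  isFilter(x)
    ? x
    : {
        operator: "Equal",
        path: Array.isArray(x.path) ? x.path : [x.path],
        valueString: x.value,
      };

const operators = {
  AND: (...operands: (Condition | Filter)[]): Filter => ({
    operator: "And",
    operands: operands.map(toFilter),
  }),
  OR: (...operands: (Condition | Filter)[]): Filter => ({
    operator: "Or",
    operands: operands.map(toFilter),
  }),
};

// The callback receives the operators, mirroring where(({ AND, OR }) => ...).
function where(build: (ops: typeof operators) => Filter): Filter {
  return build(operators);
}

// Renders a filter as a GraphQL argument value (enum operators stay unquoted).
function renderFilter(f: Filter): string {
  if (f.operator === "Equal") {
    return `{ path: ${JSON.stringify(f.path)}, operator: Equal, valueString: ${JSON.stringify(f.valueString)} }`;
  }
  return `{ operator: ${f.operator}, operands: [${f.operands.map(renderFilter).join(", ")}] }`;
}

const filter = where(({ AND, OR }) =>
  AND(
    OR({ path: "path", value: "articles" }, { path: "path", value: "podcasts" }),
    OR({ path: "path", value: "published" }, { path: "path", value: "highlighted" })
  )
);
const rendered = renderFilter(filter);
```

Because `AND` and `OR` accept both raw conditions and already-built filters, nesting can go arbitrarily deep without importing anything.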

But, personally, I liked the original proposal where you could pass filters as an object similar to the schema better:

unbody.googleDoc.get.where(({ AND, OR }) =>
  AND(
    OR({ path: "Articles" }, { path: "Podcasts" }),
    OR({ path: "published" }, { path: "highlighted" })
  )
);

// Cross-references:
unbody.textBlock.get.where(({ AND, OR }) =>
  AND(
    OR(
      {
        document: {
          GoogleDoc: { path: "Articles" },
        },
      },
      {
        document: {
          GoogleDoc: { path: "Podcasts" },
        },
      }
    ),
    OR(
      {
        document: {
          GoogleDoc: { path: "published" },
        },
      },
      {
        document: {
          GoogleDoc: { path: "highlighted" },
        },
      }
    )
  )
);

Regarding the parsing issue, would it be possible to define transformers per field in a similar way to the Apollo client?

const unbody = new Unbody(UNBODY_API_KEY, UNBODY_LPE_PROJECT_ID, {
  transformers: {
    GoogleDoc: {
      // `data` is the raw (serialized) field value
      mentions(root, data = '[]') {
        return JSON.parse(data);
      },
      toc(root, data = '[]') {
        return JSON.parse(data);
      },
    },
    TextBlock: {
      footnotes(root, data = '[]') {
        return JSON.parse(data);
      },
    },
  },
});
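
A rough sketch of how per-field transformers like these could be applied to a returned object, with JSON parsing as the default and user-supplied functions taking precedence. The `Transformer` signature, the `defaults` table and `applyTransformers` are assumptions made for this example, not an existing API.

```typescript
// Hypothetical shapes: a transformer receives the whole object and the raw field value.
type Transformer = (root: Record<string, unknown>, data: unknown) => unknown;
type Transformers = Record<string, Record<string, Transformer>>;

// Default behaviour assumed here: parse serialized JSON strings,
// and fall back to the raw value if parsing fails.
const parseJson: Transformer = (_root, data) => {
  if (typeof data !== "string") return data;
  try {
    return JSON.parse(data);
  } catch {
    return data;
  }
};

const defaults: Transformers = {
  GoogleDoc: { mentions: parseJson, toc: parseJson },
  TextBlock: { footnotes: parseJson },
};

// A user-supplied transformer replaces the default one for the same field.
// The input object is not mutated.
function applyTransformers(
  className: string,
  obj: Record<string, unknown>,
  overrides: Transformers = {}
): Record<string, unknown> {
  const active = { ...defaults[className], ...overrides[className] };
  const out: Record<string, unknown> = { ...obj };
  for (const [field, transform] of Object.entries(active)) {
    if (field in obj) out[field] = transform(obj, obj[field]);
  }
  return out;
}

const doc = applyTransformers("GoogleDoc", {
  title: "Intro",
  mentions: '[{"name":"x"}]',
});
```

Unlike the Apollo example, the parsed value replaces the original field instead of being added under a second name such as `mentionsObj`.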

miladazhdehnia commented 1 year ago

@jeangovil

> I like the idea of injecting operators, but I don't quite understand how the client could possibly interpret OR('published', 'highlighted')

I'm not entirely certain, but I believe there may be a default path that allows the client to simply pass values.

For cross-references, we can introduce a level of abstraction for the query:

unbody.textBlock.get.where(({AND, OR}, {GoogleDoc}) =>
    AND(
        OR(GoogleDoc({path: "Articles"}), GoogleDoc({path: "Podcasts"})),
        OR(GoogleDoc({path: "published"}), GoogleDoc({path: "highlighted"})),
    )
);


> Regarding the parsing issue, would it be possible to define transformers per field in a similar way to the Apollo client?

We can define transformers per field in a similar way to the Apollo client if it better fits our requirements. Let me know your thoughts on this.

jeangovil commented 1 year ago

unbody.textBlock.get.where(({AND, OR}, {GoogleDoc}) =>
    AND(
        OR(GoogleDoc({path: "Articles"}), GoogleDoc({path: "Podcasts"})),
        OR(GoogleDoc({path: "published"}), GoogleDoc({path: "highlighted"})),
    )
);

I’m not sure about this, given that each object could potentially have different fields of the same type. And, since we're introducing an interface similar to ORMs, I think it's best to stick to what's usual.

> We can define transformers per field in a similar way to the Apollo client if it better fits our requirements. Let me know your thoughts on this.

I think it would be great to have it like that.

miladazhdehnia commented 1 year ago

It seems like we're ignoring the abstraction by writing the query as { document: { GoogleDoc: { path: "published" } } }. This is similar to writing a raw query. Is there a way we can improve this to better utilize the abstraction?

This is what comes to mind:

unbody.textBlock.get.where(({AND, OR}) =>
    AND(
        OR(
            {
                document: 'GoogleDoc',
                path: "Articles"
            },
            {
                document: 'GoogleDoc',
                path: "Podcasts"
            },
        )
    )
);

jeangovil commented 1 year ago

I see. I think the problem with the previous example is that the document path is not specified. But then again, the last example could be a bit confusing, since it passes fields of different objects at the same level:

unbody.textBlock.get.where(({AND, OR}) =>
    AND(
        OR(
            {
                document: 'GoogleDoc',
                path: "Articles" // GoogleDoc field
            },
            {
                tags: "tag_1" // TextBlock field
            },
        )
    )
);

How about a combination of both approaches?

unbody.textBlock.get.where(({ AND, OR }, { GoogleDoc }) =>
  AND(
    OR(
      {
        document: GoogleDoc({
          path: "Articles",
        }),
      },
      {
        document: GoogleDoc({
          path: "Podcasts",
        }),
      }
    )
  )
)
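
To illustrate the combined form, a helper such as `GoogleDoc(...)` could simply tag its conditions with the referenced class name, so the client can later expand them into full cross-reference paths. The `RefCondition` shape and the `flatten` function are assumptions made for this sketch, and only string equality is handled.

```typescript
// Hypothetical helper: tags a set of field conditions with the referenced class name.
interface RefCondition {
  refClass: string;
  fields: Record<string, string>;
}
type FieldValue = string | RefCondition;

const GoogleDoc = (fields: Record<string, string>): RefCondition => ({
  refClass: "GoogleDoc",
  fields,
});

interface EqualFilter {
  path: string[];
  operator: "Equal";
  valueString: string;
}

// Expands { document: GoogleDoc({ path: "Articles" }) } into path-based filters.
// Plain string values are treated as fields of the queried object itself.
function flatten(condition: Record<string, FieldValue>): EqualFilter[] {
  const filters: EqualFilter[] = [];
  for (const [field, value] of Object.entries(condition)) {
    if (typeof value === "string") {
      filters.push({ path: [field], operator: "Equal", valueString: value });
    } else {
      for (const [refField, refValue] of Object.entries(value.fields)) {
        filters.push({
          path: [field, value.refClass, refField],
          operator: "Equal",
          valueString: refValue,
        });
      }
    }
  }
  return filters;
}

const crossRef = flatten({ document: GoogleDoc({ path: "Articles" }) });
const ownField = flatten({ tags: "tag_1" });
```

Here `crossRef[0].path` is `["document", "GoogleDoc", "path"]`, matching the path used in the raw GraphQL query, while the reference field (`document`) stays explicit in the call.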

miladazhdehnia commented 1 year ago

> How about a combination of both approaches?

It seems good to me.

Is there any other issue we should discuss?

jeangovil commented 1 year ago

Perfect! No, will let you know if I can think of anything else.

miladazhdehnia commented 11 months ago

@jeangovil It would help me a lot if you could give me some examples of what you have in mind for the aggregate query.

jeangovil commented 11 months ago

> @jeangovil It would help me a lot if you could give me some examples of what you have in mind for the aggregate query.

The aggregate method should be very similar to the Get query, with a few exceptions. Here are a few examples:

  1. Without using .select(), it should return all properties by default:

unbody
  .aggregate
  .googleDoc
  .where({}) // Similar to the 'where' method in Get

  2. With .select():

unbody
  .aggregate
  .googleDoc
  .select("title.type", "title.count", "title.topOccurrences.value", "title.topOccurrences.occurs")

  3. With .where():

unbody
  .aggregate
  .googleDoc
  .where({ // The 'where' method uses the same type as the 'where' method in Get
    path: "Articles"
  })

  4. Using .nearText():

unbody
  .aggregate
  .googleDoc
  .nearText(["test"], 0.9) // Provide concepts and specify certainty (required)

  5. With .ask():

unbody
  .aggregate
  .googleDoc
  .ask("question", 10, ["text"]) // Include the question, objectLimit (required), and optional properties
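
A small sketch of how the aggregate-specific signatures from the examples above might be typed, so that `certainty` and `objectLimit` cannot be omitted. `AggregateArgsBuilder` and its `render` method are hypothetical names, and the placement of `objectLimit` as a top-level argument is an assumption about the generated query.

```typescript
// Hypothetical sketch of the aggregate-specific arguments; names are illustrative.
class AggregateArgsBuilder {
  private readonly args: string[] = [];

  // certainty is a required parameter here, as in the nearText example above.
  nearText(concepts: string[], certainty: number): this {
    this.args.push(
      `nearText: { concepts: ${JSON.stringify(concepts)}, certainty: ${certainty} }`
    );
    return this;
  }

  // objectLimit is required; properties are optional.
  ask(question: string, objectLimit: number, properties?: string[]): this {
    const props = properties ? `, properties: ${JSON.stringify(properties)}` : "";
    this.args.push(
      `ask: { question: ${JSON.stringify(question)}${props} }`,
      `objectLimit: ${objectLimit}`
    );
    return this;
  }

  // Renders an Aggregate query for the given class and selection body.
  render(className: string, selection: string): string {
    const args = this.args.length ? `(${this.args.join(", ")})` : "";
    return `{ Aggregate { ${className}${args} { ${selection} } } }`;
  }
}

const nearTextQuery = new AggregateArgsBuilder()
  .nearText(["test"], 0.9)
  .render("GoogleDoc", "title { count }");

const askQuery = new AggregateArgsBuilder()
  .ask("question", 10, ["text"])
  .render("GoogleDoc", "title { count }");
```

Making the required values positional parameters means a call like `.nearText(["test"])` fails at compile time instead of at query time.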