Open miladazhdehnia opened 1 year ago
@amirhouieh Since we have Weaviate as our source of operations, we should clarify the operations we need. It would be helpful to provide a list of operation names if possible.
@miladazhdehnia, regarding the proposed interface, it's not clear how nested where operators could be constructed with chained methods. Could you please provide an example that would yield the following GraphQL query?
query {
  Get {
    TextBlock(
      where: {
        operator: And
        operands: [
          {
            operator: Or
            operands: [
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "Podcasts"
                operator: Equal
              }
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "Articles"
                operator: Equal
              }
            ]
          }
          {
            operator: Or
            operands: [
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "published"
                operator: Equal
              }
              {
                path: ["document", "GoogleDoc", "path"]
                valueString: "highlighted"
                operator: Equal
              }
            ]
          }
        ]
      }
    ) {
      text
    }
  }
}
To answer your question regarding operators: the client should support all of Weaviate's operations for both Get and Aggregate queries, for all Unbody objects (e.g., GoogleDoc, TextBlock, etc.). You can find the list of operations in Unbody's GraphQL schema. For example, the following methods should be supported for the Get query:
where()
ask()
bm25()
hybrid()
nearText()
nearVector()
nearObject()
group()
groupBy()
sort()
offset()
limit()
Some methods, such as ask, require an _additional field in the GraphQL query to get the answer. To make it easier for developers, we should add a default _additional field to the query in the client. However, we should also provide an .additional method in case developers need to override the default behavior.
Example:
unbody.googleDoc.get.ask("What is Unbody's core feature?", "text")
This call should yield the following GraphQL query:
query {
  Get {
    GoogleDoc(
      ask: {
        question: "What is Unbody's core feature?"
        properties: "text"
      }
    ) {
      _additional {
        answer {
          result
        }
      }
    }
  }
}
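A minimal sketch of how the client could inject the default _additional selection when ask is used, while still letting .additional() override it. The GetQuery class and every method name on it are hypothetical, written only to illustrate the behavior described above; property selection via select() is left out.

```ts
// Hypothetical builder: none of these names belong to an existing Unbody client.
class GetQuery {
  private readonly className: string;
  private askArgs?: { question: string; properties?: string };
  private additionalFields?: string;

  constructor(className: string) {
    this.className = className;
  }

  ask(question: string, properties?: string): this {
    this.askArgs = { question, properties };
    return this;
  }

  // Overrides the default `_additional` selection.
  additional(fields: string): this {
    this.additionalFields = fields;
    return this;
  }

  toGraphQL(): string {
    const args: string[] = [];
    let additional = this.additionalFields;
    if (this.askArgs) {
      const parts = [`question: ${JSON.stringify(this.askArgs.question)}`];
      if (this.askArgs.properties !== undefined) {
        parts.push(`properties: ${JSON.stringify(this.askArgs.properties)}`);
      }
      args.push(`ask: { ${parts.join(", ")} }`);
      // The default is applied only when the developer has not overridden it.
      if (additional === undefined) {
        additional = "answer { result }";
      }
    }
    const argList = args.length > 0 ? `(${args.join(", ")})` : "";
    const selection = additional !== undefined ? `_additional { ${additional} }` : "";
    return `query { Get { ${this.className}${argList} { ${selection} } } }`;
  }
}
```

With this shape, `new GetQuery("GoogleDoc").ask("...", "text")` carries the answer selection automatically, and a later `.additional("answer { result hasAnswer }")` call replaces it.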
ask should accept more than one argument form. For example, developers should be able to construct ask in two different ways:
unbody.gdoc.get.ask("What is Unbody's core feature?", "text")
// or
unbody.gdoc.get.ask({
  question: "What is Unbody's core feature?",
  properties: "text"
})
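The two call styles can be expressed with TypeScript function overloads. The sketch below is an assumption about how the arguments might be normalized internally; normalizeAsk and AskArgs are hypothetical names, not part of any existing client.

```ts
// Hypothetical argument shape, mirroring the two call styles above.
interface AskArgs {
  question: string;
  properties?: string;
}

// Overload signatures: positional arguments, or a single options object.
function normalizeAsk(question: string, properties?: string): AskArgs;
function normalizeAsk(args: AskArgs): AskArgs;
function normalizeAsk(first: string | AskArgs, properties?: string): AskArgs {
  if (typeof first === "string") {
    // Positional form: build the object, omitting `properties` when absent.
    return properties === undefined
      ? { question: first }
      : { question: first, properties };
  }
  // Object form: return a shallow copy so the caller's object is not shared.
  return { ...first };
}
```

Both forms end up as the same internal object, so the rest of the query builder only has to handle one shape.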
The select method should support nested objects. Here is the suggested syntax:
.select(
  GDOCProps.title,
  GDOCProps.blocks.textBlock.text,
  GDOCProps.blocks.textBlock.tagName,
)
// or
.select("title", "blocks.textBlock.text", "blocks.textBlock.tagName")
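As a rough sketch of the string form, dotted paths can be merged into a tree and rendered as a nested selection set. buildSelection is a hypothetical helper; it also ignores the fact that cross-references in Weaviate are usually selected through inline fragments (for example `... on TextBlock`), which a real client would presumably have to map from a segment such as textBlock.

```ts
// Nested map of field name -> sub-selection (empty object for leaf fields).
interface SelectionTree {
  [field: string]: SelectionTree;
}

// Hypothetical helper: turns dotted paths into a GraphQL selection string.
function buildSelection(paths: string[]): string {
  const tree: SelectionTree = {};
  for (const path of paths) {
    let node = tree;
    for (const segment of path.split(".")) {
      // Reuse an existing branch so shared prefixes are merged.
      const next = node[segment] ?? {};
      node[segment] = next;
      node = next;
    }
  }
  const render = (node: SelectionTree): string =>
    Object.entries(node)
      .map(([name, children]) =>
        Object.keys(children).length > 0 ? `${name} { ${render(children)} }` : name
      )
      .join(" ");
  return render(tree);
}
```

For the three paths in the example above, the shared `blocks.textBlock` prefix is emitted once, with `text` and `tagName` nested under it.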
Weaviate also has a few limitations you should be aware of:
1. Weaviate doesn't fully support JSON variables and fragments, so they should not be used in the client.
2. Currently, Weaviate doesn't support JSON data, so we store it as serialized strings. Fields holding this type of data are returned by the Unbody API as-is, without being parsed.
Here's an example of how we addressed this with the Apollo client:
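```ts
// Assumes Apollo Client 3 (@apollo/client)
import { ApolloClient, InMemoryCache } from '@apollo/client'

new ApolloClient({
  uri: 'https://graphql.unbody.io',
  cache: new InMemoryCache(), // a cache is required by ApolloClient
  // Local resolvers: parse the serialized JSON strings into extra *Obj fields
  resolvers: {
    GoogleDoc: {
      mentionsObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.mentions || '[]')
      },
      tocObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.toc || '[]')
      },
    },
    TextBlock: {
      footnotesObj(rootValue, args, context, info) {
        return JSON.parse(rootValue.footnotes || '[]')
      },
    },
  }
})
```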
We’re looking for a way to transform JSON data by default, while also allowing developers to implement their own logic of transformation. The example with Apollo client keeps the original data (e.g., mentions) and adds additional fields (e.g., mentionsObj), which is not exactly what we’re looking for. We’d like to hear your ideas on how to address this issue.
@jeangovil, I noticed that the proposed interface for the project does not include unlimited nested "where" clauses. As a result, it may be difficult to create a clean and readable syntax without importing operators. However, I believe that using callbacks inside a "where" clause can help us achieve maximum flexibility. Here's an example:
unbody.googleDoc.get
  .where(({AND, OR, IN, BETWEEN /* , etc. */}) => {
    return AND(OR('podcasts', {path: 'path', value: 'Articles'}), OR('published', 'highlighted'))
  })
This syntax allows unlimited nesting while staying readable and unambiguous to parse. It also lets us ship predefined types, so developers get full auto-completion.
And for the parsing issue:
unbody.googleDoc({
toJSON: (data) => {/* user defined json serializer */}
})
@miladazhdehnia, sorry for replying late.
I like the idea of injecting operators, but I don't quite understand how the client could possibly interpret OR('published', 'highlighted'). I assume the complete query would look like this?
unbody.googleDoc.get.where(({ AND, OR }) =>
  AND(
    OR(
      { path: "path", value: "articles" },
      { path: "path", value: "podcasts" }
    ),
    OR(
      { path: "path", value: "published" },
      { path: "path", value: "highlighted" }
    )
  )
);
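A minimal sketch of how injected operators could produce Weaviate's nested where structure. The Filter and Condition types, the operators object, and the where function are all assumptions made for illustration, and only the Equal operator with string values is covered.

```ts
// Shape of a Weaviate-style `where` filter (restricted to what the sketch needs).
type Filter =
  | { operator: "And" | "Or"; operands: Filter[] }
  | { operator: "Equal"; path: string[]; valueString: string };

// Hypothetical leaf condition, as written in the example above.
interface Condition {
  path: string;
  value: string;
}

type Operand = Filter | Condition;

// Leaf conditions become `Equal` filters; already-built filters pass through.
function toFilter(operand: Operand): Filter {
  if ("operator" in operand) return operand;
  return { operator: "Equal", path: [operand.path], valueString: operand.value };
}

const combine =
  (operator: "And" | "Or") =>
  (...operands: Operand[]): Filter => ({ operator, operands: operands.map(toFilter) });

// Operators injected into the callback, so callers do not need to import them.
const operators = { AND: combine("And"), OR: combine("Or") };

function where(build: (ops: typeof operators) => Filter): Filter {
  return build(operators);
}
```

Because AND and OR accept both leaf conditions and previously built filters, nesting depth is unbounded, and the callback parameter is fully typed for auto-completion.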
But, personally, I preferred the original proposal, where you could pass filters as an object that mirrors the schema:
unbody.googleDoc.get.where(({ AND, OR }) =>
  AND(
    OR({ path: "Articles" }, { path: "Podcasts" }),
    OR({ path: "published" }, { path: "highlighted" })
  )
);
// Cross-references:
unbody.textBlock.get.where(({ AND, OR }) =>
  AND(
    OR(
      {
        document: {
          GoogleDoc: { path: "Articles" },
        },
      },
      {
        document: {
          GoogleDoc: { path: "Podcasts" },
        },
      }
    ),
    OR(
      {
        document: {
          GoogleDoc: { path: "published" },
        },
      },
      {
        document: {
          GoogleDoc: { path: "highlighted" },
        },
      }
    )
  )
);
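For the schema-like object form, the nested keys map naturally onto Weaviate's path arrays. The sketch below assumes string values and the Equal operator only; flattenFilter and both type names are hypothetical.

```ts
// Hypothetical schema-like filter object: keys are fields or cross-references.
interface NestedFilter {
  [key: string]: string | NestedFilter;
}

interface EqualFilter {
  path: string[];
  operator: "Equal";
  valueString: string;
}

// Walks the object and emits one `Equal` filter per leaf value,
// using the chain of keys leading to the leaf as the `path`.
function flattenFilter(filter: NestedFilter, prefix: string[] = []): EqualFilter[] {
  const result: EqualFilter[] = [];
  for (const [key, value] of Object.entries(filter)) {
    const path = [...prefix, key];
    if (typeof value === "string") {
      result.push({ path, operator: "Equal", valueString: value });
    } else {
      result.push(...flattenFilter(value, path));
    }
  }
  return result;
}
```

Passing `{ document: { GoogleDoc: { path: "Articles" } } }` gives a single filter whose path is `["document", "GoogleDoc", "path"]`, matching the raw GraphQL query earlier in the thread.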
Regarding the parsing issue, would it be possible to define transformers per field in a similar way to the Apollo client?
const unbody = new Unbody(UNBODY_API_KEY, UNBODY_LPE_PROJECT_ID, {
  transformers: {
    GoogleDoc: {
      // `data` is the raw (serialized) field value; it falls back to '[]' when undefined
      mentions(root, data = '[]') {
        return JSON.parse(data);
      },
      toc(root, data = '[]') {
        return JSON.parse(data);
      },
    },
    TextBlock: {
      footnotes(root, data = '[]') {
        return JSON.parse(data);
      },
    },
  },
});
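A sketch of how per-field transformers could be applied to a returned object so that the parsed value replaces the serialized string in place, instead of being added as a separate field. applyTransformers, parseJsonArray, and the type names are assumptions for illustration.

```ts
// Hypothetical transformer signature: receives the whole object and the raw field value.
type Transformer = (root: Record<string, unknown>, data: unknown) => unknown;
type Transformers = Record<string, Record<string, Transformer>>;

// Returns a copy of `obj` with each registered field replaced by its transformed value.
function applyTransformers(
  typeName: string,
  obj: Record<string, unknown>,
  transformers: Transformers
): Record<string, unknown> {
  const forType = transformers[typeName] ?? {};
  const out: Record<string, unknown> = { ...obj };
  for (const [field, transform] of Object.entries(forType)) {
    // Only touch fields that were actually selected in the query.
    if (field in obj) {
      out[field] = transform(obj, obj[field]);
    }
  }
  return out;
}

// A possible default transformer: parse serialized JSON, treating missing data as [].
const parseJsonArray: Transformer = (_root, data) =>
  JSON.parse(typeof data === "string" ? data : "[]");
```

A client could register parseJsonArray for the known JSON fields by default and let developers replace individual entries with their own functions.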
@jeangovil
> I like the idea of injecting operators, but I don't quite understand how the client could possibly interpret OR('published', 'highlighted')
I'm not entirely certain, but I believe there may be a default path that allows the client to simply pass values.
For cross-references, we can introduce a level of abstraction for the query:
unbody.textBlock.get.where(({AND, OR}, {GoogleDoc}) =>
  AND(
    OR(GoogleDoc({path: "Articles"}), GoogleDoc({path: "Podcasts"})),
    OR(GoogleDoc({path: "published"}), GoogleDoc({path: "highlighted"})),
  )
);
> Regarding the parsing issue, would it be possible to define transformers per field in a similar way to the Apollo client?
We can define transformers per field in a similar way to the Apollo client if it better fits our requirements. Let me know your thoughts on this.
> unbody.textBlock.get.where(({AND, OR}, {GoogleDoc}) => AND( OR(GoogleDoc({path: "Articles"}), GoogleDoc({path: "Podcasts"})), OR(GoogleDoc({path: "published"}), GoogleDoc({path: "highlighted"})) ) );
I’m not sure about this, given that each object could potentially have different fields of the same type. And, since we're introducing an interface similar to ORMs, I think it's best to stick to what's usual.
> We can define transformers per field in a similar way to the Apollo client if it better fits our requirements. Let me know your thoughts on this.
I think it would be great to have it like that.
It seems like we're ignoring the abstraction by writing the query as { document: { GoogleDoc: { path: "published" } } }. This is similar to writing a raw query. Is there a way we can improve this to better utilize the abstraction?
This is what comes to mind:
unbody.textBlock.get.where(({AND, OR}) =>
  AND(
    OR(
      {
        document: 'GoogleDoc',
        path: "Articles"
      },
      {
        document: 'GoogleDoc',
        path: "Podcasts"
      },
    )
  )
);
I see, I think the problem with the previous example is that the document path is not specified. But then again, I think the last example could be a bit confusing, since it passes fields of different objects at the same level:
unbody.textBlock.get.where(({AND, OR}) =>
  AND(
    OR(
      {
        document: 'GoogleDoc',
        path: "Articles" // GoogleDoc field
      },
      {
        tags: "tag_1" // TextBlock field
      },
    )
  )
);
How about a combination of both approaches?
unbody.textBlock.get.where(({ AND, OR }, { GoogleDoc }) =>
  AND(
    OR(
      {
        document: GoogleDoc({
          path: "Articles",
        }),
      },
      {
        document: GoogleDoc({
          path: "Podcasts",
        }),
      }
    )
  )
)
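One way the combined approach could work internally: the GoogleDoc helper returns a tagged reference, and the client expands it into a full path when it meets one under a cross-reference field. Everything here (Ref, ref, resolveFilter) is a hypothetical sketch limited to Equal comparisons on string values.

```ts
// Tagged value produced by a class helper such as GoogleDoc({...}).
interface Ref {
  className: string;
  fields: Record<string, string>;
}

interface EqualFilter {
  path: string[];
  operator: "Equal";
  valueString: string;
}

// Factory for class helpers; the client could generate one per Unbody object type.
const ref =
  (className: string) =>
  (fields: Record<string, string>): Ref => ({ className, fields });

const GoogleDoc = ref("GoogleDoc");

// Plain string values are fields on the queried object; Ref values are
// cross-references and expand to [field, className, referencedField].
function resolveFilter(filter: Record<string, string | Ref>): EqualFilter[] {
  const out: EqualFilter[] = [];
  for (const [key, value] of Object.entries(filter)) {
    if (typeof value === "string") {
      out.push({ path: [key], operator: "Equal", valueString: value });
    } else {
      for (const [field, fieldValue] of Object.entries(value.fields)) {
        out.push({ path: [key, value.className, field], operator: "Equal", valueString: fieldValue });
      }
    }
  }
  return out;
}
```

This keeps the cross-reference field (document) explicit while the helper carries the target class, so TextBlock fields and GoogleDoc fields never share a level.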
> How about a combination of both approaches?
It seems good to me.
Is there any other issue that we should discuss?
Perfect! No, will let you know if I can think of anything else.
@jeangovil It will help me a lot if you give me some examples for what you have in mind for the aggregate query
> @jeangovil It will help me a lot if you give me some examples for what you have in mind for the aggregate query
The aggregate method should be very similar to the Get query, with a few exceptions:
1. There is no _additional property; instead, there's an extra meta { count } property. However, there's no need for a .meta() method: meta.count must be selectable using the .select() method.
2. When ask and nearText are used, the aggregate operation is performed on a limited subset of the top results. The restriction is determined by either the certainty or the objectLimit property.
Here are a few examples:
Without .select(), it should return all properties by default:
unbody
  .aggregate
  .googleDoc
  .where({}) // Similar to the 'where' method in Get
With .select():
unbody
  .aggregate
  .googleDoc
  .select("title.type", "title.count", "title.topOccurrences.value", "title.topOccurrences.occurs")
With .where():
unbody
  .aggregate
  .googleDoc
  .where({ // The 'where' method uses the same type as the 'where' method in Get
    path: "Articles"
  })
With .nearText():
unbody
  .aggregate
  .googleDoc
  .nearText(["test"], 0.9) // Provide concepts and specify certainty (required)
With .ask():
unbody
  .aggregate
  .googleDoc
  .ask("question", 10, ["text"]) // Include the question, objectLimit (required), and optional properties
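A rough sketch of how the required certainty and objectLimit could be enforced through method signatures and serialized into an Aggregate query. AggregateQuery is a hypothetical class; the default selection of meta { count } is only a placeholder, since returning all properties by default would need schema knowledge, and the placement of objectLimit as a class-level argument is an assumption about the generated query.

```ts
// Hypothetical builder for Aggregate queries; names are illustrative only.
class AggregateQuery {
  private readonly className: string;
  private readonly args: string[] = [];
  // Placeholder default: a real client would expand to all properties.
  private selection = "meta { count }";

  constructor(className: string) {
    this.className = className;
  }

  select(selection: string): this {
    this.selection = selection;
    return this;
  }

  // `certainty` is a required parameter, so it cannot be forgotten.
  nearText(concepts: string[], certainty: number): this {
    this.args.push(`nearText: { concepts: ${JSON.stringify(concepts)}, certainty: ${certainty} }`);
    return this;
  }

  // `objectLimit` is required; `properties` stays optional.
  ask(question: string, objectLimit: number, properties?: string[]): this {
    const parts = [`question: ${JSON.stringify(question)}`];
    if (properties !== undefined) {
      parts.push(`properties: ${JSON.stringify(properties)}`);
    }
    this.args.push(`ask: { ${parts.join(", ")} }`, `objectLimit: ${objectLimit}`);
    return this;
  }

  toGraphQL(): string {
    const argList = this.args.length > 0 ? `(${this.args.join(", ")})` : "";
    return `query { Aggregate { ${this.className}${argList} { ${this.selection} } } }`;
  }
}
```

Making the restriction a positional parameter means a call such as `.nearText(["test"])` fails at compile time rather than producing an unbounded aggregation.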
Description:
We need to develop a TypeScript client implementation for Unbody's GraphQL API version 1. This implementation will allow us to communicate effectively with the API, leveraging TypeScript's type-checking and autocompletion features.
Goal:
The goal of this issue is to create a robust TypeScript client library that provides a convenient and intuitive interface for developers to interact with Unbody's GraphQL API. This library should support all operations offered by the API.
Implementation Steps:
An example of expected result: