@ThisaruGuruge Just to clarify. If we have a query like:
friends(first: 5) {
id
name
}
which limits the number of items returned using an input, or a query like:
friends(ids: [1, 2, 3]) {
id
name
}
which takes an array of arguments, we execute the sub-fields' resolvers for each object. In such cases, don't we need to multiply the complexity by the number of returned items, or should we let the user define the complexity according to the number of returned items?
> In such cases, don't we need to multiply the complexity by the number of returned items, or should we let the user define the complexity according to the number of returned items?
Good point! In the case you mentioned, we cannot multiply the complexity by the number of items in the result set, since the number of returned elements is known only at execution time and the complexity must be calculated before execution. So this should be handled by the user, who should take this aspect into account when assigning complexity values. I will add this to the description.
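As an illustration of how a user might account for list size when picking a static value, the following Python sketch folds an assumed upper bound on the number of returned items into an estimate. The function name, the upper bound, and the numbers are assumptions made for this example only; the proposal does not prescribe any formula.

```python
def estimated_list_field_cost(base: int, per_item: int, expected_max_items: int) -> int:
    """Static estimate: cost of the list field itself plus the per-item cost
    of resolving its sub-fields, multiplied by an assumed maximum list size."""
    return base + per_item * expected_max_items


# Hypothetical numbers for friends(first: 5) { id name }:
# 1 for the list field, 2 per item (one for each sub-field), at most 5 items.
friends_cost = estimated_list_field_cost(base=1, per_item=2, expected_max_items=5)
print(friends_cost)  # 11
```

A value derived this way is only an estimate chosen ahead of time; the actual number of resolved items is still unknown until execution.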
Summary
This proposal aims to introduce a Query Complexity Analysis feature to the Ballerina GraphQL module. This feature will evaluate the complexity of incoming GraphQL queries and help prevent performance and security issues caused by overly complex queries.
Goals
Non-Goals
Motivation
GraphQL allows users to query data in a flexible and efficient way. However, this flexibility can be abused by malicious users, and overly complex queries can cause high server load or enable Denial of Service (DoS) attacks. By introducing a complexity analysis feature, Ballerina can help users identify and prevent such issues. This will enhance the user experience and security of Ballerina applications that use GraphQL.
Description
Definition
The query complexity of a GraphQL operation can be calculated based on the complexity of its fields. The complexity of a field can be defined by the user based on the field’s type and the amount of data it retrieves. The complexity of a query is the sum of the complexities of its fields. Users can set a maximum complexity threshold for queries, and queries exceeding this threshold can be either rejected by throwing an error or logged as a warning as per the user’s configuration.
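The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: the function name, the parameter names, and the returned strings are assumptions chosen for the example.

```python
from typing import Iterable


def evaluate(field_complexities: Iterable[int], max_complexity: int, warn_only: bool) -> str:
    """Sum the per-field complexities and compare the total to the threshold.

    Returns "accept" when the total does not exceed the threshold, otherwise
    "warn" or "reject" depending on the (hypothetical) warn_only flag.
    """
    total = sum(field_complexities)
    if total <= max_complexity:
        return "accept"
    return "warn" if warn_only else "reject"


print(evaluate([1, 1, 5], max_complexity=10, warn_only=False))    # accept
print(evaluate([10, 10, 8], max_complexity=20, warn_only=False))  # reject
print(evaluate([10, 10, 8], max_complexity=20, warn_only=True))   # warn
```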
Proposed Design
At the service level, users can define the maximum query complexity allowed, the default complexity of a field, and whether to log a warning or throw an error when the complexity threshold is exceeded. These configurations are introduced as a separate field named queryComplexityConfig in the graphql:ServiceConfig annotation. The queryComplexityConfig field is optional, and each field inside the QueryComplexityConfig record is required but has a default value. Following is the definition of the QueryComplexityConfig record:
The QueryComplexityConfig Record
The QueryComplexityConfig record contains the following fields:
maxComplexity: The maximum allowed complexity for a query. The default value is 100.
defaultFieldComplexity: The default complexity value for a field. The default value is 1. This will be applied to each field in the query, unless it is overridden by a custom complexity value on the field.
warnOnly: A boolean value indicating whether to log a warning or throw an error when the complexity threshold is exceeded. The default value is false, which means an error will be thrown.
Field Complexity
A new field, complexity, will be introduced to the graphql:ResourceConfig annotation. This field allows users to define the complexity of a field. The complexity field is optional; if it is not provided, the default complexity value defined in the QueryComplexityConfig will be used. Following is the updated definition of the GraphqlResourceConfig record:
Record Field Complexity
This proposal does not intend to introduce custom complexity values for record fields. The complexity of a record field will be the default complexity value defined in the QueryComplexityConfig. This is because the complexity of a record field is directly related to the complexity of the record itself. This can be revisited in future enhancements, if necessary.
Complexity Calculation
When the GraphQL schema is created from the Ballerina service, the complexity of each field will be added to the generated schema.
When calculating the complexity of a query, only the operation intended to execute will be considered. All the other operations will be ignored. The complexity will be accumulated field by field, and the final complexity will be the sum of all the field complexities.
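The calculation described above can be sketched as a recursive walk over the selected operation. The Python below is an illustration under stated assumptions, not the engine's code: the FieldNode type, the function names, the schema, and the complexity values are all hypothetical, and custom complexities are keyed by field name only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FieldNode:
    """A selected field and its sub-selections (hypothetical, simplified AST node)."""
    name: str
    selections: List["FieldNode"] = field(default_factory=list)


def field_complexity(node: FieldNode, overrides: Dict[str, int], default: int) -> int:
    # Use the custom complexity if one is defined, else the default field complexity,
    # then add the complexity of every sub-field.
    own = overrides.get(node.name, default)
    return own + sum(field_complexity(child, overrides, default) for child in node.selections)


def operation_complexity(operations: Dict[str, List[FieldNode]],
                         operation_name: Optional[str],
                         overrides: Dict[str, int],
                         default: int = 1) -> int:
    # Only the operation selected for execution is analysed; the others are ignored.
    if operation_name is None:
        if len(operations) != 1:
            raise ValueError("an operation name is required when the document has multiple operations")
        operation_name = next(iter(operations))
    return sum(field_complexity(f, overrides, default) for f in operations[operation_name])


# Hypothetical document with two operations and hypothetical custom complexities.
document = {
    "GetGreeting": [FieldNode("greeting")],
    "GetProfile": [FieldNode("profile", [FieldNode("name"),
                                         FieldNode("friends", [FieldNode("name")])])],
}
overrides = {"profile": 10, "friends": 5}

print(operation_complexity(document, "GetGreeting", overrides))  # 1
print(operation_complexity(document, "GetProfile", overrides))   # 17
```

In this sketch, GetProfile costs 10 (profile) + 1 (name) + 5 (friends) + 1 (name) = 17, while GetGreeting contributes only the default value of 1 because the other operation is never visited.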
Query Complexity Threshold
After the query complexity is calculated for a particular operation, the GraphQL engine will check the complexity threshold defined in the QueryComplexityConfig. If the calculated complexity exceeds the threshold, either of the following two actions will be taken based on the warnOnly field:
When warnOnly: false
An error will be thrown without executing the query. The corresponding HTTP status code will be 400. The error message will be in the following format:
When warnOnly: true
A warning will be logged without executing the query. The warning message will be in the following format:
Examples
Following is an example GraphQL service with query complexity analysis enabled:
Following are some example queries, their calculated complexities, and the expected responses:
Query with complexity below the threshold:
GraphQL Document:
Calculated Complexity: 1
Expected Response:
Query with complexity exceeding the threshold:
GraphQL Document:
Calculated Complexity: 28
Expected Response:
Document with multiple operations:
GraphQL Document:
Execute the GetGreeting operation.
Calculated Complexity: 1
Expected Response:
Alternatives
Bring All the Analyses into a Single Configuration
In GraphQL, some additional query analyses can be done in parallel with the complexity analysis, such as maxHeight, maxAliases, and maxRootFields, in addition to the existing validation of maxQueryDepth. These can be combined into a single configuration, such as QueryAnalysisConfig. Following is an example of such a configuration:
The Height, Aliases, and RootFields configurations are not considered in this proposal since they are not directly related to the complexity analysis. These configurations are also less likely to be used. We can consider adding them in future enhancements, if necessary.
Combining the maxQueryDepth and queryComplexityConfig is not considered in this proposal since it will be a breaking change.