Closed ClementValot closed 1 year ago
@ClementValot can you provide a script/snippet that reproduces it?
It works via cURL:
curl -X POST -H "Content-Type: application/json" -d '{ "query": "query { competitions { id } } "}' https://live.worldcubeassociation.org/api
and also with fetch:
fetch("https://live.worldcubeassociation.org/api", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "query { competitions { id } }" })
})
  .then((response) => response.json())
  .then((data) => console.log(data));
Sure, here's a JS snippet with a query that consistently returns a 502:
import got from "got";
const competitionQuery = `
  query Competition($id:ID!) {
    competition(id:$id) {
      name
      competitionEvents {
        id
        event {
          id
          name
        }
        rounds {
          active
          id
          name
          number
          open
          results {
            ranking
            person {
              id
              name
              wcaId
              results {
                advancing
                attempts {
                  result
                }
                average
                averageRecordTag
                best
                ranking
                round {
                  name
                  number
                  competitionEvent {
                    event {
                      id
                      name
                    }
                  }
                }
                singleRecordTag
              }
            }
          }
        }
      }
    }
  }
`;
const wcaLive = "https://live.worldcubeassociation.org";
const client = got.extend({
  prefixUrl: wcaLive,
  responseType: "json"
});
client
  .post("api", {
    json: { operationName: "Competition", query: competitionQuery, variables: { id: "2285" } }
  })
  .then((result) => console.log(result.body));
Funny thing is, if I change operationName to something that isn't "Competition", the query goes through with a 200 and correctly responds with an error, so it has nothing to do with headers, as I initially thought.
@ClementValot I see, I think the issue is that the query is too complex and makes the server run out of memory. If you remove the nested results part it seems to work fine:
const competitionQuery = `
  query Competition($id:ID!) {
    competition(id:$id) {
      name
      competitionEvents {
        id
        event {
          id
          name
        }
        rounds {
          active
          id
          name
          number
          open
          results {
            ranking
            person {
              id
              name
              wcaId
            }
          }
        }
      }
    }
  }
`;
That's an issue with GraphQL APIs; ideally we should add complexity analysis to prevent such queries from going through. For now, please avoid so many levels of nested relations.
I added some basic rules in 535191fd2e8919629be3c3ac7f76520d8031a7c5 to reject overly complex queries, and the query you posted now returns a proper error.
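For anyone who wants to check a query before sending it, here is a minimal client-side sketch of a depth check. It is not the rule set from the commit above (those rules are not shown in this thread); `estimateQueryDepth`, `assertWithinDepth`, and the `MAX_DEPTH` value are hypothetical names and numbers chosen for illustration. It only counts nested `{ ... }` selection sets and ignores fragments, string literals, and object-valued arguments, so treat it as a rough guard rather than real complexity analysis.

```javascript
// Rough depth estimate: the deepest nesting of selection sets in a query string.
// Assumption: the query has no braces inside strings or arguments and no fragments.
function estimateQueryDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  for (const ch of query) {
    if (ch === "{") {
      depth += 1;
      maxDepth = Math.max(maxDepth, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return maxDepth;
}

// Hypothetical limit for illustration; the server's actual threshold is not known here.
const MAX_DEPTH = 8;

// Throws if the query nests deeper than the limit, otherwise returns the depth.
function assertWithinDepth(query, limit = MAX_DEPTH) {
  const depth = estimateQueryDepth(query);
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`);
  }
  return depth;
}

console.log(estimateQueryDepth("query { competitions { id } }")); // 2
```

Calling `assertWithinDepth(competitionQuery)` before `client.post(...)` would fail fast on the client instead of waiting for the server to reject the request.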
It used to work properly before the infra changes :'(
Having to make several requests kinda defeats the purpose of GraphQL, doesn't it? 🤔
I'll rewrite, thanks for the help!
It used to work properly before the infra changes :'(
Interesting, the base instance has less memory now and the app scales horizontally. Though looking into the logs, it seems like the OOM is significant, so it may be related to the runtime change (both the OS and the language runtime), which would be surprising too. Either way, I think we should've been doing complexity analysis in the first place, so thanks for opening the issue.
Having to make several requests kinda defeats the purpose of GraphQL, doesn't it? 🤔
Not necessarily. Handling a request is one thing, but underneath it requires several database queries and memory allocation, so that needs to be capped too. Imagine a typical GraphQL API that returns a long list of entries: it would usually be paginated, so to get many entries you need to query multiple pages one by one. Similarly, allowing arbitrarily nested GraphQL queries may just be too resource-heavy, and it also allows a bad actor to easily crash the app.
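To make the resource argument concrete, here is a back-of-the-envelope sketch of how nested list fields multiply. The function name `estimateFanOut` and all list sizes are hypothetical, picked only for illustration; they are not measurements from the WCA Live server.

```javascript
// Multiplies the assumed list size at each nesting level to estimate how many
// leaf objects a nested selection could produce.
function estimateFanOut(listSizes) {
  return listSizes.reduce((total, size) => total * size, 1);
}

// Hypothetical sizes: competitionEvents -> rounds -> results -> person.results
const withNestedResults = estimateFanOut([17, 3, 100, 10]);
// Same query without the innermost person.results level
const withoutNestedResults = estimateFanOut([17, 3, 100]);

console.log(withNestedResults);    // 51000
console.log(withoutNestedResults); // 5100
```

Each extra level of list nesting multiplies the work, which is why a single request can still be far more expensive than several small ones.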
Also note that, technically speaking, the main purpose of the API is to serve the WCA Live client itself, so as long as the client can successfully make its queries, the API works as expected. While other clients are allowed to call the API, it's not optimised for that purpose and it's not expected to be relied on heavily. In most cases, people should use WCIF instead and query the WCA Live API only if they actually need access to results as soon as they are entered (rather than synchronized).
It is for live commentary purposes: I print cheat sheets about every finalist once per competition, as soon as the finals round is open. I tried to optimize by minimizing the number of queries, but I'll clean that up.
Thank you!
@ClementValot actually I think you can still do it with a single query. Instead of querying for competitor data on every result (which leads to a bunch of duplicate objects), you could query for just competitor ids and extend the query with competitor data:
query Competition($id:ID!) {
  competition(id:$id) {
    name
    competitionEvents {
      id
      event {
        id
        name
      }
      rounds {
        active
        id
        name
        number
        open
        results {
          ranking
          person {
            id
          }
        }
      }
    }
    competitors {
      id
      name
      wcaId
      results {
        advancing
        attempts {
          result
        }
        average
        averageRecordTag
        best
        ranking
        round {
          name
          number
          competitionEvent {
            event {
              id
              name
            }
          }
        }
        singleRecordTag
      }
    }
  }
}
Then you look up competitor info in that list :)
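The lookup step could be sketched roughly like this. It assumes `competition` is the `data.competition` object returned by the query above; the helper names `indexCompetitors` and `resultsWithCompetitors` are hypothetical, and the sample data at the bottom is made up for illustration.

```javascript
// Builds a Map from competitor id to the full competitor object,
// so each result can be resolved without scanning the list every time.
function indexCompetitors(competition) {
  const byId = new Map();
  for (const competitor of competition.competitors) {
    byId.set(competitor.id, competitor);
  }
  return byId;
}

// Attaches the full competitor record to every result of a round.
// `competitor` is undefined if the id is missing from the index.
function resultsWithCompetitors(round, competitorsById) {
  return round.results.map((result) => ({
    ranking: result.ranking,
    competitor: competitorsById.get(result.person.id),
  }));
}

// Made-up sample shaped like the query response, for illustration only.
const sampleCompetition = {
  competitionEvents: [
    { rounds: [{ results: [{ ranking: 1, person: { id: "7" } }] }] },
  ],
  competitors: [{ id: "7", name: "Example Person", wcaId: null, results: [] }],
};

const competitorsById = indexCompetitors(sampleCompetition);
const firstRound = sampleCompetition.competitionEvents[0].rounds[0];
console.log(resultsWithCompetitors(firstRound, competitorsById)[0].competitor.name);
// prints "Example Person"
```

Building the index once keeps the join linear in the number of results, and each competitor's data is transferred only once instead of once per result.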
I've got a piece of software that uses got to query the GraphQL API. It worked until recently, when it started receiving 502 error codes.
Here are the logged options of the request:
Thank you for your help