Better handling of timeout error from GitHub with an invalid response body

autarch commented 6 months ago

For a while I was trying to figure out why sometimes I'd get an empty response body for GitHub GraphQL requests. Eventually I ended up replacing the post_graphql function with my own version that dumped more info from the response. I found out that GitHub sometimes responds with a JSON body like this:

{"message": "We couldn't respond to your request in time. Sorry about that. Please try resubmitting your request and contact us if the problem persists."}

The current implementation of the graphql-client code makes this entirely invisible to the user of the library. I'm really not sure what the fix is here. I'm pretty sure that GitHub's response is just wrong. They should return a document with errors set to something. But they don't.

It would be really nice if this library would handle this case. I think ideally it'd return some sort of "invalid response format" error that contains the original body. As it stands, it seems like it just tries to parse the response and ends up creating an empty graphql_client::Response struct, which is really confusing.

autarch commented 6 months ago

I reported this to GitHub as well, since what they're doing is simply wrong. https://github.com/orgs/community/discussions/78614

TieWay59 commented 5 months ago

I have a similar itch in debug&guard post_graphql because I need to check the response headers, so I rewrite the original function.

FYI: In reqwest:: crate response.json() would consume the response instance in this context so you have to clone other data before returning the body.

pub fn post_graphql_blocking<Q: GraphQLQuery, U: reqwest::IntoUrl + Clone>(
    client: &reqwest::blocking::Client,
    url: U,
    variables: Q::Variables,
    mut f: impl FnMut(&reqwest::header::HeaderMap) -> anyhow::Result<()>,
    // ^^ what I add 
) -> Result<graphql_client::Response<Q::ResponseData>, reqwest::Error>
where
    Q::Variables: Window,
{
    // original impl

    // take response headers out
    let _ = f(response.headers());

    response.json()
}

// usage:
    let response = graphql_client_ext::post_graphql_blocking::<GetAnsweredDiscussions, _>(
        client,
        "https://api.github.com/graphql",
        variables,
        |h| {
            rate_limit = h.try_into().unwrap_or_default();
            Ok(())
        },
    )
    .expect("failed to execute query");

Hello @tomhoule, do you have any better idea on designing this?

I think if you are reusing the fn name from reqwest, it's better to provide similar behavior right? The better design is to break the current graphql_client::reqwest::post_graphql_blocking to return reqwest::Response (instead of graphql_client::Response<Q::ResponseData>). There should be another function to take Response converted to graphql_client::Response<Q::ResponseData>. So users can observe headers in the middle.

Or maybe would it be easier to add .status() & .headers() to graphql_client::Response

pub struct Response<Data> {
    pub data: Option<Data>,
    pub errors: Option<Vec<Error, Global>>,
    pub extensions: Option<HashMap<String, Value, RandomState>>,
}

I know this would break a lot of cases, but seems no more good choices. I may miss some details and wish to hear more from you.

tomhoule commented 5 months ago

Hmm yes, good point. I'm not sure what the best way out of this could be. The whole point of the included client is that it returns a typed response, so there isn't much to gain if it returns something from reqwest directly anymore. My personal preference would be removing the client, but I'm not sure how much that would break people's workflows.

TieWay59 commented 5 months ago

@tomhoule

I agree, in this case, it's better to let users write their posting logic. I write a lot of guarding checking code in post_graphql_blocking directly after I rewrite it in my package.

I'm not sure how much that would break people's workflows.

Much, because using post_graphql_blocking is the most simple example in the first page docs. We'll need another leading example because the generated structs are not so obvious at first glance.

tomhoule commented 5 months ago

Is there a way to split the body from the headers in reqwest? We could return (graphql_body, headers) from the graphql_client wrapper. It wouldn't solve the issue of non-graphql-compliant bodies though, we would have to have an extra variant somewhere for bodies that aren't valid graphql responses.

TieWay59 commented 5 months ago

Is there a way to split the body from the headers in reqwest? We could return (graphql_body, headers) from the graphql_client wrapper. It wouldn't solve the issue of non-graphql-compliant bodies though, we would have to have an extra variant somewhere for bodies that aren't valid graphql responses.

@tomhoule

My solution is below, but it seems not the best:

fn get_headers_and_response_body_from_reqwest_result(
    reqwest_response: Result<reqwest::blocking::Response, reqwest::Error>,
) -> (reqwest::header::HeaderMap, String) {
    match reqwest_response {
        Ok(r) => {
            let header_map = r.headers().clone();
            let body = r.text().unwrap_or("respnse.text() failed".to_owned());
            (header_map, body)
        }
        Err(e) => {
            log::error!("reqwest_response is Err: {e:#?}");
            (
                reqwest::header::HeaderMap::new(),
                "reqwest_response is Err".to_owned(),
            )
        }
    }
}

autarch commented 5 months ago

It wouldn't solve the issue of non-graphql-compliant bodies though, we would have to have an extra variant somewhere for bodies that aren't valid graphql responses.

I think the right way to do this would be to have post_graphql return a Result, rather than just a tuple of headers & body. Then an invalid GraphQL response is just an Err.

If you wanted to keep the old funcs for back-compat I think the new ones could be named try_post_graphql and try_post_graphql_blocking.

I think you'd need to parse the response into a serde_json::Value, check that it's an Object, then check it's keys to make sure it has one of the two required keys and no extras, and then finally turn that into the Q::ResponseData type. I just looked and there's a serde_json::from_value fn that will do this conversion for you.

TieWay59 commented 5 months ago

Thank you @autarch, I've been in trouble lately, so I may need a few weeks to proceed with this issue. I agree with the try_post_graphql* naming. My solution above is too simple to align with the original proposal.

TieWay59 commented 3 months ago

@autarch I came back to propose a simple solution, wdut if we split the old post_graphql_blocking into 2 parts? This leads to adding a small number of functions (like three I guess.) and won't break the old design.

/// Use the provided reqwest::Client to post a GraphQL request.
pub fn try_post_graphql_blocking<Q: GraphQLQuery, U: reqwest::IntoUrl>(
    client: &reqwest::blocking::Client,
    url: U,
    variables: Q::Variables,
) -> Result<reqwest::blocking::Response, reqwest::Error> {
    let body = Q::build_query(variables);
    client.post(url).json(&body).send()
}

/// parse reqwest::blocking::Response into Result<crate::Response<Q::ResponseData>, reqwest::Error>
pub fn try_parse_response<Q: GraphQLQuery>(
    response: reqwest::blocking::Response,
) -> Result<crate::Response<Q::ResponseData>, reqwest::Error> {
    response.json::<crate::Response<Q::ResponseData>>()
}

example:

    // old examples:
    //
    // let response_body =
    //     post_graphql::<RepoView, _>(&client, "https://api.github.com/graphql", variables).unwrap();
    //
    // new one:
    //

    let try_response = graphql_client::reqwest::try_post_graphql_blocking::<RepoView, _>(
        &client,
        "https://api.github.com/graphql",
        variables,
    )?;

    dbg!(try_response.headers());
    dbg!(try_response.status());

    // try_response.text()?; <- this is consuming the response, so we can't use it anymore

    let response_body = graphql_client::reqwest::try_parse_response::<RepoView>(try_response)?;

graphql-rust / graphql-client

Better handling of timeout error from GitHub with an invalid response body #467