Update typings for functions that wrap `request()` to return `Response` when using `rawResponse`

thw0rted commented 3 years ago

As I commented on #462, the current typings for e.g. queryFeatures say that the method returns Promise<IQueryFeaturesResponse | IQueryResponse>. The method actually returns a Promise<Response> when the rawResponse option is true.

It would be nice if the method could return the correct promise type directly based on the shape of the options object passed, either via implied generics or using a (granted, pretty verbose) overload. Overloading would be verbose but I don't think especially difficult -- see this Angular class for an example. Much like this library, the Angular HTTPClient has a request method that changes its return type based on an options-object, and multiple helpers that call request and also vary the return type based on arguments.

patrickarlt commented 3 years ago

It would be nice if the method could return the correct promise type directly based on the shape of the options object passed, either via implied generics or using a (granted, pretty verbose) overload

I went down a rabbit hole on this. It can be done but is as stated REALLY verbose. You can see the TypeScript Playground sample I made to reflect this.

I don't think it should be terribly hard to do this the main question is one of scale. Most methods in the API support multiple formats. For example the /addItem endpoint supports html, json and pjson format options, Others like geocoding ONLY support json. Does anyone REALLY care about html? Supporting only the json format (with an optional rawResponse) would reduce the scale down to JUST what the endpoints support.

Some endpoints like /FeatureService/{LAYER_ID}/query support LOTS of formats (html, json, geojson, pbf) + rawResponse which would map to several overloads:

f=html returns Promise<string>
f=json = Promise<CustomResponseObject>
f=geojson = Promise<GeoJSON>
f=pbf = Promise<Blob>
rawResponse: true = Promise<Response>

Which would more accurately reflect the individual response types of each request. The added benefit of this is that Typescript aware IDEs like VS Code will automatically select the correct type based on the input params. This is however a TON of work to go through every method and:

Restrict which formats are allowed for that method
Add an override for each format
Add an override for raw responses
Move the type declaration for the response to the generic request<ResponseType>(requestOptions)
Write a seperate request call for each format case so typescript can automatically infer types.

For most methods this would be json and html if we decide to drop html support that would be a breaking change but one less case EVERYWHERE to handle but this is something we should discuss.

thw0rted commented 3 years ago

Your linked example doesn't seem too verbose to me, especially for something you can probably write once and more or less forget about. I also think omitting support for html makes sense, if it creates a lot of extra work/overhead.

As far as going through every method, it wouldn't hurt to improve one or two methods at a time. I don't think it should be a breaking change to return a more-specific type, except in the case where somebody casts the result to a definitely-wrong type and somehow got it working anyway.

(FYI "ResponseType" is a built-in, I think an enum used by either XHR or Fetch. I've also gotten bitten by using it as a generic parameter name... definitely best to avoid overlap.)

patrickarlt commented 3 years ago

@gavinr @tomwayson and I discussed this in an internal meeting. We all agreed that this is pretty important for some methods (queryFeatures, request) but really pointless for most methods createItem, geocode.

There are 2 main downsides to this:

In order to do this each call to request has to be wrapped in an if statement for each format in order for TypeScript to trace the types properly. Istanbul will treat each if statement as something we have to write tests for to get test coverage. Which really isn't needed.

It really is a ton of work when you consider all 100+ methods in Rest JS. Here a quick before/after:

/**
* Before
*/
function create(options: CreateOptions): Promise<CreateResponse> {
  const baseRequestOptions = {
    url: "..."
  }

  return request<CreateResponse>({
    ...baseRequestOptions, format: "json"
  });
}

/**
* After
*/
// adding overrides are fairly easy...
function create(options: CreateOptions & { format?: "json" , raw?: false}): Promise<CreateResponse>;
function create(options: CreateOptions & { format?: "html", raw?: false }): Promise<string>;
function create(options: CreateOptions & { raw?: true , format?: "json" | "html"}): Promise<Response>;
function create(options: Partial<CreateOptions> = {format: "json", raw: false}): Promise<Response> | Promise<string> | Promise<CreateResponse> {
  const baseRequestOptions = {
    url: "..."
  }

  // conditionalizing all possible returns types feels really silly.
  if (options.raw) {
    return request({
      ...baseRequestOptions, raw: true, format: options.format
    });
  }

  if (options.format === "html") {
    return request({
      ...baseRequestOptions, format: "html"
    });
  }

  return request<CreateResponse>({
    ...baseRequestOptions, format: "json"
  });
}

I want to sit on this for a bit. I think there definitely needs to be some trimming down of the API surface area. For example:

The /addItem endpoint only supports json, pjson and html as valid f params. My preference is to actually drop support for both rawResponse and force f=json. I'm having a hard time justifying use cases where f=html or f=pjson would really be valid OR where I would need to stream a tiny JSON response. This would really reduce the surface area of the API and make most of these cases obsolete.
Do we need rawResponse anymore? The main motivation for rawResponse comes from https://github.com/Esri/arcgis-rest-js/issues/373 when our recommended fetch implementation didn't support response.blob(). With 4.0 we will ship the latest version of node-fetch which DOES support response.blob(). The main motivation for rawResponse now seems to be streaming responses, which is useful for working with large responses like huge queries or files OR for parsing formats we don't support in REST JS for whatever reason.

I'm leaning towards the following:

We should drop support for f=html everywhere unless the endpoint REALLY returns a valid HTML page that developers would want to use. In its current state f=html is really only used for the forms that exist on the server itself. I don't think there is anywhere in REST JS currently where f=html makes any sense for someone to want to use so we should drop it.
Remove rawResponse. Instead REST JS should intelligently handle things like getItemData(), getItemResource() and the missing getAttachment() methods or any other methods that return files/blobs by doing the following:
1. Make a request with f=json first. This will return JSON data immediately for things like web maps. If you do not pass f=json you get an HTML error by default so this request should be done first to check for errors.
2. If the item data/resource is NOT JSON then you get a Specified output format not supported. and you can retry the request without f=json and return a Response object that the user can decide how they want to handle with response.blob() or steam response.body.
3. The response types and doc for methods that return files should be used to point out that this will return an object for JSON data and a Response object for everything else and users should either handle both cases OR check the type of item up front so they know how to parse it.
For methods like queryFeatures that support additional response types like geojson or pbf we can write intelligent overrides like I described above. I think queryFeatures might be the only one at this point. queryFeatures also has lots of other special cases like returnCentroids which would be good to handle as well.
We should update request like I described above because that is much better for TypeScript.

This REALLY minimizes the surface area of the API to the point where the original issue simply disappears. The only thing we really loose is the ability to stream any request which for 99.9% of requests really doesn't make sense anyway. Streaming responses also bypasses all error checking authentication management anyway.

If we REALLY feel like specific methods need the option for streaming responses then we could add a stream option in request that could optionally be reflected in parent methods with the caveat that it turns off most of fancy REST JS features.

thw0rted commented 3 years ago

One: you shouldn't need any conditionals because you're calling the same underlying JS. If you're writing runtime code to convince the typechecker that you're right, it's probably an appropriate time to whack an as any in there and call it a day. The main point of the overrides is not to check that your code (create calling request) is correct, it's to give the right return type to the library consumer, for the given option argument.

Two: I believe the original thing that brought me here was trying to add pbf support for a part of my project where I'm using queryFeatures. The main part of the library I use now is queryFeatures so I'm not as familiar with the other bits. I would like to have the option of streaming pbf but that hasn't made much progress lately anyway. Still, it'd be nice to keep streams in mind, specifically for very large feature queries.

patrickarlt commented 3 years ago

One: you shouldn't need any conditionals because you're calling the same underlying JS

You would think this but you end up running into some interesting rules about literal interfaces. From the typescript doc:

function handleRequest(url: string, method: "GET" | "POST") : void;

const req = { url: "https://example.com", method: "GET" };
handleRequest(req.url, req.method);
// Argument of type 'string' is not assignable to parameter of type '"GET" | "POST"'.

https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#literal-inference

This is because the type of req.method could TECHNICALLY be any string, just because it HAPPENS to be "GET" is irrelevant. You can work around this with as const or as "GET" to manually override the type.

function handleRequest(url: string, method: "GET" | "POST") : void;

const req = { url: "https://example.com", method: "GET" as const};
handleRequest(req.url, req.method);
// Works
https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#literal-inference

So that seems to make this work:

function create(options: CreateOptions & { format?: "json" , raw?: false}): Promise<CreateResponse>;
function create(options: CreateOptions & { format?: "html", raw?: false }): Promise<string>;
function create(options: CreateOptions & { raw?: true , format?: "json" | "html"}): Promise<Response>;
function create(options: Partial<CreateOptions>= {format: "json", raw: false}): Promise<Response> | Promise<string> | Promise<CreateResponse> {
  const baseRequestOptions = Object.assign({
    url: "...",
    format: "json",
    raw: false 
  } as const, options);

  return request<CreateResponse>(baseRequestOptions);
}

thw0rted commented 3 years ago

I should have mentioned that before, as const is basically magic for these situations -- I'm glad you found it yourself.

patrickarlt commented 3 years ago

I did however uncover another error which I cannot seem to work around which is shown in this playground example. Using an external object for options with format anything other then json always seems to fail with // Types of property 'format' are incompatible. Type '"html"' is not assignable to type '"json" | undefined'. which seems to break the overrides. I think this MIGHT be a TypeScript bug/limitation because in-lining the same object magically works.

2021-09-21_08-12-37 (1)

function request<ResponseType>(options: RequestOptions & { format?: "json" , raw?: false}): Promise<ResponseType>;
function request(options: RequestOptions & { format?: "html", raw?: false }): Promise<string>;
function request(options: RequestOptions & { raw?: true , format?: "json" | "html"}): Promise<Response>;
function request<ResponseType>(options: Partial<RequestOptions>= {format: "json", raw: false}): Promise<Response> | Promise<string> | Promise<ResponseType> | Promise<string | Response | ResponseType> {  
  if(!options.url) {
    throw "missing url"
  }

  return fetch(options.url).then(response => {
    if (options.raw) {
      return Promise.resolve(response);
    }

    if (options.format === "html") {
      return response.text()
    }

    return response.json()
  })
}

// Works
const requestText = request({ url: "...", format: "html"});

//  Types of property 'format' are incompatible.
//      Type '"html"' is not assignable to type '"json" | undefined'.
const options2 = { url: "...", format: "html" as const};
const requestText1 = request<{foo: string}>(options2);

//  Types of property 'format' are incompatible.
//      Type '"html"' is not assignable to type '"json" | undefined'.
const options3 = { url: "...", format: "html"} as const;
const requestText2 = request<{foo: string}>(options3);

Which I cannot seem to find any workaround for.

thw0rted commented 3 years ago

I think part of the problem might be the way you're using generics alongside the override. Notice that you're not using the generic parameter anywhere in the code itself, only in the return type. What you're actually doing is a fancy typecast -- you're letting the user write request<MyObject>(opts) instead of request(opts) as MyObject, but it means the exact same thing. If you take the generics out and simply return unknown as the type of the decoded JSON object, you'll need to check or cast the result, but that cast is exactly as type-safe (i.e, unsafe) as it was to start with.

Replacing the generic with unknown resolves the one outstanding error in your Playground example. Remember to remove all references to ResponseType -- they won't be flagged when you remove the generic because the generic parameter shadowed out the built in type with that name.

patrickarlt commented 3 years ago

OK figured it out this TypeScript playground has the final POC, all the overload signatures in request needed the generics in front. This example also returns Promise<any> if you don't pass a generic to request

thw0rted commented 3 years ago

I noticed that you have raw?: false in one line and raw?: true in another, so the type returned by request({type: "html"}) is determined by the ordering of the overloads, rather than the actual argument. I think you probably want raw: true, i.e., undefined means false.

patrickarlt commented 3 years ago

Final POC in the playground. This adds a Formats enum which TypeScript users can use to get around as const and cleans up raw? : true.

Even though this works I don't think this changes any of my opinions in https://github.com/Esri/arcgis-rest-js/issues/821#issuecomment-923204881. There isn't anything to be gained by supporting all these possible output methods and formats for all these methods or adding a ton of complexity where it isn't warrented. There are a precious few methods (queryFeatures with f=pbf or f=geojson, getItemData() with f=zip, ect...) where the f param should be anything other then json. There are also only a few methods like getItemData(), queryFeatures() where the response size MIGHT be large enough to warrant bypassing auth and error checks and streaming the response.

patrickarlt commented 3 years ago

I'm still on the fence if we really need to allow streaming on queryFeatures. Conceptually I feel we should allow it because these responses are large and the option to stream them would be nice. Practically there isn't any way stream protobuf output as @thw0rted mentioned, and options for streaming JSON arrays appear really limited and not well maintained on NPM.

In the end I want to see how much complexity this adds into queryFeatures() to see if it is worth it. But in most cases I'm looking at this as a chance to reduce API surface area and properly type the few methods that really need it.

tomwayson commented 3 years ago

My take aways from all this:

we should drop support for f=html
we should drop support for arbitrarily returning the fetch response (i.e. rawResponse) for any fn that wraps request() and limit that to fns we know need it (getResource(), getItemData())

We could further simplify things by adding rawRequest() that returns a fetch response and is what request() calls under the hood. The wrapping fns that need to provide the option to get the raw response could then have a raw option w/ type safe overrides, and call rawRequest() when raw: true. Alternatively we could have raw counter parts to those fns rawGetResource().

Either way, this really sums up what we should be doing:

a chance to reduce API surface area and properly type the few methods that really need it.

tomwayson commented 3 years ago

Also, based on @patrickarlt's previous comment I don't think we should support streaming queryFeatures() responses as decoded JSON arrays. We can add that in later if it's requested.

To restate and elaborate: we should take advantage of this breaking change to reduce the footprint of our API, reduce the internal complexity of request(), and get the typings right.

From that stable base we can build back out if needed.

thw0rted commented 3 years ago

Tom, when you say you wouldn't support streaming queryFeatures, do you mean not providing a streaming object API (using Node-like streams), or do you mean that, on second thought, you actually wouldn't provide the rawRequest / rawQueryFeatures / etc, as proposed two comments upthread? I never expected the library to provide fully decoded features stream-wise, but I'd like to at least have the option of getting a raw W3C or Node stream and handling the decode myself.

tomwayson commented 3 years ago

Sorry, I updated my comment for clarity. I do think we should still provide:

the option of getting a raw W3C or Node stream and handling the decode myself.

Esri / arcgis-rest-js

Update typings for functions that wrap `request()` to return `Response` when using `rawResponse` #821