ftrackhq / ftrack-ts-schema-generator

Gets the schema from an ftrack instance and generates typescript interfaces for all the entity types in the instance.
Apache License 2.0
2 stars 1 forks source link

fix: filter type discrimination #43

Closed ffMathy closed 2 months ago

ffMathy commented 2 months ago

Changes

After introducing the recent type discrimination, I talked about an issue that I then couldn't reproduce where if __entity_type__ was optional, type discrimination wouldn't work.

Now I can reproduce it. It only happens for Arrays.

Repro:

const x = null! as TypedContext;
if(x.__entity_type__ === "Shot") {
  //this is fine, x is a `Shot`
}

const array = new Array<TypedContext>();
const filteredArray = array.filter(x => x.__entity_type__ === 'Shot');
//here, if __entity_type__ is optional, then filteredArray is of type TypedContext[]. If it's not optional, it's of type Shot[]

So it'd be great if we could make __entity_type__ required.

gismya commented 2 months ago

That's a TypeScript bug and nothing we will fix through making the types incorrect

ffMathy commented 2 months ago

@gismya I agree with the premise here, but aren't these types actually correct? Won't __entity_type__ always be set?

That would make it possible for us to hit two birds with one stone.

gismya commented 2 months ago

They are optional when creating data, which is the data structure we're aiming for with the types. This fix would also be thwarted by https://github.com/ftrackhq/ftrack-javascript/pull/251#issuecomment-2293084238 for any data queried from the server.

ffMathy commented 2 months ago

But if it's optional when creating the types, it should be wrapped in Partial<T> in the client. Or else you lose valuable information from the types themselves, such as which fields are optional and always present, and which are not.

You're right that the Partial<T> is incorrectly applied on the return types. That's actually not ideal. Should probably just be on the data that is passed in to "update" or "create" call.

I'll try to elaborate further.

  1. Let's say I want to create a Project. The client should be able to tell me what the minimal properties for a project creation is (such as name etc). This belongs in the Project's type definition itself, like it is today. Here, the client should not wrap it in Partial<T>.
  2. Let's say I fetch a project from the server. Here, I also would like to know that name is always set etc, but we can't know that (unless we have the concept of a query builder or something similar which we have added to our own codebase). So we have to use Partial<T> here too.
  3. Let's say I update a project from the server. Here, it can be arbitrary what properties I provide to update, so that argument should be wrapped in Partial<T> for the client's update call.

You might argue that point 2 kind of breaks the discrimination as well, but it doesn't in our codebase, because we have the concept of a query builder. This is something I'd love to show at some point in the future, and I still have plans to invite to that meeting we talked about.

Let me know what your thoughts are on the above :heart:

ffMathy commented 2 months ago

Either way, I filed a bug here: https://github.com/microsoft/TypeScript/issues/59654

gismya commented 2 months ago

But if it's optional when creating the types, it should be wrapped in Partial<T> in the client. Or else you lose valuable information from the types themselves, such as which fields are optional and always present, and which are not.

Everything is optional (Except __entity_type__ as far as I know) when fetching data from the server, since you ask for the data you want. Some things are included by default, unless you're running the API in "strict mode", but that's a legacy remnant we want to get away from (But implemented a "strict mode" to not break existing pipelines).

You're right that the Partial<T> is incorrectly applied on the return types. That's actually not ideal. Should probably just be on the data that is passed in to "update" or "create" call.

Update and create are the ones that have non optional data. Or not sure about create, we might be adding all the non optional data as separate parameters instead of parsing it from the data, don't remember from the top of my head.

  1. Let's say I want to create a Project. The client should be able to tell me what the minimal properties for a project creation is (such as name etc). This belongs in the Project's type definition itself, like it is today. Here, the client should not wrap it in Partial<T>.

Yes

  1. Let's say I fetch a project from the server. Here, I also would like to know that name is always set etc, but we can't know that (unless we have the concept of a query builder or something similar which we have added to our own codebase). So we have to use Partial<T> here too.

Yes

  1. Let's say I update a project from the server. Here, it can be arbitrary what properties I provide to update, so that argument should be wrapped in Partial<T> for the client's update call.

Sounds right

The issue is that the data "lives" in multiple different forms, where what's optional and not differs between them. The one we can accurately model in TS is the create type. Almost. __entity_type__ technically doesn't exist when creating, it's appended when we're fetching. But it's the closest we've figured out while still keeping the TypeScript types manageable. We can get things closer to reality, but I don't see how setting __entity_type__ as required would move things in that direction.

ffMathy commented 2 months ago

Alright, thank you for that detailed explanation. I think I get most of it now.

Hmmm, the TypeScript team replied that it's not a bug, and I don't think they are going to fix it. That means that this whole feature (which is insanely powerful) will never work properly.

What if we somehow (and please don't think about the naming) do something like this:

If we did it like this, then we could use the type annotated one in results we get from the Ftrack API, and use the other one when we need to create or set data. It would allow us to utilize best of both.

gismya commented 2 months ago

Isn't it enough to make a utility types for "Everything optional except __entity_type__" thats added on the ActionResponse, instead of wrapping it in Partial<>, as being discussed here?

ffMathy commented 2 months ago

Yeah, but the problem with that is a bit more complex.

Because pragmatically and realistically, we'd want to use real types in our own functions.

Consider a function called doSomethingWithShot:

function doSomethingWithEntities(entities: TypedContext[]) {
   const shots = entities.filter(x => x.__entity_type__ === 'Shot');
   //do something with Shots
}

At some point, the function is called like this:

const fetchedEntities = await session.query('select whatever from TypedContext); //this is pseudocode for the sake of example.
doSomethingWithEntities(fetchedEntities);

Now, fetchedEntities would correctly be wrapped in whatever utility type we make with the proposed solution (in @ftrack/api). Let's call it Response<TypedContext>, which makes __entity_type__ required.

However, when we refer to these types in all kinds of places (for instance, in the entities argument of the doSomethingWithEntities function, we'd have to also wrap them in Response<T>. Kind of like:

function doSomethingWithEntities(entities: Response<TypedContext>[]) {
   const shots = entities.filter(x => x.__entity_type__ === 'Shot');
   //do something with Shots
}

I feel like 90% of the use-cases would need to have that wrapper around it. So even though that's technically the correct way to model it, I feel like that ignores the pragmatism aspect of it - how it is used in reality.

So my argument is that maybe it should default to non-optional, and then be wrapped in the places where it needs to be optional in the Ftrack client instead.

I guess it boils down to queries vs mutations. How often do we think we need to mutate versus how often we read? In our system at least, we probably have 20 times more reads than mutations. Therefore I'd optimize the API for such operations.

It's hard to explain. Does it make sense?

gismya commented 2 months ago

Does the problem only happen with filter? In that case I'd start by simply adding a type guard:

function isEntityOfType<K extends TypedContextSubtype>(
  obj: TypedContext,
  entityType: K
): obj is TypedContextSubtypeMap[K] {
  return obj && obj.__entity_type__ === entityType;
}
function doSomethingWithEntities(entities: TypedContext[]) {
   const shots = entities.filter(x => isEntityOfType(x, 'Shot'));
   //do something with Shots
}

or

function filterEntitiesByType<K extends TypedContextSubtype>(
  entities: TypedContext[],
  entityType: K
): TypedContextSubtypeMap[K][] {
  return entities.filter(
    (entity): entity is TypedContextSubtypeMap[K] => entity.__entity_type__ === entityType
  );
}
function doSomethingWithEntities(entities: TypedContext[]){
   const shots = filterEntitiesByType(entities, 'Shot');
   //do something with Shots
}

We could have a set of three different types for every one, one where it's like today; ShotCreate, one where __entity_type__ is the only required;ShotRead, one where everything is optional;ShotUpdate. But that's a larger change I'm not entirely sure of the benefits and drawbacks of.

ffMathy commented 2 months ago

Yes, it's only with filter.

The problem with the proposed solutions is that they are all opt-in. You have to use the right approach. It's not inferred by default usage.

And that's basically the same as just manually casting it. You don't really get any value unless you remember to call it the right way.

What about having the default Shot type with entity type required, and then wrapping that in the client with a utility type "WithOptionalType" which makes it optional in those cases? Any drawbacks of that?

From what I see, it would:

Alternatively we could have an overload of the update and create methods. Today, they take two arguments mainly. The entity type as a string and an object containing the things to mutate.

What if there was another overload that accepted the non-wrapped T that has the entity type on it, and then in turn wouldn't require you to set the entity type as an argument. Let me know if you dont know what I mean, and I'll provide an example (typing from my phone right now so code examples are hard).