vega / ts-json-schema-generator

Generate JSON schema from your Typescript sources
MIT License

Future of schema generators #101

Open domoritz opened 5 years ago

domoritz commented 5 years ago

This is a discussion about the future of Typescript JSON schema generators.

TL;DR

domoritz commented 5 years ago

From @HoldYourWaffle

I have created issues for most of the problems I encountered in YousefED/typescript-json-schema. I had to switch to that module because conditional types (particularly Omit) are not supported yet.

But there's something else I want to discuss. There are currently 3 modules that accomplish basically the same goal (at least that I know of), and they all have a lot of issues. vega/ts-json-schema-generator seems to be the best overall solution, featuring clean(er) code and proper type alias support. However, it doesn't support conditional types (#100), which means Omit, Pick, Exclude etc. aren't supported either (#71, #93). This makes this module completely unusable for me because I heavily use these language features. YousefED/typescript-json-schema does support these constructs, but it contains a lot of bugs (I have reported 8 so far). There's also xiag-ag/typescript-to-json-schema, but I have no idea what the difference is between this module and its 'ancestor' (the README only says that this is an 'extended' version).

I see 3 possible solutions for this forkmania (though there could be more):

  1. Fix YousefED/typescript-json-schema. This can sort of be seen as the 'default' option, as it would leave the current situation mostly untouched. This isn't my preferred solution because the architecture of vega/ts-json-schema-generator looks a lot better and it doesn't really solve the forkmania.

  2. Write a new generator from scratch, using the knowledge and experience from the other three. If we were to do this we'd of course have the amazing power of hindsight and shared knowledge, which would allow us to write a clean, well-designed and future-proof generator. However, this would be very time-consuming and since vega/ts-json-schema-generator is already very well designed I don't see much reason to do this.

  3. Merge all the good parts into vega/ts-json-schema-generator and 'open it up' for general usage. I think this would be the best option because it will actually fix the fork issue without consuming a lot of time and effort. I think this module currently has the best/cleanest code and I don't see a reason for using one of the others if we increased flexibility and supported more use cases than what vega is doing.

If you're interested I can write up a more detailed overview in a couple of hours. No matter what option you/we choose I'd love to help/fix/develop/maintain.

HoldYourWaffle commented 5 years ago

Good to see there's interest in my proposal! Perhaps it would be good to pin this issue so more people will see this and hopefully voice their opinion on the matter?

domoritz commented 5 years ago

Thank you for the comments. I will use this issue to explain some of the different philosophies behind the different libraries.

The goal of all of these libraries is to convert Typescript to JSON schema. There are different approaches to achieving this and also different interpretations that you can follow. One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

YousefED/typescript-json-schema

This was the original schema generator that I picked for my main use case, which is Vega-Lite. The philosophy of this library was to be flexible and configurable for different use cases. This means that some configurations work better than others because optimizing all of them is complex.

It worked fairly well once I extended it a bit to deal with some of the more complicated types we use. However, I constantly ran into issues with getting meaningful properties that correspond to aliased types. The fundamental problem is that this library uses the type hierarchy, and not the AST to create the JSON schema. Therefore I went on to look for a different library, which was xiag-ag/typescript-to-json-schema.

I stopped active development but occasionally review PRs and make releases. I consider this library to be in maintenance mode.

xiag-ag/typescript-to-json-schema

This library was mostly written by @mrix and uses the AST to create the schema. It is also much more modular. Rather than having one big file, there are separate modules for parsing and code generation for all different types of AST nodes and types. I had to extend it significantly to make it work for Vega-Lite. That's what vega/ts-json-schema-generator is.

vega/ts-json-schema-generator

This is the extended version of xiag-ag/typescript-to-json-schema. Besides extending the supported AST nodes, I also changed some of the behavior because it was not correct for some of my use cases. Some of these improvements have since been ported back into the original library but not all.

Overall, this library is much more robust than YousefED/typescript-json-schema and produces generally better schemas. It is also much more opinionated. I don't want to have options but instead, make particular design decisions and bake them into the code.

Even though this library still has some missing functionalities, it works in production for Vega-Lite, which is a complex piece of TypeScript. I keep fixing this library if it is necessary for Vega-Lite but otherwise don't have the cycles to do anything else.

I am happy to review PRs if they have sufficient tests, don't introduce options, and provide useful additions (I will reject support for functions because there is no clean mapping to JSON schemas).

I consider this library to be in active development and would be more than thrilled to have people help with it.

Conclusion

I have considered writing a new generator based on the things I have learned. Ideally, it would be a bit leaner than vega/ts-json-schema-generator and not have as many files. However, I don't have the cycles to do it and I'm not convinced that it will be much cleaner. Overall, I think investing resources into vega/ts-json-schema-generator would be the best way forward. I am happy to review PRs and make timely releases for this library. However, my number one priority is to support Vega-Lite and anything that is in conflict with that goal will be rejected (I don't see this as an issue, though).

Let me know what you think.

HoldYourWaffle commented 5 years ago

Since your comment is pretty big I'm just going to respond per section.


One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

This is of course true. I think the best way to handle this would be to have a section in the README that clearly lists all constructs that are not or only partially supported. It's annoying to discover something you need is unsupported, but discovering it after you've already started using an automated generator setup is way worse (I can unfortunately speak from experience).

Maybe we should also look into a way to manually override the generated schema in a fluent way. This way any shortcomings of the library can be manually filled in or corrected. In my own projects I've been using a script that manually changes the generated JSON object but this is very inflexible and error prone. I haven't really thought about how to implement such a mechanism, perhaps a JSDoc annotation with a JSON pointer could be something? I'll think about it.
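A minimal sketch of what the JSON pointer idea could look like, assuming overrides are supplied as a plain map from pointer to replacement value. The names `applyOverrides` and `parsePointer` are hypothetical and not part of any of the libraries discussed here; the sketch only walks object keys (no array indices) and refuses to replace the schema root.

```typescript
// Hypothetical helper: patch a generated schema via JSON Pointer paths (RFC 6901).
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Split "/definitions/Foo/properties/bar" into unescaped reference tokens.
function parsePointer(pointer: string): string[] {
  if (pointer === "") return [];
  if (pointer.charAt(0) !== "/") throw new Error("Invalid JSON pointer: " + pointer);
  return pointer
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// Return a deep copy of `schema` in which every pointer target is replaced by its override.
function applyOverrides(schema: Json, overrides: { [pointer: string]: Json }): Json {
  const result: Json = JSON.parse(JSON.stringify(schema));
  for (const pointer of Object.keys(overrides)) {
    const tokens = parsePointer(pointer);
    if (tokens.length === 0) throw new Error("Refusing to replace the schema root");
    let node: Json = result;
    // Walk down to the parent of the target, following object keys only.
    for (const token of tokens.slice(0, -1)) {
      if (node === null || typeof node !== "object" || Array.isArray(node) || !(token in node)) {
        throw new Error("Pointer does not resolve: " + pointer);
      }
      node = node[token];
    }
    if (node === null || typeof node !== "object" || Array.isArray(node)) {
      throw new Error("Pointer does not resolve: " + pointer);
    }
    node[tokens[tokens.length - 1]] = overrides[pointer];
  }
  return result;
}

// Usage: tighten a generated string property without touching the generator.
const generated: Json = {
  type: "object",
  properties: { name: { type: "string" } },
};
const patched = applyOverrides(generated, {
  "/properties/name": { type: "string", minLength: 1 },
});
console.log(JSON.stringify(patched, null, 2));
```

Keeping the overrides outside the TypeScript sources is only one option; the same pointer-to-value map could in principle be collected from annotations instead.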


This means that some configurations work better than others because optimizing all of them is complex.

I'm not sure I understand what you mean by this. Are you trying to say that more options → more complexity → hard to get working correctly? I agree that having more options might increase complexity, but there should always be a sensible default behavior. Having more options to override common sense because its assumptions don't match your use case is a good thing in my opinion.


It is also much more opinionated. I don't want to have options but instead, make particular design decisions and bake them into the code.

I'm not sure I agree with you on this. Having sensible defaults is always a good thing, and making some assumptions when designing something like this is necessary, but I don't see how this would hinder adding options. Could you give an example of an option you've rejected/don't want to add? I'm probably just misunderstanding what you're saying.


I keep fixing this library if it is necessary for Vega-Lite but otherwise don't have the cycles to do anything else. I am happy to review PRs if they have sufficient tests, don't introduce options, and provide useful additions (I will reject support for functions because there is no clean mapping to JSON schemas). I consider this library to be in active development and would be more than thrilled to have people help with it.

I'd love to help you maintain this project if you don't have the time for it! Again I'm not sure why you'd want to reject new options, could you clarify what you mean by this? And out of curiosity: why do people want to send functions over JSON? There's no function type in the JSON spec, nor can I think of a usecase where one would want to. If you really wanted to put functions in your JSON you can use Function.toString with eval on the other side, but this is pretty unsafe in almost all cases.
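To illustrate the point that JSON has no function type: the standard `JSON.stringify` silently drops function-valued properties, so there is nothing for a schema to describe. The object below is a made-up example.

```typescript
// A made-up payload with one data property and one function property.
const payload = { name: "scale", transform: (x: number) => x * 2 };

// Function-valued properties are omitted during serialization.
const serialized = JSON.stringify(payload);
console.log(serialized); // {"name":"scale"}
```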


Ideally, it would be a bit leaner than vega/ts-json-schema-generator and not have as many files. However, I don't have the cycles to do it and I'm not convinced that it will be much cleaner.

This is the main reason why I think a new generator isn't the best option. Is there a reason why we can't make vega/ts-json-schema-generator leaner without rewriting the whole thing? Also, what's wrong with having more files? It makes it a lot easier to find what you're looking for, as well as leaving less room for weird global state anti-patterns.


Overall, I think investing resources into vega/ts-json-schema-generator would be the best way forward. I am happy to review PRs and make timely releases for this library. However, my number one priority is to support Vega-Lite and anything that is in conflict with that goal will be rejected (I don't see this as an issue, though).

I also think this is the best way forward. I don't see a reason why there would be conflicts with Vega, since the goal of a general purpose library is to support most (if not all) usecases, Vega included of course.


If we decide to adopt this strategy there's one more issue remaining: uniting the modules/forks to fix the current forkmania. I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module (as soon as it's ready of course, mainly looking at conditional types).

xiag-ag/typescript-to-json-schema is a different story though. I found this PR by you that aims to merge the 2 repositories together, but as you already know there hasn't been any response in 2 years. It seems like @mrix has disappeared from the community, which of course doesn't help our case.

The main reason why I want the forks to be united is that the current situation is really confusing. Last week I basically went like this:

Removing YousefED/typescript-json-schema from the equation would help a lot. We may never get a response from @mrix, but we can remove this repository's forked status or add a clear explanation in the README on what the differences are and why you should probably use this module instead of the upstream one.


On a completely different note, maybe it's a good idea to create an issue in YousefED/typescript-json-schema referencing this issue and pin it there too. Since that module has more users we're probably going to get more responses then.

domoritz commented 5 years ago

One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

This is of course true. I think the best way to handle this would be to have a section in the README that clearly lists all constructs that are not or only partially supported. It's annoying to discover something you need is unsupported, but discovering it after you've already started using an automated generator setup is way worse (I can unfortunately speak from experience).

The issue here really is JSON schema and not Typescript. I have found good ways around things that are missing in Typescript, such as maxLength, but the other way around is much messier. For example, JSON schema supports neither union types nor inheritance properly.

This means that some configurations work better than others because optimizing all of them is complex.

I'm not sure I understand what you mean by this. Are you trying to say that more options → more complexity → hard to get working correctly? I agree that having more options might increase complexity, but there should always be a sensible default behavior. Having more options to override common sense because its assumptions don't match your use case is a good thing in my opinion.

More options increase the number of paths through the code and make it harder to get right. I am definitely against adding more code paths just to support another use case. Instead, the defaults should be good.

As an example, including or not including aliases or not using a top-level reference have implications on many parts of the code. I can speak from experience that not having the config options avoided a bunch of headaches.

I am not against the ability to configure things that have only implications on localized pieces of the code.

And out of curiosity: why do people want to send functions over JSON? There's no function type in the JSON spec, nor can I think of a usecase where one would want to.

I agree, and still we got PRs and issues for it: https://github.com/YousefED/typescript-json-schema/issues?utf8=%E2%9C%93&q=functions+

Is there a reason why we can't make vega/ts-json-schema-generator leaner without rewriting the whole thing?

No. I think the current design is good enough and I don't see any reason to rewrite.

I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module

It's already in maintenance mode and I think that's what it should be. A forward link sounds good to me but then I want support with issues that people report ;-)

xiag-ag/typescript-to-json-schema has a few features that we should get working in this fork. See https://github.com/vega/ts-json-schema-generator/issues/63

On a completely different note, maybe it's a good idea to create an issue in YousefED/typescript-json-schema referencing this issue and pin it there too.

Go ahead. I will pin it. I already added a note to https://github.com/YousefED/typescript-json-schema#background. Before we can deprecate the other library, we need support for conditionals here.

HoldYourWaffle commented 5 years ago

The issue here really is JSON schema and not Typescript. I have found good ways around things that are missing in Typescript, such as maxLength, but the other way around is much messier. For example, JSON schema supports neither union types nor inheritance properly.

You really did a good job expressing stuff like minLength! It's very fluent and it just makes a lot of sense. It's also self-documenting by definition, which is always a good thing. I'm not sure why union types would be a problem, isn't that just oneOf? Inheritance is a common issue with JSON schema, but since this is an automated tool it's probably not that bad to have some duplication in the output schema (perhaps adding a description field with the original information would be good for clarity/readability?).

It should also be noted that without additionalProperties: false it's perfectly possible to express inheritance using allOf, so if we really wanted to support a form of inheritance preservation this could be an option, but I think this will get needlessly complex very quickly. Since this is an issue with the JSON schema spec itself and not with this module I think a clear explanation on why this isn't possible in the README would suffice for now.


More options increase the number of paths through the code and make it harder to get right. I am definitely against adding more code paths just to support another use case. Instead, the defaults should be good. As an example, including or not including aliases or not using a top-level reference have implications on many parts of the code. I can speak from experience that not having the config options avoided a bunch of headaches. I am not against the ability to configure things that have only implications on localized pieces of the code.

That makes a lot of sense. So something like --strictTuples (is it the default yet?) or --strictNullChecks (however misleading it may be) would be fine? I agree that configuring the overall shape of the schema will get messy very quickly, but as far as I can see there's no reason why options for individual constructs should be rejected.


I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module

It's already in maintenance mode and I think that's what it should be. A forward link sounds good to me but then I want support with issues that people report ;-)

That makes sense, but is there really a reason why someone would want to use the other module once we include all missing features here? It's always good to keep providing support for something, but is it really worth the effort if this were to be a (practically) drop-in replacement?


xiag-ag/typescript-to-json-schema has a few features that we should get working in this fork. See #63

I'd love to help, but I can't figure out what new features we're missing since there are so many changes in the vega version. If you can give me some kind of list I'd be more than happy to take a look.


Go ahead. I will pin it.

Done. I also created an issue in @mrix's repository in case there are more lost souls like I was.

domoritz commented 5 years ago

I'm not sure why union types would be a problem, isn't that just oneOf

I meant intersection types. allOf does not work because of additionalProperties: false as you noted in inheritance as well.
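The conflict between allOf and additionalProperties: false can be reproduced with a toy validator. This is a sketch that handles only the handful of keywords discussed here; `MiniSchema`, `has` and `validate` are hypothetical names, and a real JSON Schema validator does far more.

```typescript
// Toy schema model covering only the keywords needed for this illustration.
interface MiniSchema {
  type?: "object" | "string" | "number";
  properties?: { [name: string]: MiniSchema };
  required?: string[];
  additionalProperties?: boolean;
  allOf?: MiniSchema[];
}

function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function validate(schema: MiniSchema, value: unknown): boolean {
  // Every allOf branch is checked independently against the whole value.
  for (const sub of schema.allOf || []) {
    if (!validate(sub, value)) return false;
  }
  if (schema.type === "string") return typeof value === "string";
  if (schema.type === "number") return typeof value === "number";
  if (schema.type === "object") {
    if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
    const obj = value as { [key: string]: unknown };
    const props = schema.properties || {};
    for (const name of schema.required || []) {
      if (!has(obj, name)) return false;
    }
    for (const key of Object.keys(obj)) {
      if (has(props, key)) {
        if (!validate(props[key], obj[key])) return false;
      } else if (schema.additionalProperties === false) {
        return false; // a property this branch does not know about
      }
    }
  }
  return true;
}

const named: MiniSchema = {
  type: "object", properties: { name: { type: "string" } }, required: ["name"], additionalProperties: false,
};
const aged: MiniSchema = {
  type: "object", properties: { age: { type: "number" } }, required: ["age"], additionalProperties: false,
};
const person = { name: "Ada", age: 36 };

// Closed branches: `named` rejects `age`, `aged` rejects `name`, so nothing can satisfy both.
const closedResult = validate({ allOf: [named, aged] }, person);

// Open branches (additionalProperties allowed): the intersection works as expected.
const openNamed: MiniSchema = { type: "object", properties: named.properties, required: ["name"], additionalProperties: true };
const openAged: MiniSchema = { type: "object", properties: aged.properties, required: ["age"], additionalProperties: true };
const openResult = validate({ allOf: [openNamed, openAged] }, person);

console.log(closedResult, openResult); // false true
```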

So something like --strictTuples (is it the default yet?) or --strictNullChecks (however misleading it may be) would be fine?

Yep. I think my request to keep the number of code paths low is reasonable and I think you agree.

That makes sense, but is there really a reason why someone would want to use the other module once we include all missing features here?

In the future, yes.

I'd love to help, but I can't figure out what new features we're missing since there are so many changes in the vega version. If you can give me some kind of list I'd be more than happy to take a look.

See https://github.com/vega/ts-json-schema-generator/compare/master...xiag-ag:master.

HoldYourWaffle commented 5 years ago

Yep. I think my request to keep the number of code paths low is reasonable and I think you agree.

Of course! I was just wondering where your "line" was on what's too complicated, glad to hear it's in a very reasonable place.

See master...xiag-ag:master.

I tried to look through it, but I fear I'm just not well versed enough in the codebase to know what changed, what hasn't been done here already and what is even applicable to our version. Maybe we could copy over the tests that were added, see which ones fail and go from there? I'd love to try it but I'll have to figure out the test infrastructure first, which is of course going to take some time.

I meant intersection types. allOf does not work because of additionalProperties: false as you noted in inheritance as well.

The more I use JSON schema the more I think "How have they not solved this yet?". I think from a spec perspective there are 2 logical solutions:

  1. Ignore additionalProperties in an allOf clause. I can't think of a use-case where this would be actually useful behavior.
  2. Allow additional keys on a $ref "schema" to supplement/override the reference.

These "ideas" aren't very useful from a generator perspective of course since no validator supports them. The only solution I see is to duplicate & merge the inherited and inheriting schemas but this would probably get really messy really quickly (both in the code and in the output). Maybe adding a description to the frankensteined output would help? I feel like there should be a better solution though... How is this currently handled?

domoritz commented 5 years ago

How is this currently handled?

I merge the objects in the allOf into one big object.
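A rough sketch of what such a merge might look like, not the generator's actual code. `ObjectSchema` and `mergeObjectSchemas` are hypothetical names, and two simplifications are assumed: when two parts declare the same property the later one wins (a true intersection would have to intersect the property types), and the result is always closed with additionalProperties: false.

```typescript
// Simplified shape of an object schema, as seen in the generated output in this thread.
interface ObjectSchema {
  type: "object";
  properties: { [name: string]: unknown };
  required: string[];
  additionalProperties: boolean;
}

// Flatten the members of an intersection into a single closed object schema.
function mergeObjectSchemas(parts: ObjectSchema[]): ObjectSchema {
  const merged: ObjectSchema = { type: "object", properties: {}, required: [], additionalProperties: false };
  for (const part of parts) {
    for (const name of Object.keys(part.properties)) {
      merged.properties[name] = part.properties[name]; // assumption: last declaration wins
    }
    for (const name of part.required) {
      if (merged.required.indexOf(name) === -1) merged.required.push(name);
    }
  }
  return merged;
}

const namedPart: ObjectSchema = {
  type: "object", properties: { name: { type: "string" } }, required: ["name"], additionalProperties: false,
};
const agedPart: ObjectSchema = {
  type: "object", properties: { age: { type: "number" } }, required: ["age"], additionalProperties: false,
};

const mergedSchema = mergeObjectSchemas([namedPart, agedPart]);
console.log(JSON.stringify(mergedSchema, null, 2));
```

The merged schema accepts an object with both properties, at the cost of no longer recording which member each property came from.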

HoldYourWaffle commented 5 years ago

Are there any issues with this approach apart from messy output?

domoritz commented 5 years ago

If you want to generate code from the schema again, the information about intersections and inheritance is lost.

HoldYourWaffle commented 5 years ago

That makes sense. Maybe adding some kind of note to the schema could help with that?

ForbesLindesay commented 5 years ago

The problem with deprecating YousefED/typescript-json-schema is that it's the only one that handles conditional types properly. Supporting them requires doing type inference that is on a par with TypeScript itself. Without using getTypeAtLocation, it is incredibly difficult to keep up with the fast pace of TypeScript language improvement.

I think there could be enormous value in refactoring typescript-json-schema to be more modular, and paring down the list of options to reduce the complexity of the many code paths. There also could be merit to using an AST first approach - i.e. do what we can via traversing the AST, where alias refs will work really well, and only fall back to getTypeAtLocation for complex/generic types.

I would be quite interested in taking on this task (I already maintain typescript-json-validator, which is a wrapper around typescript-json-schema), but I have held off so far for two reasons:

  1. I don't want to contribute to there being "yet another fork".
  2. I don't think the AST based approach is likely to bear a lot of fruit.

Having said that, I have also started work on typeconvert, which aims to do type inference on babel ASTs to convert between TypeScript and Flow (and generate documentation, JSON Schema etc.) Unfortunately it's nowhere near ready for release yet though as I keep realising I've made a mistake and need to fundamentally refactor.

domoritz commented 5 years ago

Thank you for your message. I agree that a hybrid approach might work well but it's hard to say until we have a working implementation. For me personally, my concern is that I can generate schemas for Vega-Lite. Maybe you can use that as a test case for another implementation of a hybrid schema generator?

HoldYourWaffle commented 5 years ago

I think a hybrid approach could definitely be a good solution. I don't see how this would contribute to the 'yet another fork problem', since I think this can just be integrated into one of the existing ones (preferably this one of course)? I don't know much about the internals of either code base though so I might be completely wrong here.

Supporting them requires doing type inference that is on a par with TypeScript itself

Perhaps it's possible to reuse the logic TypeScript itself is using? Visual Studio Code has (in my experience) near perfect "reflection" on TypeScript code so it should be possible. I think VS Code uses something like this, maybe that's something worth looking into? Again I'm really not qualified to make any well founded argument about this but I try to help as much as I can.

sparebytes commented 5 years ago

Maybe we can do something like this: ast -> io-ts -> json schema. io-ts does a good job representing types at runtime. Looks like their v2.0 milestone includes generating json schema.

Excuse me if this is way off base, I can't read the whole thread right this minute.

kayahr commented 5 years ago

According to this table mapping types, conditional types or even specific types like Exclude or Omit are not supported in io-ts. So for converting ast to io-ts we still have to do all the complex mapping/condition resolving as we do it now. So nothing gained here in my opinion.

And what about annotations? Currently it is very easy to pass them from typescript to the JSON schema. With io-ts in between this will probably be more difficult.

I think adding io-ts into the chain just adds more complexity and slows down the project.

codler commented 4 years ago

I was about to start trying out Typescript to JSON Schema and then I saw this thread. I am now confused about which library I should use. Where are we at today and which one do you recommend?

domoritz commented 4 years ago

@codler I added a tldr to the first message. Does that help?

codler commented 4 years ago

@domoritz Thank you that helped!

nonara commented 4 years ago

Hi all. Could anyone tell me exactly what does not work with the getTypeAtLocation version?

I've written a working ObjectLiteral parser and have AsExpression in the works, but as I do, it just seems more and more like the wrong route to go in not using the built in TypeChecker functionality, especially as we start dealing with objects, etc. The primary reason is that the language changes and evolves. Take, for example, the addition of 'as const'. This drastically impacts the derived type.

This means as things change or are added, we're constantly recreating TypeScript's work, which is both inefficient and bound to encounter bugs, as we have to try to determine and replicate all of the nuance each time, which means digging through source code.
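As a concrete illustration of the 'as const' point, the same object literal yields two quite different derived types. The variable names below are made up for the example; the type comments describe what the compiler infers.

```typescript
// Same object literal, with and without `as const`.
const widened = { mark: "bar", size: [1, 2] };
const narrowed = { mark: "bar", size: [1, 2] } as const;

// typeof widened  is { mark: string; size: number[] }
// typeof narrowed is { readonly mark: "bar"; readonly size: readonly [1, 2] }

// Compiles only because `as const` preserved the literal type "bar".
const mark: "bar" = narrowed.mark;

// Compiles only because `as const` turned the array into a two-element tuple.
const pair: readonly [number, number] = narrowed.size;

// Without `as const` the same assignments are rejected by the compiler:
// const m: "bar" = widened.mark;            // string is not assignable to "bar"
// const p: [number, number] = widened.size; // number[] is not assignable to a tuple

console.log(mark, pair.length, widened.size.length);
```

An AST-only generator has to notice the `as const` assertion itself and re-derive these rules, whereas the type checker already applies them.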

I completely understand @domoritz's point of not having the time to rewrite things. With that said, my situation is currently the opposite. I'm full-time writing a library (TS extension via ts-patch) which allows type validation during runtime via schema generation and ajv. The intent isn't to be just yet another fork, but rather, as some have suggested, a hybridized version which combines all of what we know.

To that end, I've already slimmed down and optimized the majority of this library and am close to completing it. What would be very helpful for me is to have the exact cases in which the former library which uses getTypeAtLocation fails to work. Also, if anyone would like to join me, please let me know. I'd be glad to collaborate with you on what I believe will be an incredibly useful TS extension!

Thanks, again, to all who've worked on this and committed their time so far!

domoritz commented 4 years ago

Hi @nonara. Thank you for your comment. Your assessment of the situation is accurate. It's a lot of work to maintain this library and keep up with the development of TypeScript. Moreover, we have bugs because I haven't kept up with the development or misinterpreted the AST.

For a bit of context, I have worked on both https://github.com/YousefED/typescript-json-schema and https://github.com/vega/ts-json-schema-generator. While the former was a lot simpler, I struggled to get aliases right and put effort into the latter instead.

My use case is to generate a JSON schema for Vega-Lite. It is super important that the schema is readable and that means that type names from TypeScript appear in the JSON schema as well. We use the names of TS types both for our documentation and also to generate Altair. ts-json-schema-generator works quite well for my use case but I have run into bugs from time to time (as well as bad aliases).

All of this is to say that I am not dogmatic about not using getTypeAtLocation. I personally just haven't figured out a way to make it work well with aliases. However, the last time I really looked into this was a few years ago and I didn't know much about the TypeScript compiler then. I would be super thrilled to use a simpler approach that uses type inference provided by the compiler rather than walking over the AST (which is difficult and error prone).

cspotcode commented 4 years ago

Is it possible for a maintainer to post a minimal example of the issue with aliases? I posted two examples below; are either or both of those cases accurate?

I've worked with the compiler APIs before, so I'm not a beginner, and I don't mind if the explanation dives into compiler internals.


Example 1

export interface SomeInterface {
    foo: string;
    bar: number;
}
type SomeAliasIWantToAppearInSchema = SomeInterface;

getTypeAtLocation cannot see SomeAliasIWantToAppearInSchema; it can only see SomeInterface. The schema is incomplete.


Example 2

export interface SomeInterfaceA {
    foo: AliasedType;
}
export interface SomeInterfaceB {
    bar: AliasedType;
}
type AliasedType = {baz: 'biff'};

getTypeAtLocation says that bar is of type {baz: 'biff'}. It does not give any indication that there's an alias, so the aliased type is inlined in the schema twice. This causes undesirable duplication in the schema.

domoritz commented 4 years ago

If you want to see more test cases, I have accumulated a bunch of them over the years at https://github.com/vega/ts-json-schema-generator/tree/master/test/valid-data.

Your examples are good and they show that ts-json-schema-generator works well here. I would be happy about any solution involving getTypeAtLocation that can replicate this behavior.

Here is example 1 in ts-json-schema-generator

export interface SomeInterface {
    foo: string;
    bar: number;
}
export type SomeAliasIWantToAppearInSchema = SomeInterface;

$ ts-node ts-json-schema-generator.ts -p test.ts -t SomeAliasIWantToAppearInSchema
{
  "$ref": "#/definitions/SomeAliasIWantToAppearInSchema",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "SomeAliasIWantToAppearInSchema": {
      "$ref": "#/definitions/SomeInterface"
    },
    "SomeInterface": {
      "additionalProperties": false,
      "properties": {
        "bar": {
          "type": "number"
        },
        "foo": {
          "type": "string"
        }
      },
      "required": [
        "foo",
        "bar"
      ],
      "type": "object"
    }
  }
}

And example 2 with AliasedType exported (which is probably what you want and it avoids the issue @cspotcode pointed out).

export interface SomeInterfaceA {
    foo: AliasedType;
}
export interface SomeInterfaceB {
    bar: AliasedType;
}
export type AliasedType = { baz: "biff" };

$ ts-node ts-json-schema-generator.ts -p test.ts
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "AliasedType": {
      "additionalProperties": false,
      "properties": {
        "baz": {
          "enum": [
            "biff"
          ],
          "type": "string"
        }
      },
      "required": [
        "baz"
      ],
      "type": "object"
    },
    "SomeInterfaceA": {
      "additionalProperties": false,
      "properties": {
        "foo": {
          "$ref": "#/definitions/AliasedType"
        }
      },
      "required": [
        "foo"
      ],
      "type": "object"
    },
    "SomeInterfaceB": {
      "additionalProperties": false,
      "properties": {
        "bar": {
          "$ref": "#/definitions/AliasedType"
        }
      },
      "required": [
        "bar"
      ],
      "type": "object"
    }
  }
}

and with AliasedType hidden (not super important to support IMHO)

export interface SomeInterfaceA {
    foo: AliasedType;
}
export interface SomeInterfaceB {
    bar: AliasedType;
}
type AliasedType = { baz: "biff" };

you get

$ ts-node ts-json-schema-generator.ts -p test.ts
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "SomeInterfaceA": {
      "additionalProperties": false,
      "properties": {
        "foo": {
          "additionalProperties": false,
          "properties": {
            "baz": {
              "enum": [
                "biff"
              ],
              "type": "string"
            }
          },
          "required": [
            "baz"
          ],
          "type": "object"
        }
      },
      "required": [
        "foo"
      ],
      "type": "object"
    },
    "SomeInterfaceB": {
      "additionalProperties": false,
      "properties": {
        "bar": {
          "additionalProperties": false,
          "properties": {
            "baz": {
              "enum": [
                "biff"
              ],
              "type": "string"
            }
          },
          "required": [
            "baz"
          ],
          "type": "object"
        }
      },
      "required": [
        "bar"
      ],
      "type": "object"
    }
  }
}

nonara commented 4 years ago

Thank you both for your responses! @domoritz Is it safe to say, if we can get passes on everything in valid-types, it should work?

domoritz commented 4 years ago

I think it's a pretty comprehensive test set. I'm also open to changing some of the outputs (e.g. always output aliases even if they are not exported).

cspotcode commented 4 years ago

I played around with the compiler API last night. I tried extracting types solely from the type system, avoiding the AST entirely.

I created a program, then grabbed the exports for a given source file:

sourceFileNode = program.getSourceFileByPath()
sourceFileSymbol = typeChecker.getSymbolAtLocation(sourceFileNode)
exportsSymbols = typeChecker.getExportsOfModule(sourceFileSymbol)

Then I use getDeclaredTypeOfSymbol() whenever possible. For property declarations on an interface, getTypeOfSymbolAtLocation(symbol, sourceFileNode) is necessary since property declarations are a "value" symbol.

interface Foo {prop: MyAlias} worked correctly and gave me a reference to the MyAlias type / symbol.

But interface Foo {prop: Omit<Whatever, never>} did not work as expected. It gave me a Pick type, skipping over Omit. I guess if an alias points at another alias, the first is skipped over.


What if a schema generator was built as follows:

a) Extract type info by navigating symbols / types exclusively, without looking at the AST.

b) Have a list of special cases where the AST is consulted to see if the node is a ts.isTypeReferenceNode (an alias). If so, convert the TypeReferenceNode to a "$ref". If not, fall back to (a).

If (b) ever fails, the fallback is always (a). This means (b) only needs to deal with TypeReferenceNodes, because they might be aliases. Everything else is handled by the inference-driven (a). (b) can initially be a no-op, then built up over time.

If new TypeScript features are released, schemas will still be generated correctly. Updates to (b) might be required to achieve ideal alias behavior, but these updates are not necessarily blocking users from upgrading their TypeScript compiler.
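
The two-tier idea above can be sketched roughly as follows. This is only an illustration, not the implementation discussed in this thread: `TypeInfo`, `PropertyInfo`, `fromAliasNode`, and `fromTypeSystem` are hypothetical stand-ins for what the type checker and the AST would provide, so the block runs without the TypeScript compiler API.

```typescript
type Schema = Record<string, unknown>;

// Hypothetical, simplified view of a resolved type as the type checker might describe it.
interface TypeInfo {
  kind: "string" | "number" | "object";
  properties?: Record<string, PropertyInfo>;
}

// A property: its resolved type, plus the alias name if the AST node was a type reference.
interface PropertyInfo {
  resolved: TypeInfo;
  aliasName?: string;
}

// (b) AST special case: emit a $ref for a known alias, or undefined to request the fallback.
function fromAliasNode(prop: PropertyInfo, knownAliases: string[]): Schema | undefined {
  if (prop.aliasName !== undefined && knownAliases.indexOf(prop.aliasName) !== -1) {
    return { $ref: "#/definitions/" + prop.aliasName };
  }
  return undefined;
}

// (a) Inference-driven path: structural conversion that works without alias information.
function fromTypeSystem(type: TypeInfo, knownAliases: string[]): Schema {
  if (type.kind !== "object") return { type: type.kind };
  const source = type.properties ?? {};
  const properties: Record<string, Schema> = {};
  for (const name of Object.keys(source)) {
    const prop = source[name];
    // Try (b) first; whenever it declines, fall back to (a).
    properties[name] = fromAliasNode(prop, knownAliases) ?? fromTypeSystem(prop.resolved, knownAliases);
  }
  return { type: "object", properties };
}

const aliased: TypeInfo = { kind: "object", properties: { baz: { resolved: { kind: "string" } } } };
const someInterfaceA: TypeInfo = {
  kind: "object",
  properties: { foo: { resolved: aliased, aliasName: "AliasedType" } },
};

// With the alias known, foo becomes a $ref; with no aliases known, it is inlined.
const withRef = fromTypeSystem(someInterfaceA, ["AliasedType"]);
const inlined = fromTypeSystem(someInterfaceA, []);
```

Because (b) only ever returns a `$ref` or declines, an empty alias list degrades to the fully inlined output rather than failing, which is the property the proposal relies on.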

domoritz commented 4 years ago

Thank you for describing your approach. This sounds a lot more reliable and not too complicated. Let me know when you take a stab at it. I’m happy to even retire or replace this library with your code when it works for Vega-Lite.

HoldYourWaffle commented 4 years ago

Glad to see this library getting some love again, even though I don't fully understand the discussion anymore 🙃

cspotcode commented 4 years ago

What is a good place to ask about nitty gritty compiler internals? Is there a subreddit, a Slack community, or something where people like us congregate?

It's tough to ask questions about compiler internals because most people are users of typescript, and they never get into the compiler APIs.

For example, my question right now is if the public API lets me differentiate between true and false boolean literal types.

I'd also like to share what I've learned but don't have a great place to put it.

domoritz commented 4 years ago

For questions maybe https://discordapp.com/invite/typescript. For a place to put things, I just activated the wiki in this repo so you can use that.

nonara commented 4 years ago

Just discovered discord a few days ago. Compiler chat is great. Actual MSFT employees on the TS team. The other is https://gitter.im/Microsoft/TypeScript - but mentioning compiler usually gets crickets.

@cspotcode - Shoot me a PM on Discord when you have a second to chat. I believe I've found a much simpler approach. I'd love to chat a bit and see if you think it's viable and get an idea of what your use-case is.

domoritz commented 4 years ago

I'm super excited to see a simpler ts -> json schema compiler.

nonara commented 4 years ago

Hi all! I'm nearly finished. TypeAlias preservation works. I'm not going to go into detail on how just now, because I'd rather just get it done and hopefully we can end the fork-mania and many, many implementations trying to solve this problem.

I'd like to get a few opinions from the group, if anyone would care to share. Please have a read of where we're at so far. Questions will follow.

Background

Plugin-based use

The central library is built to run as a plugin (via ts-patch), which incorporates both a Transformer-based plugin and a Language Service plugin in one package. This gives us some powerful new features to make this work and feel more like a native function of TypeScript, itself.

Here's what it can do:

  1. Set your options in a local config file, and whenever tsc emits, your schema files emit as well. The default is to emit [filename].schema.json alongside your .js file; however, you can configure it to emit a single file, or even change the behaviour (more detail below).

  2. Because of the Language Service component of the plugin, your JSDoc annotations have intellisense, autocompletion, and IDE syntax errors.

Programmatic / CLI

Due to the extensibility described below, programmatic use should not be needed in the majority of cases, but it's still considered critical and the library is built around that concept.

A CLI is available as well for those who don't want to use ts-patch or want to trigger schema emit separately.

Extensibility

A lot of bloat seems to exist around different base options. I've seen a lot of people requesting new options for their edge cases. While I do believe that most are perfectly valid and understandable in their individual cases, it seems more sensible to offer a base set covering the most common uses and, beyond that, provide a few simple hooks that allow users to change how types are emitted.

Hooks

The following hooks are built in:

onTypeParse

Get information about each base type after it's parsed, and optionally modify how, where, and if it's emitted.

Signature: (schemaType: SchemaType) => SchemaType['output']

[SchemaType]:

Note: If a user wants to omit a type from emitting at all, they can return false

beforeEmit

Can modify a full schema file before it's emitted.

Signature: (files: SchemaFile[]) => SchemaFile[]

[SchemaFile]: (all fields mutable)

beforeParse

Provides list of filenames before they're parsed. Can be modified to alter what gets parsed / how.

Signature: (rootFileNames: string[], compilerOptions: CompilerOptions)
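
As a rough sketch of how the onTypeParse hook could be applied (the real SchemaType definition is not shown in this thread, so the field names below, `applyOnTypeParse`, and the "return nothing to keep the default" behaviour are assumptions made for illustration):

```typescript
// Hypothetical shapes; the actual SchemaType fields are not given in this thread.
interface SchemaTypeOutput {
  propertyPath: string;
  schema: Record<string, unknown>;
}

interface SchemaType {
  name: string;
  kind: "interface" | "type";
  output: SchemaTypeOutput;
}

interface Hooks {
  // Return a modified output, false to omit the type, or undefined to keep the default (assumed).
  onTypeParse?: (schemaType: SchemaType) => SchemaTypeOutput | false | undefined;
}

// Run the hook over every parsed type and drop the ones the hook rejects.
function applyOnTypeParse(types: SchemaType[], hooks: Hooks): SchemaType[] {
  const kept: SchemaType[] = [];
  for (const parsedType of types) {
    const result = hooks.onTypeParse ? hooks.onTypeParse(parsedType) : undefined;
    if (result === false) continue; // the hook asked to omit this type
    kept.push({ ...parsedType, output: result ?? parsedType.output });
  }
  return kept;
}

const parsed: SchemaType[] = [
  { name: "User", kind: "interface", output: { propertyPath: "exports", schema: { type: "object" } } },
  { name: "SecretToken", kind: "type", output: { propertyPath: "exports", schema: { type: "string" } } },
];

const emitted = applyOnTypeParse(parsed, {
  onTypeParse: (schemaType) => {
    if (/^Secret/.test(schemaType.name)) return false;
    if (schemaType.kind === "interface") return { ...schemaType.output, propertyPath: "Interfaces" };
    return undefined;
  },
});
```

Here `emitted` contains only `User`, rerouted to the `Interfaces` path. beforeEmit and beforeParse are not modelled.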

Options

I'd like to keep base options somewhat sparse, but if over 10% of users need to create hooks to accomplish their goal, then we're probably missing something we should have. Here is what we have so far:

type EmitTarget = 'exported' | 'tagged' | 'internal' | 'all'

export interface TypeSchemaOptions {
  /** When set, all types are emitted to a single file */
  outFile?: string,

  /** Default is 'all' (any inclusion of 'all' in the array behaves as if 'all' were the sole target) */
  emitTarget?: EmitTarget | EmitTarget[]

  /** Default is tsconfig's `outDir` */
  outDir?: string,

  /** 
    * Default: `<packageName>/~/`
    */
  uriRoot?: string,

  /** (See below) (Default: true) */
  addRelativePathToURI?: boolean,

  /** When true, no $id tag is added to schema files (Default: false) */
  noFileIds?: boolean,

  /** Prevent $comment from being added from JSDoc description */
  noDescriptionComment?: boolean,

  hooks?: { /* described above */ }
}
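
The emitTarget rule in the comment above (default 'all', and 'all' anywhere in the array wins) could be normalized with something like the following sketch; `normalizeEmitTarget` is a hypothetical helper name, not part of the proposed API.

```typescript
type EmitTarget = "exported" | "tagged" | "internal" | "all";

// Collapse the option into a plain list: the default is 'all', and any occurrence of 'all' wins.
function normalizeEmitTarget(emitTarget?: EmitTarget | EmitTarget[]): EmitTarget[] {
  if (emitTarget === undefined) return ["all"];
  const list = Array.isArray(emitTarget) ? emitTarget : [emitTarget];
  if (list.indexOf("all") !== -1) return ["all"];
  // Drop duplicates while keeping first-seen order.
  return list.filter((target, index) => list.indexOf(target) === index);
}

const defaultTargets = normalizeEmitTarget();
const mixedTargets = normalizeEmitTarget(["exported", "all"]);
const taggedOnly = normalizeEmitTarget("tagged");
```

Normalizing once up front means the rest of the emitter only ever deals with an array.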

Default Behaviour

The following are the current design choices for the library

Annotations

Instead of many tags, each on their own line, I propose that we keep it simple and limit it to a few. (Remember that we have intellisense, completion, and IDE validation, so objects behave like objects, even if they're in JSDoc)

Note: You might want to scroll down and read example section first to see real-use cases, then come back to this

We will support the following tags:

@schema

Super-impose custom schema onto generated schema

Type: Schema definition
Valid for: Type Declarations & Properties
Output: { ...generatedSchema, ...annotatedSchema }

@schema.base

Serve as base for generated schema

Type: Schema definition
Valid for: Type Declarations & Properties
Output: { ...annotatedSchema, ...generatedSchema }

@schema.lock

Serve as sole schema definition. Generated schema is not used

Type: Schema definition
Valid for: Type Declarations & Properties
Output: { ...annotatedSchema }

@schema.emit

If emitTarget includes tagged, this will be emitted

Type: Emit flag
Valid for: Type Declarations

@schema.exclude

Will not be included in emitted schema

Type: Emit flag
Valid for: Type Declarations & Properties

@schema.optional

Mark property optional

Type: Emit flag
Valid for: Properties

@schema.required

Mark property required

Type: Emit flag
Valid for: Properties

@schema.noRef

Will not create a ref element, instead, it will resolve the type.

Type: Emit flag
Valid for: Properties

@schema.noDescriptionComment

Don't use JSDoc description for $comment

Type: Emit flag
Valid for: Declarations & Properties
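
The three schema-carrying tags differ only in merge order, which a small sketch can make concrete. `mergeAnnotation` and `MergeMode` are hypothetical names; the "lock" branch follows the prose description of @schema.lock (the annotated schema is the sole definition and the generated schema is not used).

```typescript
type Schema = Record<string, unknown>;
type MergeMode = "schema" | "base" | "lock";

// Combine the generated schema with the JSDoc-annotated one according to the tag used.
function mergeAnnotation(generated: Schema, annotated: Schema, mode: MergeMode): Schema {
  switch (mode) {
    case "schema":
      return { ...generated, ...annotated }; // @schema: annotation wins on conflicts
    case "base":
      return { ...annotated, ...generated }; // @schema.base: generated wins on conflicts
    case "lock":
      return { ...annotated }; // @schema.lock: generated schema is ignored
  }
}

const generatedSchema: Schema = { type: "string", minLength: 1 };
const annotatedSchema: Schema = { minLength: 3, pattern: "^[a-z]+$" };

const superimposed = mergeAnnotation(generatedSchema, annotatedSchema, "schema");
const based = mergeAnnotation(generatedSchema, annotatedSchema, "base");
const locked = mergeAnnotation(generatedSchema, annotatedSchema, "lock");
```

With these inputs, `minLength` ends up as 3 under @schema and @schema.lock, and as 1 under @schema.base; only the locked result loses `type`.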

Examples

/**
 * Username Type
 * @schema { regex: /^[a-zA-Z0-9_]+$/ }
 */
type UserName = string;

/**
 * Will not emit at all
 * @schema.exclude
 */
type HiddenType = { a: number }

type AnotherType = { b: string }

/**
 * @schema.emit
 */
interface Abc {
  /**
   * This property won't be included
   * @schema.exclude
   */
  hiddenProp: number,

  /**
   * Schema will reflect this property as a number (not an object), and it will be optional
   * @schema.optional
   * @schema.lock { type: 'number' }
   */
  someProp: { a: number },

  /**
   * Will be a ref and will be required in output schema
   * @schema.required
   */
  otherProp?: UserName

  /**
   * Will NOT be a ref, instead it will resolve the schema for HiddenType
   */
  anotherProp: HiddenType

  /**
   * Will NOT be a ref, because of tag, will resolve to schema for AnotherType
   * @schema.noRef
   */
  anotherProp2: AnotherType
}

Output

Some things to keep in mind

Highlights

Example

<SchemaFile>[]

const schemaFiles: SchemaFile[] = [
  // Comes from ./main.ts
  {
    'fileName': '/full_path_to/my_pkg/main.schema.json',
    'schema': {
      '$id': '@pkg_scope/my_pkg/~/main.schema.json',
      '$schema': 'http://json-schema.org/draft-07/schema#',
      'exports': {
        'ABCD': {
          'anyOf': [
            { '$ref': '@pkg_scope/my_pkg/~/main.schema.json#/internal/A' },
            { '$ref': '@pkg_scope/my_pkg/~/main.schema.json#/external/B' },
            { '$ref': '@pkg_scope/my_pkg/~/lib/helpers.schema.json#/exports/C' },
            { '$ref': 'sub_package/~/main.schema.json#/external/D' }
          ]
        },
      },
      // Internal is used for all non-exported types
      'internal': {
        'A': { 'type': 'string' }
      },
      // External are references to types which exist outside of the package root files (or those supplied programmatically)
      'external': {
        'B': { 'type': 'string' }
      }
    }
  },
  // Comes from ./lib/helpers.ts
  {
    'fileName': '/full_path_to/my_pkg/lib/helpers.schema.json',
    'schema': {
      '$id': '@pkg_scope/my_pkg/~/lib/helpers.schema.json',
      '$schema': 'http://json-schema.org/draft-07/schema#',
      'exports': {
        'C': { type: 'string' }
      }
    }
  },
  // Comes from ./packaged/sub-package/main.ts
  // Note: This file has its own package.json, therefore the sub-package name is used.
  {
    'fileName': '/full_path_to/my_pkg/packaged/sub-package/main.schema.json',
    'schema': {
      '$id': 'sub_package/~/main.schema.json',
      '$schema': 'http://json-schema.org/draft-07/schema#',
      'exports': {
        'D': { type: 'string' }
      }
    }
  }
];
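
The `$id` values in the example follow the default uriRoot pattern `<packageName>/~/` plus the file's path relative to the package root. A minimal sketch of that composition is below; `schemaIdFor` is a hypothetical helper, and the addRelativePathToURI option is not modelled because its exact behaviour is not described here.

```typescript
// Build a schema $id from the package name and a source path relative to the package root.
// uriRoot defaults to '<packageName>/~/', matching the default stated in the options.
function schemaIdFor(packageName: string, relativeSourcePath: string, uriRoot?: string): string {
  const root = uriRoot ?? packageName + "/~/";
  const relative = relativeSourcePath
    .replace(/^\.\//, "") // drop a leading './'
    .replace(/\.tsx?$/, ".schema.json"); // main.ts -> main.schema.json
  return root + relative;
}

const mainId = schemaIdFor("@pkg_scope/my_pkg", "./main.ts");
const helpersId = schemaIdFor("@pkg_scope/my_pkg", "./lib/helpers.ts");
```

These two calls reproduce the `$id` strings shown for main.ts and lib/helpers.ts in the example above.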

Questions

First, thank you for participating! The hope is for this to be a forum to decide on the final standard for the library that we can all be happy with moving forward.

I will be finishing this over the next week - possibly a little longer. Let's discuss and figure out what works best for everyone.

Some outstanding questions I have are:

cspotcode commented 4 years ago

Is there anything I'm missing?

Nice! Where's the code? It's tough to assess without throwing it at my team's codebase and seeing what comes out.

Flags

Is there an example of where we set flags in a config file? Do they go in our tsconfig, inside the plugin object?

nonara commented 4 years ago

Nice! Where's the code? It's tough to assess without throwing it at my team's codebase and seeing what comes out.

Not live yet. The idea was to open the floor up on the proposed standard, first. I will post the source shortly, when it's ready to use. My main goal is to make sure that the wider audience is good with the extensibility and proposed default behaviours.

I'm hoping that we can end the fork-mania, so I'm choosing to wait until it's complete to post the source.

Flags

I meant to re-label those as options.

After some thought, and because we're allowing for hooks (which will be in the form of functions), I propose that we load the options from a file in package root, schema.config.ts. By using TS, we can also allow for intellisense completion.

An example file might look like this:

export = <SchemaOptions> {
  noFileIds: true,
  hooks: {
    onTypeParse: (schemaType: SchemaType) => {
      // Route interfaces to a special path
      if (schemaType.kind === 'interface') return { ...schemaType.output, propertyPath: 'Interfaces' }
      // Don't include Types that start with 'Secret'
      if (/^Secret/.test(schemaType.name)) return false;
    }
  }
}

domoritz commented 4 years ago

@nonara I like your design decisions! Making the library extensible and the core opinionated sounds like a good approach for the one library to rule them all.

As others, I would need to play with the actual implementation before committing to a new library. Let us know when you have something to test out. If we all think the new library is the way forward, I am happy to archive ts-json-schema-generator and typescript-json-schema.

nonara commented 4 years ago

@domoritz Glad to hear it! Thanks for the reply. I will update everyone again soon on the progress.

maneetgoyal commented 4 years ago

Thanks @nonara for the nice and detailed background on the upcoming tool. Looking forward.

Is there anything I'm missing?

In my project, we are using a monorepo setup, so it will be great if the extends property in tsconfig is respected. It may be the case already but just wanted to be sure. There was one issue reported too (https://github.com/YousefED/typescript-json-schema/issues/326) by @cspotcode.

nonara commented 4 years ago

@maneetgoyal Thanks for that suggestion! Normal plugin use would use what was loaded by tsc, but programmatic was lacking in that respect.

Looks like I can use getParsedCommandLineOfConfigFile in the TS compiler to make sure I cover those bases.

nonara commented 4 years ago

While I'm here, here's a quick update on the status:

The holiday season slowed me down quite a bit. I've also been caring for a family member after a surgery which meant loss of her dominant arm for a few months. <right hand man pun here />

I've also slowed down intentionally a bit, just to make sure I'm really thinking through any possible eventualities that haven't been covered in previous implementations as well as using the compiler API to the fullest. Basically, that's meant a lot of time studying TS compiler source and stepping through code to understand the intricacies of Type, Symbols, flags, etc.

The good news is, I'm into the final stretch. I'm just now wrapping up the last bits of the generator. It should be passing all of the valid-types tests and more in the next couple of days.

I'll be out for the week of the 12 - 18th, but with the holidays over, I should be able to dedicate the majority of my time to wrapping up. Barring any complication, I hope to have it up for everyone to test with their code-base around the end of the month.

Happy holidays!

domoritz commented 4 years ago

@nonara, I hope your family member is doing well again.

I am really excited about a cleaner implementation of this library and look forward to trying it.

domoritz commented 4 years ago

@nonara Is there a preview version we can play with? I'd love to take your schema generator for a spin and provide feedback.

nonara commented 4 years ago

@domoritz Thanks! It's just about ready to pre-release for review. Just wrapping up a few last bits, now. I've been able to be back on it full-time, and it's very close.

Looking forward to the input!

While I have you, I had a question regarding generics. I recall seeing an issue or two raised on the naming convention. Was there a consensus on naming?

I'm looking at either:

  1. MyGeneric<MyType, true>
  2. MyGeneric<T=MyType, TIsUnique=true>

Given type MyGeneric<T, TIsUnique extends boolean> = ...

I think having the var names might have some value and am inclined to go with option 2. Any thoughts/objections?

Also, IIRC, someone raised an issue with the < > symbols. Do we need to look at a different format?

domoritz commented 4 years ago

I'd prefer 1) since it's shorter and the position of the argument is sufficient.

Also, IIRC, someone raised an issue with the < > symbols. Do we need to look at a different format?

It's not a big issue. It's just that references need to be valid URIs, so you need to encode the string. Note that only the reference, not the definition name, needs to be a valid URI.
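
For illustration, here is one way the encoding could look, using the standard `encodeURIComponent` plus JSON Pointer escaping for `~` and `/`. `refForDefinition` is a hypothetical helper name, and the `#/definitions/` prefix is taken from the outputs shown earlier in this thread.

```typescript
// Definition names can stay readable; only the $ref fragment needs escaping.
function refForDefinition(name: string): string {
  // JSON Pointer escaping first ('~' -> '~0', '/' -> '~1'), then percent-encoding for the URI.
  const pointerToken = name.replace(/~/g, "~0").replace(/\//g, "~1");
  return "#/definitions/" + encodeURIComponent(pointerToken);
}

const definitionName = "MyGeneric<MyType,true>";
const ref = refForDefinition(definitionName);
// ref === "#/definitions/MyGeneric%3CMyType%2Ctrue%3E"
```

The key in `definitions` stays `MyGeneric<MyType,true>`; only the `$ref` string carries the encoded form.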

CaselIT commented 4 years ago

@nonara Any update on this? Is the code available?

nonara commented 4 years ago

@CaselIT Very close! I feel bad for the poor time estimate. As I continued in, I realized that it would be better served as a more well-built package. Some additional features now built in:

It can parse methods and functions as well. Here's a peek at the ts-extras metaschema:

/**
 * @see core/resources/type-schema-draft-01-example.ts
 */
export abstract class TsExtrasDraft_2020_04<TDraft extends IJsonSchemaDraft> implements ITsExtrasDraft {
  title = 'TS-EXTRAS';
  version = '2020_04';
  URI = ''; // TODO

  tsTypes = [ 'interface', 'class', 'object', 'type', 'method', 'function' ] as const;

  /* ********************************************************* */
  // region: Types
  /* ********************************************************* */

  abstract TsType: this['tsTypes'][number];

  abstract TypeParameter: {
    constraint?: TDraft['JsonDefinition']
    default?: TDraft['JsonDefinition']
    value?: TDraft['JsonDefinition']
  };

  abstract FunctionParameter: {
    name?: string
    type: TDraft['JsonDefinition']
    optional?: boolean
  };

  abstract FunctionSignature: {
    name?: string
    parameters: Array<TsExtrasDraft_2020_04<TDraft>['FunctionParameter']>
    returnType: TDraft['JsonDefinition']
    restParameter?: TsExtrasDraft_2020_04<TDraft>['FunctionParameter']
  };

  abstract Schema: TsExtras_2020_04<TDraft>

  // endregion
}

export interface TsExtras_2020_04<TDraft extends IJsonSchemaDraft> {
  $tsType?: TsExtrasDraft_2020_04<TDraft>['TsType']

  /**
   * $tsType: 'interface' | 'class'
   * Array of references
   */
  $heritageObjects?: Array<this>

  /**
   * $tsType: 'interface' | 'class' | 'object' | 'type' | 'method' | 'function'
   * Values make use of and $extends, default, supplied value is in value root
   */
  $typeParameters?: Record<string, TsExtrasDraft_2020_04<TDraft>['TypeParameter']>

  /**
   * $tsType: 'method' | 'function'
   */
  $functionSignature?: TsExtrasDraft_2020_04<TDraft>['FunctionSignature']

  /**
   * Object keyword
   */
  propertyOrder?: string[]
}

And an example output:

// TODO - Add heritage examples

// Class
export class ABC<G extends string> {
  a!: string;
  b!: G;
  c<T, B extends string = string, C = never>(p1: T, p2: B, p3?: number, ...args: any[]): C {
    return <any>null
  }
  d = function Hello<T>(a: any): T { return <any>null }
}

// Schema
const SchemaResult: JsonSchema = {
  '$tsType': 'class',
  '$typeParameters': {
    'G': {
      'constraint': { type: 'string' },
    }
  },

  'type': 'object',
  'properties': {
    'a': { type: 'string' },

    'b': { $ref: '#/$typeParameters/G' },

    'c': {
      '$tsType': 'method',
      '$typeParameters': {
        'T': {},
        'B': {
          'constraint': { type: 'string' },
          'default': { type: 'string' },
          'value': { '$ref': '#/properties/c/$typeParameters/B/default' }
        },
        'C': {
          'default': false
        }
      },
      '$functionSignature': {
        'parameters': [
          {
            'name': 'p1',
            'type': { $ref: '#/properties/c/$typeParameters/T' }
          },
          {
            'name': 'p2',
            'type': { $ref: '#/properties/c/$typeParameters/B' }
          },
          {
            'name': 'p3',
            'type': { type: 'number' },
            'optional': true
          }
        ],
        'restParameter': {
          'name': 'args',
          'type': true
        },
        'returnType': { $ref: '#/properties/c/$typeParameters/C' }
      },
    },

    'd': {
      '$tsType': 'function',
      '$typeParameters': {
        'T': {}  // Assumes anyAsTrue option set to false
      },
      '$functionSignature': {
        'name': 'Hello',
        'parameters': [
          {
            'name': 'a',
            'type': true
          }
        ],
        'returnType': { $ref: '#/properties/d/$typeParameters/T' }
      }
    }
  }
};
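
The `$ref` values in this output are same-document JSON Pointers, so a consumer needs to walk the schema to resolve them. A minimal resolver sketch follows; `resolveLocalRef` is a hypothetical helper, not part of the library, and the sample object is a cut-down copy of the `c` method entry above.

```typescript
// Resolve a same-document reference such as '#/properties/c/$typeParameters/B/default'.
function resolveLocalRef(root: unknown, ref: string): unknown {
  if (ref.charAt(0) !== "#") throw new Error("only same-document refs are handled: " + ref);
  const pointer = decodeURIComponent(ref.slice(1));
  if (pointer === "") return root;
  let current: unknown = root;
  for (const rawToken of pointer.split("/").slice(1)) {
    // Undo JSON Pointer escaping: '~1' -> '/', then '~0' -> '~'.
    const token = rawToken.replace(/~1/g, "/").replace(/~0/g, "~");
    if (typeof current !== "object" || current === null || !(token in current)) {
      throw new Error("unresolvable reference: " + ref);
    }
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}

const sampleSchema = {
  properties: {
    c: {
      $typeParameters: {
        B: {
          constraint: { type: "string" },
          default: { type: "string" },
          value: { $ref: "#/properties/c/$typeParameters/B/default" },
        },
      },
    },
  },
};

const resolvedDefault = resolveLocalRef(sampleSchema, "#/properties/c/$typeParameters/B/default");
```

A resolver like this also doubles as a quick consistency check: any `$ref` that points at a missing type parameter throws instead of silently resolving to nothing.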

All that said, I'm wrapping up a few last bits now in the build tools to keep the options and tags DRY.

It's a composite project with the plugin (TS compiler hook + LanguageService) separate, and I plan to make the code accessible before I do the plugin aspect, so everyone here can start testing and having a look!

As for when - I don't want to put my foot in my mouth again, but I can confidently say, very soon! I'm still working away. I'll keep everyone updated here!

domoritz commented 4 years ago

That's exciting @nonara. Do you support all test cases in https://github.com/vega/ts-json-schema-generator/tree/master/test? If not, what are the differences?

nonara commented 4 years ago

@domoritz Yes. All tests are supported! 🙂

domoritz commented 4 years ago

Awesome. I am excited to take your tool for a spin.