microsoft / rushstack

Monorepo for tools developed by the Rush Stack community
https://rushstack.io/
Other
5.83k stars 592 forks source link

[api-extractor] Internal Error: The found child is not attached to the parent AstDeclaration #1874

Open moltar opened 4 years ago

moltar commented 4 years ago

Please prefix the issue title with the project name i.e. [rush], [api-extractor] etc.

Is this a feature or a bug?

Please describe the actual behavior.

The API Extractor fails with:

Internal Error: The found child is not attached to the parent AstDeclaration

If the issue is a bug, how can we reproduce it? Please provide detailed steps and include a GitHub branch if applicable. Your issue will get resolved faster if you can make it easy to investigate.

You can see the failure on this PR:

https://github.com/ScaleLeap/amazon-mws-api-sdk/pull/33

What is the expected behavior?

For the tool not to fail.

If this is a bug, please provide the tool version, Node.js version, and OS.

octogonz commented 4 years ago

The bug is introduced by this expression:

amazon-mws-api-sdk\lib\error.d.ts

export declare const MWSApiError: Codec<{
    ErrorResponse: {
        Error: {
            Type: string;
            Code: "InputStreamDisconnected" | "InvalidParameterValue" | "AccessDenied" | "InvalidAccessKeyId" | "SignatureDoesNotMatch" | "InvalidAddress" | "InternalError" | "Internal Error" | "QuotaExceeded" | "RequestThrottled" | "ResourceNotFound" | "ScheduledPackageAlreadyExists" | "RegionNotSupported" | "ScheduleWindowExpired" | "InvalidOrderState" | "PickupSlotNotAvailable" | "AccessToFeedProcessingResultDenied" | "ContentMD5Missing" | "ContentMD5DoesNotMatch" | "FeedCanceled" | "FeedProcessingResultNoLongerAvailable" | "FeedProcessingResultNotReady" | "InputDataError" | "InvalidFeedSubmissionId" | "InvalidFeedType" | "InvalidRequest" | "NonRetriableInternalError" | "RetriableInternalError" | "AccessToReportDenied" | "InvalidReportId" | "InvalidReportType" | "InvalidScheduleFrequency" | "ReportNoLongerAvailable" | "ReportNotReady" | "DependencyFatalException" | "DependencyRetriableException" | "DependencyUnauthorizedException" | "InternalErrorFatalException" | "InvalidInputFatalException";
            Message: string;
            Detail: string | undefined;
        };
        RequestID: string | undefined;
        RequestId: string | undefined;
    };
}>;
declare type MWSApiError = GetInterface<typeof MWSApiError>;

It is a real pretzel of type algebra (sigh, fellas, what are we doin' 😸). Somehow merging a declaration with a self-referential type has caused the compiler to create a ts.Declaration object that belongs to two different ts.Symbol objects. The code in AstSymbolTable._fetchAstSymbol() assumes that this is impossible. That causes a later sanity check to fail.

Simplifying the declaration makes the problem go away. I also found a couple places that use import() types, which are not yet supported (issue https://github.com/microsoft/rushstack/issues/1050). After removing those, API Extractor completed successfully.

This bug is an interesting little puzzle that will be fun to debug when I get some time. :-)

As an aside, it is my opinion that the type system is not being used wisely by this API. The pattern seems to be coming from the purify-ts package. From the examples on their website, Purify takes simple structures and converts them into very complicated TypeScript types. This may appeal to developers who like to solve some algebra problems while coding, but it is perhaps not the best choice for a public API that ordinary citizens are meant to consume. Just one person's opinion.

That said, API Extractor is expected to support this sort of API.

moltar commented 4 years ago

Thanks @octogonz for your deep investigation.

ping @gigobyte

This may appeal to developers who like to solve some algebra problems while coding, but it is perhaps not the best choice for a public API that ordinary citizens are meant to consume.

What would be then a good choice that solves these problems:

  1. Parsing third party API output
  2. Validating it
  3. And typing, without duplication of declarations

Thanks.

gigobyte commented 4 years ago

@moltar

We shouldn't be exposing the MWSApiError codec, only this type: declare type MWSApiError = GetInterface<typeof MWSApiError>;

What would be then a good choice that solves these problems

If this library fails on this type, I don't see how you can use any library that gives you a free type from the validation object. The check in purify is identical to io-ts and other libraries are no simpler: zod, typebox

@octogonz

This may appeal to developers who like to solve some algebra problems while coding, but it is perhaps not the best choice for a public API that ordinary citizens are meant to consume.

The result of the GetInterface type is an ordinary interface, consumers of the library have no way to notice the difference between it and a regular handwritten interface. I don't see your point here.

octogonz commented 4 years ago

The purify-ts website gives this example:

const User = Codec.interface({
  username: string,
  age: number
})

type User = GetInterface<typeof User> // <---- ?!

Maybe one way to reduce complexity would be to NOT merge the declarations:

const User = Codec.interface({
  username: string,
  age: number
})

type IUser = GetInterface<typeof User>

The TypeBox and zod libraries that you referenced both seem to take this approach in their examples.

octogonz commented 4 years ago

What would be then a good choice that solves these problems:

1. Parsing third party API output
2. Validating it
3. And typing, without duplication of declarations

If we want to generate an API docs website, a tool like API Extractor will expect to see something like this:

/**
 * Represents a user of the service
 */
interface IUser {
  /**
   * The name of the user
   */
  username: string;

  /**
   * The age of the user
   */
  age: number;
}

The interface is a stereotypical form. For example, the website template knows how to display interface inheritance, and it will show special sections for each of the member properties.

The GetInterface<typeof User> expression is theoretically equivalent to interface, but in practice it is not a recognizable form, so there isn't a straightforward way to write doc comments for it.

The purify-ts patterns are actually pretty well-behaved, so we could think of them as proprietary stereotypes. Maybe we could write an API Extractor plugin that recognizes GetInterface and can document it. But in general, non-stereotypical code can take unpredictable shapes and is challenging to document. This lack of structure can also make the API more difficult for newcomers to understand. (For example, I had read the Purify manual before I finally understood that string in the Purify codec is not JavaScript's string.)

A different way to solve your 3 requirements above might be to use a preprocessor tool. There are various possible approaches:

Whichever way we go, this preprocessor could produce a conventional interface declaration, with doc comments that API Extractor can parse to make an API website. And that might provide a better experience for other scenarios like VS Code IntelliSense, refactoring, etc.

TypeScript's type system is amazingly powerful and can solve many problems. But it's not always the best solution. Ultimately this is a subjective call that depends on the audience for your code. My own preference is to write for a very simpleminded audience, because when I'm in a hurry and trying to read someone else's code, I am the simpleminded one. :-)

octogonz commented 4 years ago

If this library fails on this type, I don't see how you can use any library that gives you a free type from the validation object.

@gigobyte The bug occurs because of something unusual about how type MWSApiError = GetInterface<typeof MWSApiError>; references itself and is also a merged declaration. API Extractor was able to process all the other codecs without any trouble.

gigobyte commented 4 years ago

@octogonz TypeScript is able to differentiate between values and types of the same name. In this case const MWSApiError and type MWSApiError are two different things and are neither self-referential nor a merged declaration.

The Codec module in purify-ts allows you to write a runtime validation of a type, which is why it's useful to be able to write it once and get the type for free. The alternative is:

const User: Codec<User> = Codec.interface({
  username: string,
  age: number
})

interface User {
  username: string,
  age: number
}

as you can see it's not exactly optimal to write this so basically all widely-used validation libraries provide a type that can generate the interface for you.

I had read the Purify manual before I finally understood that string in the Purify codec is not JavaScript's string.

Just a side note, JS doesn't have a global string variable, otherwise it would conflict with purify's exports. You are maybe thinking about the global String, Number, Boolean etc.

moltar commented 4 years ago

I am not too familiar with TS internals, but I think a useful handling of libraries like purify, io-ts and the like, for the cases of documentation, is to be able to "flatten" the complex object-like type/interface.

I am pretty sure TypeScript can already do that, because if you apply a complex type to a variable, the intellisense does not get confused and will hint correct props to you.

This is a similar class of a problem that I ran into with typedoc, where the resulting API documentation is not very useful, if you are using libraries like purify and io-ts.

IMO we will see more and more of these libraries and more use of them in the long run. So the tooling should be able to adapt and not get in the way.

I'm tracking most, if not all, libraries and benchmarks in my repo.

IMO this validate & type guard approach is the future of client-server relationship.

octogonz commented 4 years ago

TypeScript is able to differentiate between values and types of the same name. In this case const MWSApiError and type MWSApiError are two different things and are neither self-referential nor a merged declaration.

@gigobyte Consider this declaration:

export enum Color {
  Red
}

export namespace Color {
  export function parse(name: string): Color {
    return Color.Red;
  }
}

If we are being pedantic, we could try to say that when enum Color "merges" with namespace Color, it is only the value that merges; the enum type is not merging with anything. But when people talk about an enum they generally do not imagine the value and type to be divorced concepts. Nor do API Extractor and the compiler's own analysis model it that way. The namespace and the enum become a single ts.Symbol with two ts.Declarations, without regard to the type/value distinction.

Thus in the Purify example, it's fair to say that const User and type User do indeed undergo declaration merging.

This merging is legal, and it is a bug that API Extractor gets confused by it. We'll fix the bug. But this bug arises because the compiler's own internal analysis is rather complicated because the type is rather complicated. Complicated stuff tends to invite trouble.

The Codec module in purify-ts allows you to write a runtime validation of a type, which is why it's useful to be able to write it once and get the type for free. The alternative is:

const User: Codec<User> = Codec.interface({
  username: string,
  age: number
})

interface User {
  username: string,
  age: number
}

as you can see it's not exactly optimal to write this

Agreed. But this was not one of the 3 alternatives I suggested above. :-)

I had read the Purify manual before I finally understood that string in the Purify codec is not JavaScript's string.

Just a side note, JS doesn't have a global string variable, otherwise it would conflict with purify's exports. You are maybe thinking about the global String, Number, Boolean etc.

Sorry for the confusion. I meant that string in the Purify codec is not [the TypeScript type for] JavaScript's string.

octogonz commented 4 years ago

IMO we will see more and more of these libraries and more use of them in the long run. So the tooling should be able to adapt and not get in the way.

@moltar How should that work? Would you expect to write the /** */ doc comments inside the Codec.interface({ }) body? How would a documentation tool recognize that?

API Extractor parses .d.ts files (instead of .ts files), so it will not see doc comments unless the compiler writes them into the output .d.ts files. The .d.ts files represent the shipped API contract for a library, thus it's normally reasonable to assume that any important contractual information needs to end up there.

moltar commented 4 years ago

How would a documentation tool recognize that?

I am not sure, but VS Code does recognize that somehow, at least for io-ts and runtypes which I've tried. Haven't tried documenting with purify-ts yet.

octogonz commented 4 years ago

@moltar If you get a chance to investigate this and provide more technical details, please share what you find.

michkot commented 3 years ago

There is a pattern similar to the one in this issue: "as-const" enum-object + type merged to the same name, which seems to be logical replacement pattern for enums (because enums seem to be advised against). For example TypeDoc recently implemented a way to visualize them as enums. Is there any estimate on how much work is needed for api-extractor to not fail this pattern, which I hoped would be used quite a lot?

alexhwoods commented 4 months ago

We're also hitting this problem in our TS project, and we can't tell what set of statements is causing it.