microsoft / tsdoc

A doc comment standard for TypeScript
https://tsdoc.org/
MIT License
4.75k stars 131 forks source link

RFC: Syntax for "declaration references" (hyperlinks) #9

Open octogonz opened 6 years ago

octogonz commented 6 years ago

Certain tags such as {@inheritdoc} contain references to other API items. AEDoc supports two notations for API item references:

This might be a reasonable starting point. However, it doesn't consider nesting members such as namespaces (which has turned out to be surprisingly common for legacy SDKs, despite all the downsides and attempts to deprecate them).

Also, AEDoc today overlooks an [interesting edge case](http://www.typescriptlang.org/play/#src=export%20class%20A%20%7B%0D%0A%20%20%20%20public%20b%3A%20string%20%3D%20'hello'%3B%0D%0A%7D%0D%0A%0D%0Aexport%20namespace%20A%20%7B%0D%0A%20%20%20%20export%20let%20b%3A%20number%20%3D%20123%3B%0D%0A%7D%0D%0A%0D%0Alet%20a%20%3D%20new%20A()%3B%0D%0A%0D%0A%2F%2F%20a.b%20is%20a%20string%0D%0A%2F%2F%20A.b%20is%20a%20number):

export class A {
    public b: string = 'hello';
}

export namespace A {
    export let b: number = 123;
}

let a = new A();

// a.b is a string
// A.b is a number

In this example, an expression such as {@link A.b} would be ambiguous. Which declaration is it referring to? How to handle this?

DovydasNavickas commented 6 years ago

Me and @MartynasZilinskas have a solution to this. We'll post more info a bit later today / tomorrow. We also have it working in ts-extractor's rewritten version according to what we talked about privately with you (ids, aforementioned ambiguity for same name, different types and not requiring JSON output). We also seem to have solved "declarations vs symbols" thingy and already extract both and also deal with them ids-wise.

P.S. You called hyperlinks what we call ids, but I'd rather we call them ids or identifiers, because we can use them in various situations, not only hyperlinks and they have to be unique and point to something specific, which is identifier.

aciccarello commented 6 years ago

How should internal references be looked up? TypeDoc isn't limited to a single API export so you could have multiple BaseWidgets that could match. Imports could give you a hint but I don't think we want the documentation parser to have to read those and it wouldn't work in all cases.

octogonz commented 6 years ago

@DovydasNavickas I've retitled this from "hyperlinks" to "API item references." This clarifies that we're not just talking about clickable hyperlinks. The name "ids" seemed a little too vague.

How should internal references be looked up? TypeDoc isn't limited to a single API export so you could have multiple BaseWidgets that could match.

Option 1: Require the names to be exported

Today API Extractor only allows references to exported items, using the official exported name. Today it doesn't support multiple entry points, but it would be fairly straightforward.

For example, if an API consumer has to write this:

import { Button } from '@scope/my-library/ui/controls';

Then the {@link} tag for an external consumer could look like this:

/**
 * {@link @scope/my-library/ui/controls:Button.render | the render() method}
 */

So for a local reference inside of @scope/my-library, we might expect it to disambiguate the entry point like this:

/**
 * {@link /ui/controls:Button.render | the render() method}
 */

Note that the TSDoc layer would not be directly involved in this. The @microsoft/tsdoc library would simply provide a standard notation and parser, which might return an AST node like this:

// {@link @scope/my-library/ui/controls:Button.render | the render() method}

{
  "nodeKind": "linkNode",
  "apiItemReference": {
    "scope": "@scope",
    "library": "my-library",
    "modulePath": "/ui/controls",
    "itemPath": "Button.render"
  },
  "linkText": {
    "nodeKind": "textNode",
    "content": "the render() method"
  }
}

Option 2: Rely on the compiler symbols

Another option would be to rely on the compiler symbols. This is fairly straightforward with today's compiler API, but only if the symbol is actually imported. Example:

import { Button as RenamedForSomeReason } from '../../ui/controls';

/**
 * {@link RenamedForSomeReason.render | the render() method}
 */
MartynasZilinskas commented 6 years ago
// index.ts file in `example-package`

// example-package/index:Foo#class
class Foo {
    // example-package/index:Foo#class.a#property
    public a: string;
}

// example-package/index:Foo#namespace
namespace Foo {
    // example-package/index:Foo#class.a#variable
    const a: number;
}

We ugpraded the format for specifying api item. {package-name}/{path-to-file}:{selector}#{kind}

I am thinking of adding an additional ApiItem symbol. It will respresent all declarations behind that symbol. For example:

"example-package/index:Foo": {
    "members": [
        "example-package/index:Foo#class", 
        "example-package/index:Foo#namespace"
    ]
}

When resolving id without specific kind part and that symbol has only single declaration, than Symbol ApiItem will act as alias to that single declaration.

// `example-package` at index.ts
namespace Foo {
    const a: number;
}

class Foo {
    /**
     * Property `a`.
     */
    public a: string;
}

// `other-package`
class Bar extends Bar{
    /**
     * {@inheritdoc example-package/index:Foo#class.a}
     */
    public a: string;
}

Specifying kind for Foo is mandatory because we have symbol that has two declarations.

The only problem I see is overloads, because they have same name and kinds.

P.S. Working delimiters !@#%^&*)+?;,..

octogonz commented 6 years ago

I like this idea of using # to disambiguate merged declarations. I was originally considering proposing something like :: vs . from C++ to distinguish static versus non-static members. But it would still leave open the question of how to refer to Foo as a namespace versus Foo as a class, an API reference site would probably want separate manual pages for these manifestations.

But I don't think you would ever need the #variable part. The a is unambiguous in example-package/index:Foo#class.a.

Property getters/setters are another example of multiple declarations for the same symbol, but I feel fairly strongly they should be treated as a single API item.

Overloads are an interesting case. C# /// comments would use a notation like {@link Widget.doSomething(string, string)} but that requires a full type system to resolve, and could be very verbose in TypeScript.

However, TypeScript requires all of the overloaded signatures to be right next to each other. They cannot be merged from different files. Also the ordering matters (since it determines matching precedence). So for this case, perhaps we could simply number them?

class Widget {

  // This is {@link Widget.doSomething#1}
  doSomething(name: string, age: number): void;

  // This is {@link Widget.doSomething#2}
  doSomething(options: { name: string, age: number }): void;

  // This is {@link Widget.doSomething#3}
  doSomething(): void;

  // This is the private implementation
  doSomething(whatever?: any, age?: number): void {
  }

}

It's not super elegant... but then neither are overloads. We don't use overloads all that much since the reflection check has a perf overhead, and the behavior gets weird with subclassing and such. It is a half-baked feature compared to languages such as Java or C++. API Extractor doesn't even give them separate manual pages; we treat overloads as a single function with a complicated type signature.

For people who really care about this, a more elaborate solution might be to provide a way to create named monikers. Something like this:

class Widget {

  /** @overload nameAndAge */
  doSomething(name: string, age: number): void;
  /** @overload options */
  doSomething(options: { name: string, age: number }): void;
  /** @overload none */
  doSomething(): void;

  doSomething(junk?: any, age?: number): void {
  }

}

/** The {@link Widget.doSomething#nameAndAge} is the first overload.

Any better ideas?

MartynasZilinskas commented 6 years ago

I upgraded ts-extractor to use this format of ids for ast items.

{package-name}/{path-to-file}:{selector}#{kind}&{itemCounter}

Code sample:

export function pickCard(x: { suit: string; card: number }[]): number;
export function pickCard(x: number): { suit: string; card: number };
export function pickCard(x: any): number | { suit: string; card: number } | undefined {
    const suits = ["hearts", "spades", "clubs", "diamonds"];
    // Check to see if we're working with an object/array
    // if so, they gave us the deck and we'll pick the card
    if (typeof x == "object") {
        let pickedCard = Math.floor(Math.random() * x.length);
        return pickedCard;
    } else if (typeof x == "number") {
        // Otherwise just let them pick the card
        let pickedSuit = Math.floor(x / 13);
        return { suit: suits[pickedSuit], card: x % 13 };
    }
}

This how ast item ids look like under the hood.

@simplrjs/package-name/index.ts // SourceFile
@simplrjs/package-name/index.ts:pickCard // AstSymbol
@simplrjs/package-name/index.ts:pickCard#function&1 // AstFunction
@simplrjs/package-name/index.ts:pickCard#function&1.x // AstSymbol
@simplrjs/package-name/index.ts:pickCard#function&1.x#parameter&1 // AstParameter
@simplrjs/package-name/index.ts:pickCard#function&2 // AstFunction
@simplrjs/package-name/index.ts:pickCard#function&2.x // AstSymbol
@simplrjs/package-name/index.ts:pickCard#function&2.x#parameter&1 // AstParameter
@simplrjs/package-name/index.ts:pickCard#function&3 // AstFunction
@simplrjs/package-name/index.ts:pickCard#function&3.x // AstSymbol
@simplrjs/package-name/index.ts:pickCard#function&3.x#parameter&1 // AstParameter

We are using a flat Map for items registry, so items ids need to be unique.

But I don't think you would ever need the #variable part. The a is unambiguous in example-package/index:Foo#class.a.

If id referencing to a symbol and it only has 1 declaration, then you don't need to specify #{kind} part.

Simple example

/**
 * Says Hello world!
 */
export const a: string = "Hello World!";

/**
 * {@inheritdoc @simplrjs/package-name/index.ts:a}
 */
export const b: string = "-";

Example

export namespace A.B.C.D.E {
    /**
     * Says Hello world!
     */
    export const a: string = "Hello World!";
}

/**
 * {@inheritdoc @simplrjs/package-name/index.ts:A.B.C.D.E.a}
 */
export const b: string = "-";

However, TypeScript requires all of the overloaded signatures to be right next to each other. They cannot be merged from different files. Also the ordering matters (since it determines matching precedence). So for this case, perhaps we could simply number them?

If you need to inherit documentation from a specific overload, numbering is the way to go.

DovydasNavickas commented 6 years ago

@pgonzal said:

But I don't think you would ever need the #variable part. The a is unambiguous in example-package/index:Foo#class.a.

@MartynasZilinskas said:

If id is referencing a symbol and it only has 1 declaration, then you don't need to specify the #{kind} part.

I think we have situation, where we have two different goals:

But:

What I'm a bit more careful about is having those optional parts omitted and reference becoming ambiguous later on in the development, when say another declaration is added to the symbol (e.g. inherit doc or a link gets ambiguous).

On the other hand, that might even become a perk. If the reference becomes ambiguous later on, you might need to update the reference to a more specific one either way, so this could be a good signal that you really need to take a look into that older part and specify what exactly are you referring to.

Thoughts?

yume-chan commented 6 years ago

For the overload part, I strongly disagree using magic numbers to label them. If the performance is a concern, the doc generator may produce extra metadata to help distinguish them (by index, for example) for other tools, but the developer must have the ability to reference overloads by their signatures.

The #{type} syntax looks strange, how about just (class)Foo.(property)a? I think a pair of parentheses may not cause syntactic ambiguities. Or maybe <class>Foo.<property>a like a TypeScript cast operator.

MartynasZilinskas commented 6 years ago

@yume-chan The concern is not with performance, but how verbose can it become, like @pgonzal pointed out. The order of overload signatures matters, so numbering them makes sense.

Consider this example:

class Widget {
    doSomething(name: string, age: number): void; // Widget.doSomething(string, number)
    doSomething(options: { name: string; age: number }): void; // Widget.doSomething({ name: string; age: number })    
    doSomething(): void; // Widget.doSomething()
}

How reference will look like for this overload?

doSomething(options: { name: string; age: number }): void;

I see the real problem when you have whole bunch of overloads (example). In this case, labeling would be convenient.

Usage:

@label <name>

Example:

class Widget {
    /**
     * Do something with name and age.
     * @label nameAndAge
     */
    doSomething(name: string, age: number): void; // Widget.doSomething#nameAndAge or Widget.doSomething&1
    doSomething(options: { name: string; age: number }): void; // Widget.doSomething&2
    /**
     * No arguments needed to do something.
     */
    doSomething(): void; // Widget.doSomething&3
}

class CustomWidget {
    /**
     * {@inheritdoc Widget.doSomething#nameAndAge}
     */
    doSomething(name: string, age: number): void;
    /**
     * {@inheritdoc Widget.doSomething&3}
     */
    doSomething(): void {}
}

All possible naming variants so far:

What do you think about @label @pgonzal? I think it represents better what it means than @overload.


The #{type} syntax looks strange, how about just (class)Foo.(property)a? I think a pair of parentheses may not cause syntactic ambiguities. Or maybe <class>Foo.<property>a like a TypeScript cast operator.

The #{kind} you're referring to is not a type. The idea here is to specify which syntax kind it represents when more than one declaration is available under the same symbol name. Plus, in ts-extractor we want to use these references as unique identifiers in AST items registry.

yume-chan commented 6 years ago

How reference will look like for this overload?

I now understand the problem is the complex type system in TypeScript. I came from C# and I have used to using signature to reference overloads.

labeling would be convenient.

I think label is a good idea.

The #{kind} you're referring to is not a type.

I know, I just used the word "type". But class is a function and namespace is an object, they are still "types", aren't them? Anyway, I suggest a new syntax to distinguish different node kinds by a TypeScript style type conversion operator. It's more familiar with TypeScript programmers and because the kind is in the front of name, it's more easily to read and understand.

DovydasNavickas commented 6 years ago

But class is a function and namespace is an object, they are still "types", aren't them?

Well, in TypeScript's realm they are declarations, not types. And their common symbol is what binds them.

I suggest a new syntax to distinguish different node kinds by a TypeScript style type conversion operator.

I think this might be a confusing analogy. Also, if we write (class)Foo and (namespace)Foo, it is in the wrong order semantically, because a symbol Foo might have two declarations: class and namespace. And therefore this disambiguation should be specified later in the reference:

Foo(class)
Foo(namespace)

But then it looks weird for me as it resembles a function call. And # is something that specifies things in urls and is more likely to be understood more naturally:

Foo#class
Foo#namespace

And if you define reference for property myProp in a class Foo, it would look:

Foo#class.myProp

Which goes as Foo#class and dot-notation into its members. If we look at Foo#class as a single unit and for the sake of argument name it as FooClass, this reference would become:

FooClass.myProp

And this notation is more than familiar for anyone having done programming.

Also, if we adopt the @label tag, we can forbid reserved keywords usage for labels, such as class and namespace, but let people use other strings without spaces, e.g. magical-overload. Then we could use the same # notation with labels for code:

class Foo {
   public myProp: string;
    /**
     * @label magical-overload
     */
   public myMethod(a: string): void;
   public myMethod(): void;
}

This would be the references:

Foo#class.myProp                    // <-- Property
Foo#class.myMethod#magical-overload // <-- Specific overload with a label `magical-overload`
octogonz commented 6 years ago

@DovydasNavickas how would your proposed notation handle these two toString methods?


class Vector {
  public x: number;
  public y: number;

  public toString(): string {
    return `${this.x}, ${this.y}`;
  }

  public static toString(x: string, y: string): string {
    return `${x}, ${y}`;
  }
}
MartynasZilinskas commented 6 years ago

We discussed with @DovydasNavickas offline about handling static members.

#{kind/label} format is no longer sufficient to specify only TypeScript syntax kinds and JSDoc @label tags. We think of allowing multiple keywords that could be added to the symbol name to specify all required details: #{kind/label/scope}.

// Package: simple-package

class Vector {
    public x: number;
    public y: number;

    // Full id: simple-package:Vector#Class.toString#ClassMethod
    public toString(): string {
        return `${this.x}, ${this.y}`;
    }

    // Full id: simple-package:Vector#Class.toString#Static#ClassMethod
    public static toString(x: string, y: string): string {
        return `${x}, ${y}`;
    }
}

class Vector2 {
    /**
     * Used relative id.
     * @inheritDoc Vector.toString#Static#ClassMethod
     */
    public static toString(x: string, y: string): string {
        return `${x}, ${y}`;
    }
}

Static method toString has two keywords:

In this case, we need to specify declaration's syntax kind, because otherwise, we won't know if developer might've made a mistake.

class Vector {
    public x: number;
    public y: number;

    // Full id: simple-package:Vector#Class.toString#ClassMethod
    public toString(): string {
        return `${this.x}, ${this.y}`;
    }

    // [2] Full id: simple-package:Vector#Class.toString#Static#ClassMethod
    public static toString(x: string, y: string): string {
        return `${x}, ${y}`;
    }
}

class Vector2 {
    /**
     * [1] Used relative id.
     * @inheritDoc Vector.toString 
     */
    public toString(): string {
        return `${this.x}, ${this.y}`;
    }
}

Should we throw an error about ambiguous identifier on [1] when the static method with the same name is later added at [2]?

This is the only shortcoming that we see at the moment.

MartynasZilinskas commented 6 years ago

In the newest prototype of ts-extractor, I encountered a few problems while generating AstItem's ids. I got more mature ideas about id format to be used.

The format that will be used in linking declarations doesn't change. {package-name}/{path-to-file}:{selector}:{types}

There will be 3 types of separators:

[:] Semicolon separator

Separator between location, declaration selector and type.

// Package: "package"
// File: "components.ts"

// Symbol: "TextField"
// Declaration: "Class"
export class TextField {}
Id:
package/components:TextField#Class

Parts:
package/components : TextField#Class
{package}          : {declarations}     

Separator between declaration selector and type selector.

// Package: "package"
// File: "helpers.ts" 

// Symbol: "resolveId"
// Declaration: "Function"
// ReturnType: "TypeBasic" -> string
export function foo(a: any): string;
Id:
package/helpers:foo#Function:TypeBasic

Parts:
package/helpers : foo#Function  : TypeBasic
{package}       : {declaration} : {type}

This id refers to foo function return type. It will not be used in hyperlinks, because we don't need to point to anything more specific than actual function. Return type can also change more often than function name, which means previous hyperlinks would become incorrect. To distinguish overloads, we'll use labels, but more on that later.

[.] Dot separator

Separate symbols

// Package: "package"
// File: "helpers.ts" 

// Symbol: "ComponentHelpers"
// Declaration: "Namespace"
export namespace ComponentHelpers {

    // Symbol: "foo"
    // Declaration: "Function"
    export function foo(a: any): string;
}
Id:
package/helpers:ComponentHelpers#Namespace.foo#Function

Parts:
{parent}                   . {child}
ComponentHelpers#Namespace . foo#Function

Separate types

// Package: "package"
// File: "helpers.ts" 

// Symbol: "Foo"
// Declaration: "TypeAlias"
// Type: "UnionType" -> `string | number`
export type Foo = string | number;
Id:
package/helpers:Foo#TypeAlias:UnionType.TypeBasic#1

Parts:
{parent}   .   {child}
UnionType  .   TypeBasic#1

[#] Hashtag separator

We can use different casings to distinguish hashtags:

Ordering could also be strict:

Simple example

// Package: "package"
// File: "components.ts"

// Symbol: "Foo"
// Declaration: "Class"
export class Foo {}
Id:
package/components:Foo#Class

Parts:
Foo      #Class
{name}   #{kind}

With static scope

// Package: "package"
// File: "components.ts"

// Symbol: "Foo"
// Declaration: "Class"
export class Foo {
    // Symbol: "bar"
    // Scope: "Static"
    // Declaration: "ClassProperty"
    public static bar: string;
}
Id:
package/components:Foo#Class.bar#Static#ClassProperty

Parts:
bar     #Static    #ClassProperty
{name}  #{scope}   #{kind}     

With Static scope and method overloads

// Package: "package"
// File: "components.ts"

// Symbol: "Foo"
// Declaration: "Class"
export class Foo {
    // Symbol: "bar"
    // Scope: "Static"
    // Declaration: "ClassProperty"
    // Counter: 0
    public static bar(x: string): string;    
    // Symbol: "bar"
    // Scope: "Static"
    // Declaration: "ClassProperty"
    // Counter: 1
    public static bar(): string;
}
Id:
package/components:Foo#Class.bar#Static#ClassMethod#0

Parts:
bar     #Static    #ClassMethod    #0  
{name}  #{scope}   #{kind}         #{counter}

Overload with a label

// Package: "package"
// File: "components.ts"

// Symbol: "Foo"
// Declaration: "Class"
export class Foo {
    // Symbol: "bar"
    // Scope: "Static"
    // Declaration: "ClassProperty"
    // Counter: 0
    public static bar(x: string): string;    
    // Symbol: "bar"
    // Scope: "Static"
    // Declaration: "ClassProperty"
    // Counter: 1
    // Label: "simple-label"
    /**
     * @label simple-label
     */
    public static bar(): string;
}
Id:
package/components:Foo#Class.bar#simple-label

Parts:
bar     #simple-label  
{name}  #{label}
octogonz commented 6 years ago

I find the notation here to be a little counterintuitive:

package/components:Foo#Class.bar#simple-label

Since we generally use InitialCaps for class names, the #Class part seems like an identifier rather than a system-defined kind. Also, since keywords like "class" or "static" normally precede the identifier in TypeScript declarations, in the expression Foo#Class.bar it seems like bar would be the class. We would ideally want the notation to feel like a suffix. The URL hashtag notation normally acts that way, but that sense might be lost in the middle of a big chain of symbols.

Here's a possible alternative since everyone expects [ ] to be a suffix operator:

@scope/my-package/components:Foo[class].member[static]
@scope/my-package/components:Foo[class].member[static,1]
@scope/my-package/components:Foo[class].member[static,SimpleLabel]

Ordering could also be strict:

  • #{kind}
  • #{kind}#{counter}
  • #{scope}#{kind}#{counter}

For the labels, if the developer is already doing the work to define labels to facilitate documentation generation, then they should be able to ensure that their labels are unique. Thus perhaps we would never need multiple hashtags? For example:

@scope/my-package/components:Foo[class].member[static]
@scope/my-package/components:Foo[class].member[static#1]
@scope/my-package/components:Foo[class].member[SimpleLabel]
MartynasZilinskas commented 6 years ago

I agree with using [ ] brackets.

yume-chan commented 6 years ago

Here's a possible alternative since everyone expects [ ] to be a suffix operator:

But what if the class does include a field called "class" or something not a valid JavaScript identifier?

octogonz commented 6 years ago

But what if the class does include a field called "class" or something not a valid JavaScript identifier?

For that case, I suppose quotation marks would be required and could disambiguate it:

class Example {
  public 'class': number;

  /**
   * Assigns the {@link Example['class']} member.
   */
  public assignClass() {
    this['class'] = 123;
  }
}

But is this a realistic documentation scenario?

Personally I have a clear boundary between "strange constructs that we sometimes need to get the job done" versus "typical constructs that I'd bother documenting". I'm open to other perspectives/counterexamples, though.

yume-chan commented 6 years ago

@pgonzal I think symbols will be a real use case, like Foo[Symbol.iterator] or some other user-defined symbols Foo[mySymbol].

And in my opinion, Foo[class] and Foo["class"] are still too similar and might cause problems for a user first see it.

octogonz commented 6 years ago

BTW I noticed that JSDoc calls this concept "namepaths": http://usejsdoc.org/about-namepaths.html

They use . for static members and # for non-static members, but that design seems a little underdeveloped for all the problems Martynas was tackling in his proposal above.

dend commented 6 years ago

In DocFX, we leverage the concept of UIDs - it would be interesting to see if that is something that we can plug in here as well, to make sure that writers/devs can easily leverage existing conventions.

octogonz commented 6 years ago

@MartynasZilinskas I am at the point where I need to implement a parser for the syntax discussed above. I'm going to make it part of the TSDoc library and spec. It sounds like we landed on essentially this:

@scope/my-package/components:Foo[class].member[static]
@scope/my-package/components:Foo[class].member[static#1]
@scope/my-package/components:Foo[class].member[SimpleLabel]

Did you have anything other corrections/enhancements that should be incorporated?

octogonz commented 6 years ago

In the newest prototype of ts-extractor, I encountered a few problems while generating AstItem's ids. I got more mature ideas about id format to be used.

BTW is this code that you could share? I looked around for it in the master branch but didn't see it.

MartynasZilinskas commented 6 years ago

@pgonzal I am developing on dev-next. It's a complete rewrite. Soon I will start implementing TSDoc.

Syntax looks good 👍 .

octogonz commented 6 years ago

After chatting with @RyanCavanaugh, we've decided to call this syntax a "declaration reference".

He expressed some concern about indexing based on "order of appearance in the source file", and favored the idea of defining explicit labels, such as the @overload tag above. He pointed out that the compiler's type matching for interfaces may not follow the order in which declarations appear.

octogonz commented 6 years ago

I worked through a bunch of interesting examples with the [] selector notation. It's too big to paste here, so I've created a PR to add it as a file under spec/code-snippets:

PR #64: [spec] Adding snippets for "declaration references"

Comments/feedback are welcome, even after the PR is merged. (The spec draft isn't ready yet, so this is just a guide to help me finish the parser prototype.)

@MartynasZilinskas @DovydasNavickas @yume-chan @aciccarello

octogonz commented 6 years ago

@yume-chan wrote:

@pgonzal I think symbols will be a real use case, like Foo[Symbol.iterator] or some other user-defined symbols Foo[mySymbol].

Could you provide a realistic example of this? I'm not 100% sure whether symbols are even part of the TypeScript type signatures. They seem more like metadata that can be attached to an object only at runtime.

And in my opinion, Foo[class] and Foo["class"] are still too similar and might cause problems for a user first see it.

I agree. I've updated the syntax to eliminate this ambiguity: Foo[class] is always some declaration corresponding to the semantic symbol Foo, whereas Foo.class or Foo."class" is always a different symbol that is a member of Foo.

octogonz commented 6 years ago

While working on the parser for @link tags, I encountered an ambiguity with the ":" delimiter. Recall that ":" separates the import path, like in these examples:

/**
 * {@link Button | link text}
 * {@link controls.Button | link text}
 * {@link controls[namespace].Button | link text}
 * {@link package:controls.Button | link text}
 * {@link package: | link text}  <-- linking to the package itself
 * {@link @scope/package:controls.Button | link text}
 * {@link @scope/package/path1/path2:controls.Button | link text}
 * {@link ../controls/Button:Button | link text}
 * {@link ./controls/Button:Button | link text}
 */

However, JSDoc also allows the @link target to be a URL instead of a declaration reference, and this is commonly used. Some examples:

/**
 * {@link http://example.com | link text}
 * {@link https://example.com | link text}
 * {@link //example.com | link text}
 * {@link mailto:John.Doe@example.com | link text}
 * {@link news:comp.infosystems.www.servers.unix | link text}
 * {@link tel:+1-816-555-1212 | link text}
 */

Problem: How to tell whether a string is a URL or a declaration reference? For HTTP URLs we can look for the "://" pattern, but e.g. mailto and news are valid NPM package names, and their URLs do not have "://" characters:

/**
 * Is this the "cancel" function in the "control" namespace from the "news" package?
 * {@link news:control.cancel | link text}
 * Is this the "webmaster" function from the "mailto" package?
 * {@link mailto:webmaster | link text}
 */

These are admittedly somewhat artificial examples, but as soon as you allow any scheme, who knows what other syntaxes might show up.

Possible solutions:

  1. Keep ":" as the package delimiter for declaration references, but restrict @link to only use URIs with "://" in them. If someone really needs a mailto: or tel: hyperlink, they can use an HTML <a> tag (which is already supported by TSDoc and CommonMark). OR

  2. Use a different delimiter such as "#". This way, if we see a colon after a (scheme-like) word, we can be sure the string is NOT a declaration reference.

For example, here's how it would look if we switch to "#" as the delimiter:

/**
 * {@link controls.Button | link text}
 * {@link package#controls.Button | link text}
 * {@link package# | link text}  <-- linking to the package itself
 * {@link @scope/package#controls.Button | link text}
 * {@link @scope/package/path1/path2#controls.Button | link text}
 * {@link ../controls/Button#Button | link text}
 * {@link ./controls/Button#Button | link text}
 */

Anyone have opinions? It's pretty subjective, but I think maybe I'm favoring option 1 (because the # delimiter looks like a URL hashtag, which makes me want to write {@link #Button} instead of {@link Button}, but that would lose the flavor of JSDoc).

DaSchTour commented 6 years ago

I like option 2 because it looks like URL hashtag. Like an anchorlink it points to a section of a bigger package/file.

nickpape commented 6 years ago

I like the usage of the pound symbol. It seems more readable and has the right intuitive meaning [i.e. because it looks like a URL hashtag].

MartynasZilinskas commented 6 years ago

@pgonzal Nice catch. Pound symbol looks good.

When I started rewriting examples above, I have encountered Type alias. Can you clarify in what format would it be written? Maybe we should write all available declarations at the moment to clarify the declaration kind format.


// Package: "package"
// File: "helpers.ts" 

// Symbol: "Foo"
// Declaration: "TypeAlias"
// Type: "UnionType" -> `string | number`
export type Foo = string | number;
(1) package/helpers#Foo[typeAlias]
(2) package/helpers#Foo[type]
yume-chan commented 6 years ago

@pgonzal wrote:

Could you provide a realistic example of this?

A quick search leads me to a well-known symbol: Symbol.replace.

From MDN

The Symbol.replace well-known symbol specifies the method that replaces matched substrings of a string. This function is called by the String.prototype.replace() method.

For more information, see RegExp.prototype[@@replace]() and String.prototype.replace().

You can see MDN uses RegExp.prototype[@@replace]() to refer a well-known Symbol member.

If I created a new Symbol to implement some pattern, and I have a reference implementation on some class, I need a method to link to it.


I'm not 100% sure whether symbols are even part of the TypeScript type signatures.

Symbols are definitely part of TypeScript type system. In lib.es2015.symbol.wellknown.d.ts, many built-in classes have Symbol members.


I found another problem: Symbols may introduce nested references:

/** I want to reference {@link MyClass[MySymbols[MyHiddenSymbols.value]] */
yume-chan commented 6 years ago

As a @link can link to either a URI or a declaration, will markdown links ([MyClass](Myclass)) supports this syntax as well?

octogonz commented 6 years ago

@MartynasZilinskas:

When I started rewriting examples above, I have encountered Type alias. Can you clarify in what format would it be written? Maybe we should write all available declarations at the moment to clarify the declaration kind format.

The selector would be [type]. The most complete set of worked out examples is in spec/code-snippets/DeclarationReferences.ts. It has an example of a type alias, but we can add more cases if there are questions.

As a @link can link to either a URI or a declaration, will markdown links ([MyClass](Myclass)) supports this syntax as well?

Let's discuss this as https://github.com/Microsoft/tsdoc/issues/70 since it's a separate topic.

I found another problem: Symbols may introduce nested references:

@yume-chan thanks for providing these links! I need to think about it a bit, then I'll follow up.

octogonz commented 6 years ago

@yume-chan thank you very much for bringing this to our attention! ECMAScript 6 symbols turned out to be a nontrivial monkey wrench for our proposed design, for a couple reasons:

Our team chatted about this for a couple hours today. We're now proposing to change the notation somewhat significantly as outlined in spec PR #71. Please take a look at that PR for full details.

But the basic idea is to replace this:

/**
 * OLD NOTATION
 * {@link my-package#Foo.member}
 * {@link my-package#Foo[class].member[static]}
 * {@link my-package#Foo[class].overloadedFunction[1]}
 * {@link my-package#Foo[class].overloadedFunction[CUSTOM_LABEL]}
 */

...with this:

/**
 * NEW PROPOSAL
 * {@link my-package#Foo.member}
 * {@link my-package#(Foo:class).(member:static)}
 * {@link my-package#(Foo:class).(overloadedFunction:1)}
 * {@link my-package#(Foo:class).(overloadedFunction:CUSTOM_LABEL)}
 */

And here's a full example of how ECMAScript 6 symbols get integrated into these expressions:

export namespace WellknownSymbolsM1 {
  /**
   * Shortest name:  {@link WellknownSymbolsM1.toStringPrimitive}
   * Full name:      {@link (WellknownSymbolsM1:namespace).(toStringPrimitive:variable)}
   */
  export const toStringPrimitive: unique symbol = Symbol();

  /**
   * Shortest name:  {@link WellknownSymbolsM1.toNumberPrimitive}
   * Full name:      {@link (WellknownSymbolsM1:namespace).(toNumberPrimitive:variable)}
   */
  export const toNumberPrimitive: unique symbol = Symbol();
}

export class ClassM2 {
  /**
   * Shortest name:  {@link ClassM2.[WellknownSymbolsM1.toStringPrimitive]}
   * Full name:      {@link (ClassM2:class).([(WellknownSymbolsM1:namespace).(toStringPrimitive:variable)]:instance)}
   */
  public [WellknownSymbolsM1.toStringPrimitive](hint: string): string {
    return '';
  }

  /**
   * Shortest name:  {@link ClassM2.([WellknownSymbolsM1.toNumberPrimitive]:instance)}
   * Full name:      {@link (ClassM2:class).([(WellknownSymbolsM1:namespace).(toNumberPrimitive:variable)]:instance)}
   */
  public [WellknownSymbolsM1.toNumberPrimitive](hint: string): string {
    return '';
  }

  /**
   * Shortest name:  {@link ClassM2.([WellknownSymbolsM1.toNumberPrimitive]:static)}
   * Full name:      {@link (ClassM2:class).([(WellknownSymbolsM1:namespace).(toNumberPrimitive:variable)]:static)}
   */
  public static [WellknownSymbolsM1.toNumberPrimitive](hint: string): string {
    return '';
  }
}

BTW I also noticed an slight oversight today: The real Symbol.toPrimitive is actually an ambient declaration, which we haven't considered in any of our examples so far. I'm going to work out the handling of ambient declarations tomorrow.

Anyway, although this (:) notation is a fairly significant design change, people seemed to find the resulting expressions more readable and intuitive than before, which was reassuring. But I'm interested to hear other opinions. What do you think?

octogonz commented 6 years ago

@iclanton FYI

aciccarello commented 6 years ago

Thanks for being thorough in looking at this. As a library maintainer I am concerned that with all the parenthesis the syntax isn't easy enough to explain which could lead to frequent issues being opened wondering why the link didn't work.

octogonz commented 6 years ago

One reassurance is that across nearly all real world APIs that I have worked on, the only case where these parentheses would be needed is for overloaded functions. (The "full name" shown above is mainly for illustrative purposes.) Declaration merging is fairly uncommon in practice except when writing typings for a legacy library which was not designed with TypeScript in mind.

That said, thanks for sharing this concern. The {@link} tag is a central feature of TSDoc so I want to get this right. I'll set up another design meeting tomorrow to see if anyone can think of any other notations that might be more concise/intuitive, while still being a complete and accurate representation. (BTW I wanted to punt on support for symbols entirely, but the more we researched it, the more it seemed like an important case.)

Feel free to post ideas on this thread as well of course.

octogonz commented 6 years ago

I chatted about with this with several different people this morning, but there doesn't seem to be any obvious way to simplify the notation while still supporting merged declarations and symbols.

The most obvious simplification would be to eliminate the parentheses. Without them, the parsing usually remains unambiguous:

  We are NOT proposing to use this notation:
  /**
   * Shortest name:  {@link ClassM2.[WellknownSymbolsM1.toNumberPrimitive]:static}
   * Full name:      {@link ClassM2:class.[WellknownSymbolsM1:namespace.toNumberPrimitive:variable]:static}
   */

However in complicated expressions, people found the code to be less readable/intuitive without the parentheses, especially if you're unfamiliar with the notation. We discussed making the parentheses optional (e.g. so you could shorten Button.(overloadedMethod:WITH_STRING) to Button.overloadedMethod:WITH_STRING in simple cases where the context makes the meaning obvious). But people inconsistent usage of parentheses would make the notation harder to learn.

If we were to remove the parentheses, we would need to reconsider the handling of anonymous constructs. Consider (:STRING_INDEXER) from this example:

export interface InterfaceL1 {
  /**
   * Shortest name:  {@link InterfaceL1.(:STRING_INDEXER)}
   * Full name:      {@link (InterfaceL1:interface).(:STRING_INDEXER)}
   *
   * {@label STRING_INDEXER}
   */
  [key: string]: number;

  /**
   * Shortest name:  {@link InterfaceL1.(:NUMBER_INDEXER)}
   * Full name:      {@link (InterfaceL1:interface).(:NUMBER_INDEXER)}
   *
   * {@label NUMBER_INDEXER}
   */
  [key: number]: number;

  /**
   * Shortest name:  {@link InterfaceL1.(:FUNCTOR)}
   * Full name:      {@link (InterfaceL1:interface).(:FUNCTOR)}
   *
   * {@label FUNCTOR}
   */
  (source: string, subString: string): boolean;

  /**
   * Shortest name:  {@link InterfaceL1.(:CONSTRUCTOR)}
   * Full name:      {@link (InterfaceL1:interface).(:CONSTRUCTOR)}
   *
   * {@label CONSTRUCTOR}
   */
  new (hour: number, minute: number);
}

If anyone else wants to propose a better syntax, please feel free to do so. Otherwise for now we'll probably proceed with the current proposal. For the record I feel like a better solution must exist; I'm just not smart enough to find it. :-P

octogonz commented 6 years ago

BTW after working with this notation for a while, I'm coming to the opinion that making the parentheses optional is not such a bad idea, and is a pretty good mitigation for the psychological impression that @aciccarello mentioned. There's no urgency to decide that now, but I'll mention it as a possibility in the spec.

octogonz commented 5 years ago

Update: https://github.com/Microsoft/web-build-tools/issues/881#issuecomment-461622924 led to an interesting discussion about how to name overloaded methods in the web site docs. This problem is closely related to the declaration reference selectors.

Let us know your opinion -- compare these 3 alternatives:

A. require a unique list of parameter names

calculate(calculationType)

function calculate(calculationType: CalculationType): void;

calculate(calculationTypeString)

function calculate(calculationTypeString: "Recalculate" | "Full"): void;

OR:

B. label selectors

calculate:ENUM

function calculate(calculationType: CalculationType): void;

calculate:STRING

function calculate(calculationType: "Recalculate" | "Full"): void;

OR:

C. index selectors

calculate:1

function calculate(calculationType: CalculationType): void;

calculate:2

function calculate(calculationType: "Recalculate" | "Full"): void;

 

 

Here's some screenshots @AlexJerabek shared that show a little more context:

overload_current

Having the types in the name would alleviate this problem (and look like .NET APIs, like this one).

overload_dotnet

rix0rrr commented 5 years ago

I just had the intent of linking to a package, like so:

{@link @aws-cdk/aws-lambda-event-sources}

But it seems not allowed to link to an entire package, I must link to an element INSIDE the package?

The declaration reference appears to contain a package name or import path, but it is missing the "#" delimiter
320    * Event sources are implemented in the {@link @aws-cdk/aws-lambda-event-sources} module.
octogonz commented 5 years ago

The syntax should be:

{@link @aws-cdk/aws-lambda-event-sources#}

I agree this feels a little counterintuitive. But imagine your package was called maximum. Then the notation {@link maximum} can be ambiguous: What if the current package also exported a function called maximum()? Which one is it referring to? The # makes it clear you're linking to an entire package.

rix0rrr commented 5 years ago

I see.Thanks for the explanation!

Gerrit0 commented 5 years ago

Question - does the full name for interface members include :static?

The first example in the malformed names section includes it, other names do not:

https://github.com/microsoft/tsdoc/blob/100e0f0bcdeb0c61c7499ca6ea2cbf6a5a177555/spec/code-snippets/DeclarationReferences.ts#L388

It seems like it might be a typo:

https://github.com/microsoft/tsdoc/blob/100e0f0bcdeb0c61c7499ca6ea2cbf6a5a177555/spec/code-snippets/DeclarationReferences.ts#L284-L286

octogonz commented 5 years ago

That is a typo. Interface members cannot be static. Thanks for pointing this out! Maybe you could open a PR to fix it? I am traveling this week.

argv-minus-one commented 5 years ago

According to the AEDoc documentation:

TSDoc declaration references are always resolved relative to a specific entry point (NOT relative to the current source file or declaration scope). Thus, their syntax is independent of where the reference occurs within a given package. Since Widget.initialize appears inside Widget, we may want to shorten the reference to {@link render | the render() method}, but TSDoc standard does not support this.

This behavior makes very little sense, is non-DRY, and is highly astonishing because it's completely unlike how TypeScript works. It will cause lots of developer confusion. The obvious expectation is that {@link Widget} links to whatever is named Widget in the lexical scope the comment appears in, not to some unrelated thing named Widget somewhere else in the package.

Please reconsider.

Gerrit0 commented 5 years ago

I tend to disagree, TypeDoc currently uses the proposed behavior as it doesn't yet use TSDoc, and this makes it impossible to reference a global thing if you have a thing in your class.... this also results in TypeDoc spending roughly a quarter of the time spent documenting your code looking for references (yea, sure, there's definitely optimizations that could be done, but it is still going to be slower)

Linking based on package entry also helps make sure that my documentation doesn't change because I've imported some new function in a file (which might not even be public!)

octogonz commented 5 years ago

The obvious expectation is that {@link Widget} links to whatever is named Widget in the lexical scope the comment appears in, not to some unrelated thing named Widget somewhere else in the package.

@argv-minus-one I agree that this a somewhat obvious expectation. However, I don't know how we could realistically implement it without a lot of work.

For example, suppose I wrote something like this:

import Widget from 'widget-lib';

/**
 * Generates a new id for a {@link Widget}.
 */
export function generateWidgetId(): string;

This code produces an output .d.ts file that's missing the import statement, because the compiler thinks that nobody's using the Widget symbol in this file. That's a problem for API Extractor, as it analyzes the .d.ts file, not the .ts file. (There's some advantages to analyzing the API surface from the perspective of an external consumer.) Other tools may parse the .ts file, but they'll encounter the problem when they need to read TSDoc from an installed NPM dependency, since NPM packages typically install .d.ts files, not .ts files. Also, TSLint is going to whine about the unused import.

To address this, we could improve TypeScript and TSLint to understand the @link tag. Then they could parse the declaration reference and see that the Widget symbol is referenced. That solves @link. But here's a new problem: TSDoc is extensible. It allows declaration references to be used in custom inline tags that are tool-specific. To handle this, we need a setting somewhere (e.g. in package.json or tsdoc-metadata.json) that can define these custom tags. (That feature has been proposed already and is a good idea.)

Suppose we did all that stuff. There's some further consequences: API Extractor doesn't generate docs directly. Instead, it writes an .api.json file for each NPM package, so that API Documenter can come along and produce an integrated website with proper hyperlinking between all the packages. The .api.json stage also enables people to develop their packages in different Git repos. These JSON files contain the TSDoc and declaration references but don't have import statements, so they'd need to be rewritten with absolute references. API Extractor also writes a .d.ts rollup file, which does have import statements, but different ones from the local source files, so we'd want to rewrite those as absolute references as well. In short, wherever TSDoc moves between files, the file-relative references need to get rewritten. Most likely they need to get converted back to absolute references. (This implies that absolute references are a required feature, whereas file-relative references are optional.)

Okay but if we implemented all that stuff, file-relative would be doable, and it's probably a more intuitive for developers. By the time we got this working, there'd be a lot of legacy code using the old notation, so we maybe should define a special operator (let's say {@link ^Widget.render}) to indicate a reference to a local symbol. Sounds good to me! 😊

argv-minus-one commented 5 years ago

What about defining a syntax for declaration references in doc comments that's independent of TSDoc tags?

You might decide that anything surrounded by [[]] in a doc comment is taken to be a declaration reference with lexical scope, no matter where in the doc comment it appears. A TSDoc link would be written like [[Widget]], and any other tag that takes a declaration reference would also use this syntax, e.g. @throws [[BadWidgetError]].

TypeScript then knows that [[Widget]] is a reference to whatever is named Widget in that lexical scope, and therefore emits an import for it in the .d.ts, even if it is not otherwise needed. TypeScript doesn't need to know about tags, just declaration references.

There would need to be some other syntax to get different link text. The obvious choice is {@link [[Widget]] | different link text}, though another, terser option is making it resemble a Markdown reference-style link: [different link text][[Widget]]


I don't understand how the rollup issue applies only to declaration references, and not to rolling up declaration files in general. Suppose you have to roll this up:

// module1.d.ts

import { Widget } from "one-widget-package";
export declare class MyWidget extends Widget { … }
// module2.d.ts

import { Widget } from "another-widget-package";
export declare class MyOtherWidget extends Widget { … }

You can't just naïvely concatenate them:

// rollup.d.ts

import { Widget } from "one-widget-package";
export declare class MyWidget extends Widget { … }

import { Widget /* Name collision! */ } from "another-widget-package";
export declare class MyOtherWidget extends Widget /* Which Widget? */ { … }

Doesn't there already have to be some sort of disambiguation done to solve this problem? If so, why isn't that solution also applicable to declaration references? For example, if you disambiguate by renaming imports:

// module1.d.ts

import { Widget } from "one-widget-package";
/** It's a kind of [[Widget]]. */
export declare class MyWidget extends Widget { … }
// module2.d.ts

import { Widget } from "another-widget-package";
/** It's a different kind of [[Widget]]. */
export declare class MyOtherWidget extends Widget { … }
// rollup.d.ts

import { Widget } from "one-widget-package";
/** It's a kind of [[Widget]]. */
export declare class MyWidget extends Widget { … }

import { Widget as Widget2 } from "another-widget-package";
/** It's a different kind of [[Widget2]]. */
export declare class MyOtherWidget extends Widget2 { … }