microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
100.89k stars 12.47k forks source link

jsdoc: escaped property names are not parsed properly #54898

Open clshortfuse opened 1 year ago

clshortfuse commented 1 year ago

Bug Report

🔎 Search Terms

jsdoc escape property names

🕗 Version & Regression Information

⏯ Playground Link

Playground link with relevant code

💻 Code

/**
 * @see https://datatracker.ietf.org/doc/html/rfc7517#section-4
 * @typedef {Object} JWK
 * @prop {string} kty key type
 * @prop {string} [x5t\u0023S256] X.509 Certificate SHA-256 Thumbprint
 */

/** @type {JWK} */
const o = {
    kty: 'foo',
    'x5t#S256': 'bar',
};

🙁 Actual behavior

Errors: ]' expected. Type '{ kty: string; 'x5t#256': string; }' is not assignable to type 'JWK'. Object literal may only specify known properties, and ''x5t#256'' does not exist in type 'JWK'.

🙂 Expected behavior

No errors and to parse escaped property names correctly.

Additional Info

If \u0020 doesn't work on VSCode, file it as an issue with them. The way JSDoc works is to allow such escaping.

And that approach should indeed work with any character as "0020" is the hexadecimal Unicode escape sequence for a space. \u0023 is the code for # (familiarize yourself with Unicode escapes in JavaScript to find for others).

Originally posted by @brettz9 in https://github.com/jsdoc/jsdoc/issues/1468#issuecomment-743490820

Edit: Fixed typo in property name. Incidentally shows the importance of types. 😅

clshortfuse commented 1 year ago

In interest of moving this forward, I believe we need to analyze for \ tokens in

https://github.com/microsoft/TypeScript/blob/b1a96a37b5cb5dd18ca72cc09014c1f110089b1b/src/compiler/parser.ts#L8632

called from

https://github.com/microsoft/TypeScript/blob/b1a96a37b5cb5dd18ca72cc09014c1f110089b1b/src/compiler/parser.ts#L8636

a-tarasyuk commented 1 year ago

I think, the issue is not related to parsing Unicode names. The name contains a #, which is not a valid identifier (x#x). In TypeScript, a property name can be a StringLiteral

type a = {
    'x5t#S256': 1 
} 

const b: a = {
    'x5t#S256': 1,
}

https://github.com/microsoft/TypeScript/blob/b1a96a37b5cb5dd18ca72cc09014c1f110089b1b/src/compiler/parser.ts#L2672-L2675

However, when we define a property name in JSDoc, it can only be an Identifier or QualifiedName. https://github.com/microsoft/TypeScript/blob/b1a96a37b5cb5dd18ca72cc09014c1f110089b1b/src/compiler/parser.ts#L9664-L9680

I'm not sure if it's okay to allow the use of StringLiteral as property names in JSDoc.

/cc @RyanCavanaugh @sandersn

clshortfuse commented 1 year ago

@a-tarasyuk You seem to be right it's about what it computes to after parsing, not so much the parsing itself:

/**
 * @typedef MyObject
 * @prop {string} [\u0066\u006f\u006f] escaped foo
 */

/** @type {MyObject} */
const a = {
    foo: 'bar'
};

This parses fine. But \u0020 (`) can't be used in JSDocs, though it's accepted inRecord<?,?>`:

/**
 * @typedef {Record<'With Space', string>} WithSpaceObject
 */

/** @type {WithSpaceObject} */
const b = {
    'With Space': 'okay'
};
sandersn commented 1 year ago

Regardless of parsing, jsdoc should support the same kinds of property names as typescript itself, which means, I think, it should include StringLiteral. I haven't double-checked that against the code though.

I haven't even thought about how to parse things that aren't legal identifiers, since most params aren't optional and so don't surround the param name with brackets. Using \0020 instead of a literal space seems like it's parseable though.

a-tarasyuk commented 1 year ago

@sandersn If JSDocPropertyTag.name starts to support StringLiteral, it might be more appropriate for JSDocPropertyTag and JSDocParameterTag to use their types instead of inheriting from JSDocPropertyLikeTag (StringLiteral should be forbidden in JSDocParameterTag). Or is it okay to allow the use of StringLiteral in JSDocPropertyLikeTag? I'm not sure about the deprecation process for this case...

sandersn commented 1 year ago

Are you proposing adding an override type for JSDocPropertyTag.name? That seems reasonable, although I'm having trouble visualising the complete solution.

Changing JSDocPropertyTag.name to a supertype of its current type doesn't seem like a huge problem -- although, again, I could be missing something.

a-tarasyuk commented 1 year ago

@sandersn I have been considering two ways to accomplish that

  1. Remove the name from JSDocPropertyLikeTag and use it for JSDocParameterTag/JSDocPropertyLikeTag. Then, add conditions in parser/binder/etc. to manage that (for instance, the parser uses one method to handle both tags).
export interface JSDocPropertyLikeTag extends JSDocTag, Declaration {
    readonly parent: JSDoc;
    readonly typeExpression?: JSDocTypeExpression;
    /** Whether the property name came before the type -- non-standard for JSDoc, but Typescript-like */
    readonly isNameFirst: boolean;
    readonly isBracketed: boolean;
}

export interface JSDocPropertyTag extends JSDocPropertyLikeTag {
    readonly kind: SyntaxKind.JSDocPropertyTag;
    readonly name: JSDocPropertyName;
}

export interface JSDocParameterTag extends JSDocPropertyLikeTag {
    readonly kind: SyntaxKind.JSDocParameterTag;
    readonly name: EntityName;
}
  1. Remove JSDocPropertyLikeTag in favor of using all properties for each tag.
    
    export interface JSDocPropertyTag extends JSDocTag, Declaration {
    readonly kind: SyntaxKind.JSDocPropertyTag;
    readonly name: JSDocPropertyName;
    readonly parent: JSDoc;
    readonly typeExpression?: JSDocTypeExpression;
    /** Whether the property name came before the type -- non-standard for JSDoc, but Typescript-like */
    readonly isNameFirst: boolean;
    readonly isBracketed: boolean;
    }

export interface JSDocParameterTag extends JSDocTag, Declaration { readonly kind: SyntaxKind.JSDocParameterTag; readonly name: EntityName; readonly parent: JSDoc; readonly typeExpression?: JSDocTypeExpression; /* Whether the property name came before the type -- non-standard for JSDoc, but Typescript-like / readonly isNameFirst: boolean; readonly isBracketed: boolean; }


Both approaches involve deprecating or removing the public API, and I'm not sure which one would be safer.
sandersn commented 1 year ago

How about a variant of the first option: what about leaving JSDocPropertyLikeTag.name, but with type JSDocPropertyName (assuming JSDocPropertyName is a supertype of EntityName).

I'm not sure I like StringLiteral for parsing identifiers with otherwise-illegal characters. Is that what you're proposing? Is there precedent for that? Maybe it would be better to make a new kind of identifier.

Also, it's kind of weird that# is used for class-property access like C#prop, so maybe it's worth asking whether that specifically should be allowed, and whether there are any other specific uses that people have wanted and found missing.

clshortfuse commented 1 year ago

I can think of Amazon's DynamoDB syntax over JSON that uses # for ExpressionAttributeNames:

An expression attribute name must begin with a pound sign (#), and be followed by one or more alphanumeric characters.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ExpressionAttributeNames.html

I've also seen #text as a key, mostly when working with XML or HTML nodes.

If you mean what other characters, JSON-LD uses @id (\u0040) and : which also isn't supported by Typescript's JSDoc parser (edit):

https://www.w3.org/TR/json-ld/

MasonM commented 1 month ago

Ran across this too, and I found a workaround using inline intersection types (playground link):

/**
 * @see https://datatracker.ietf.org/doc/html/rfc7517#section-4
 * @typedef {Object} JWKBase
 * @prop {string} kty key type
 * 
 * @typedef {JWKBase & { ['x5t#S256']: string }} JWK
 */

/** @type {JWK} */
const o = {
    kty: 'foo',
    'x5t#S256': 'bar',
};

Messy, but it works!