microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
100.68k stars 12.44k forks source link

Would be great if multiline strings were categorized differently #2005

Closed basarat closed 9 years ago

basarat commented 9 years ago

As it stands, the default classification looks horrible

image

I would be happy if there was just InMultiLineString (notice I don't say Literal) + it gets classified as MultilineString

Current:

    const enum EndOfLineState {
        Start = 0,
        InMultiLineCommentTrivia = 1,
        InSingleQuoteStringLiteral = 2,
        InDoubleQuoteStringLiteral = 3,
    }
    enum TokenClass {
        Punctuation = 0,
        Keyword = 1,
        Operator = 2,
        Comment = 3,
        Whitespace = 4,
        Identifier = 5,
        NumberLiteral = 6,
        StringLiteral = 7,
        RegExpLiteral = 8,
    }

Note: feel free to call it something other than MultiLineString but give us something.

DanielRosenwasser commented 9 years ago

1744 has a fix for this.

basarat commented 9 years ago

@DanielRosenwasser perhaps you can quickly tell me what classifyKeywordsInGenerics is supposed to do. Just a hint on if I should keep it true or false

basarat commented 9 years ago

@DanielRosenwasser also is there some public API endpoint to get the syntacticClassifier?

Nevermind, got the answer: https://github.com/Microsoft/TypeScript/issues/1477#issuecomment-66904683

DanielRosenwasser commented 9 years ago

@basarat if you'll see my change, it's been renamed to syntacticClassifierAbsent.

Before the syntactic classifier was a thing, the lexical classifier had to do a bunch of hacky sort of tricks to try to emulate correct behavior.

Basically, you don't want to accidentally classify number as a keyword as a type parameter when it might just be a plain identifier after a less-than token. The syntactic classifier will overwrite this to be the correct thing (i.e. an identifier) and so you'll get flashing. However, if there is no syntactic classifier, the lexical classifier should use its awful tricks again.

Actually, you're using web workers for Atom right? Are you using the other classifiers?

basarat commented 9 years ago

you're using web workers for Atom right?

Nope. A child process : https://github.com/TypeStrong/atom-typescript/blob/master/CONTRIBUTING.md#architecture . Node-webworker's need C/C++ toolchain on the client to install so I just went with a child process to make it easier to install.

Are you using the other classifiers?

nope. Just a lexical classifier : https://github.com/TypeStrong/atom-typescript/blob/master/lib/main/atom/typescriptGrammar.ts#L47, so I'll make it true. Thanks :)

basarat commented 9 years ago

Reason for not using webworkers : https://github.com/atom/atom-shell/issues/797

DanielRosenwasser commented 9 years ago

I just realized, you're asking for a multiline string classification. What if it's just a string literal classification but it's done correctly?

basarat commented 9 years ago

In a single line it gets identitified as an identifier :

image

In mutliple line it gets identified as code as there is no finalLexState for stuff with "`"

image

So for both of these I would like "just a string literal classification but it's done correctly".

basarat commented 9 years ago

@DanielRosenwasser fixed? Or wont-fix :heart:

DanielRosenwasser commented 9 years ago

If we do this I don't have to buy you chocolates for Valentine's Day. :heart:

(Will fix, but unfortunately just for 1.5: #2026. I guess I'll reopen just to track this as an issue.)

DanielRosenwasser commented 9 years ago

The fix is in master.