atom / atom

:atom: The hackable text editor
https://atom.io
MIT License

"Uncaught TypeError: Cannot read property '0' of undefined" when switching to package-provided grammar #7124

Closed · p-e-w closed this issue 9 years ago

p-e-w commented 9 years ago

I'm getting this error every time I switch to the grammar "JavaScript (Semantic Highlighting)" provided by https://github.com/p-e-w/language-javascript-semantic in Atom 0.206.0 on Linux with --include-deprecated-apis (also, the "Copy error report to clipboard" button does nothing, but that's likely an unrelated issue). In a previous version of Atom (not sure which one exactly), the grammar worked without problems.

This problem is currently preventing me from fixing the deprecations in the language-javascript-semantic package.

/usr/share/atom/resources/app.asar/src/tokenized-line.js:89

TypeError: Cannot read property '0' of undefined
    at TokenizedLine.module.exports.TokenizedLine.transformContent (/usr/share/atom/resources/app.asar/src/tokenized-line.js:89:38)
    at new TokenizedLine (/usr/share/atom/resources/app.asar/src/tokenized-line.js:71:12)
    at TokenizedBuffer.module.exports.TokenizedBuffer.buildTokenizedLineForRowWithText (/usr/share/atom/resources/app.asar/src/tokenized-buffer.js:521:14)
    at TokenizedBuffer.module.exports.TokenizedBuffer.buildTokenizedLineForRow (/usr/share/atom/resources/app.asar/src/tokenized-buffer.js:506:19)
    at TokenizedBuffer.module.exports.TokenizedBuffer.tokenizeNextChunk (/usr/share/atom/resources/app.asar/src/tokenized-buffer.js:286:43)
    at /usr/share/atom/resources/app.asar/src/tokenized-buffer.js:263:26
    at /usr/share/atom/resources/app.asar/node_modules/underscore-plus/node_modules/underscore/underscore.js:666:47
kevinsawicki commented 9 years ago

/cc @nathansobo

nathansobo commented 9 years ago

This package uses private APIs that have changed to reduce memory consumption, similar to the package discussed in #6982. The fix will have to occur in the package. Sorry about that.

p-e-w commented 9 years ago

I see... looks like https://github.com/atom/atom/pull/6757 slammed the door pretty hard on code-defined grammars. I notice that the other two affected packages have not been able to adapt to the new convention: one of them switched to a CSON-based grammar and the other remains broken :frowning:

A public API for writing scripted grammars would be a big plus. There are too many things that just cannot be done with TextMate-style grammars.

p-e-w commented 9 years ago

Actually, it would suffice if a stable createToken were available. I posted this pull request a year ago in preparation for code-defined grammars, but without a supported method for creating tokens it does not really make sense anymore.

nathansobo commented 9 years ago

Yeah, the TextMate grammars are really limited and I'd like a better (and supported) interface. The unfortunate reality is that we're dealing with trees here, not tokens, and repeating the scopes on every token is like describing a tree over and over again for every path from root to leaf. So I'd like any interface we come up with to acknowledge that reality.
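
A purely illustrative example of that redundancy (the line, scope names, and variable names below are made up, not taken from any real grammar):

```js
// Hypothetical example: the line "var x" tokenized two ways.

// Classic token stream: every token repeats its full scope path, i.e. the
// path from the root of the tree down to that leaf.
const classicTokens = [
  {value: 'var', scopes: ['source.js', 'storage.type.js']},
  {value: ' ',   scopes: ['source.js']},
  {value: 'x',   scopes: ['source.js', 'variable.other.js']}
];

// Tree-shaped open/close form: each scope is opened and closed exactly once,
// and plain non-negative numbers are run lengths of text. Conceptually:
//   open(source.js), open(storage.type.js), 3, close(storage.type.js),
//   1, open(variable.other.js), 1, close(variable.other.js), close(source.js)
```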

This change shouldn't be the death of these packages by any means. It should be pretty easy to use atom.grammars.startIdForScope and endIdForScope to register start/end scopes. If you intersperse string run lengths in between as non-negative integers, you'll match the format.
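
A minimal sketch of that conversion, assuming classic tokens shaped like `{value, scopes}` (as in the illustration above) and the `atom.grammars.startIdForScope`/`endIdForScope` methods mentioned here; the helper name `tokensToTags` is made up for the example:

```js
// Sketch only: convert the classic tokens of one line into the flat tag array,
// where ids from startIdForScope/endIdForScope open and close scopes and
// non-negative integers are text run lengths.
function tokensToTags(tokens) {
  const tags = [];
  const openScopes = [];

  for (const {value, scopes} of tokens) {
    // Keep the longest common prefix of the currently open scopes and this token's scopes.
    let common = 0;
    while (common < openScopes.length && common < scopes.length &&
           openScopes[common] === scopes[common]) {
      common++;
    }
    // Close scopes that no longer apply, innermost first.
    while (openScopes.length > common) {
      tags.push(atom.grammars.endIdForScope(openScopes.pop()));
    }
    // Open the scopes this token introduces.
    for (let i = common; i < scopes.length; i++) {
      openScopes.push(scopes[i]);
      tags.push(atom.grammars.startIdForScope(scopes[i]));
    }
    // Record the length of text covered by the currently open scopes.
    tags.push(value.length);
  }

  // Close anything still open at the end of the line.
  while (openScopes.length > 0) {
    tags.push(atom.grammars.endIdForScope(openScopes.pop()));
  }
  return tags;
}
```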

I would be interested in your thoughts on a supported interface. There are roughly two levels we could do this at. One would be just a drop-in replacement for the TextMate grammar: it is expected to parse a single line, and we thread a state object between lines. The other is much more general: we inform the parser of change events and allow it to manage all of its state itself. Either way, I think the interface would involve this kind of nested open/close scope interface.
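
Purely as a strawman (none of the names below are real Atom APIs), the first, line-oriented level might look roughly like this, while the second would instead receive change notifications and own all of its state:

```js
// Strawman for level 1: the grammar is handed one line at a time plus an opaque
// state value from the previous line, and returns tags in the
// open-id / run-length / close-id format along with the state for the next line.
class LineBasedGrammar {
  tokenizeLine(line, previousState) {
    const tags = [];
    const state = previousState || {};
    // ...analyze `line`, pushing scope start/end ids and run lengths onto `tags`...
    return {tags, state};
  }
}

// Strawman for level 2: the parser is told about edits and manages everything
// itself, handing rows back to the editor on demand.
class BufferBasedParser {
  constructor(buffer) { this.buffer = buffer; }
  onDidChange(change) { /* invalidate and re-parse the affected region */ }
  tagsForRow(row) { /* answer from internal state */ return []; }
}
```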

p-e-w commented 9 years ago

IMO it would be best if the tokenizeLine API returned to what it was, with the translation from the classical "value/scopes" representation to the cryptic but efficient integer array performed outside of tokenizeLine. That way, grammars can tag line tokens in an intuitive way while memory usage remains low: whatever memory is allocated inside individual tokenizeLine calls should be minimal and will be garbage collected soon enough, and the size of a source file comes from its number of lines, not its line length. Transforming from the verbose token format to a nested representation is an easy algorithmic task.
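
To make the proposal concrete, here is a sketch under the same assumptions as above (a classic-style tokenizeLine returning {tokens, ruleStack}, plus the hypothetical tokensToTags helper sketched earlier); the point is that the compaction would live in the core, not in grammar packages:

```js
// Sketch of the proposal: the editor core keeps calling a classic-style
// tokenizeLine and performs the token -> integer-array compaction itself,
// so grammars only ever produce plain {value, scopes} tokens.
function buildCompactLine(grammar, line, ruleStack) {
  const {tokens, ruleStack: nextRuleStack} = grammar.tokenizeLine(line, ruleStack);
  // tokensToTags is the conversion helper sketched earlier in this thread.
  return {tags: tokensToTags(tokens), ruleStack: nextRuleStack};
}
```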

p-e-w commented 9 years ago

I managed to update the format and my package is working again. Thank you for your help!

nathansobo commented 9 years ago

You're very welcome. I'm open to your idea of a more convenient interface, but now isn't a great time as I'm focused on things for 1.0. Expanding the API is always a careful process, and we haven't even contemplated custom grammar extensions at all. If you could extract a library from your package that others could use to make the conversion, that would be pretty nice. No pressure though :smile:

lock[bot] commented 5 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. If you can still reproduce this issue in Safe Mode then please open a new issue and fill out the entire issue template to ensure that we have enough information to address your issue. Thanks!