Open vwkd opened 2 years ago
Ok, I might have found a possible cause. After testing multiple pages with such symbols, I sometimes got the more specific error Expected valid tag name
. Looking at the error message, the character entities were converted. It seems like they were interpreted as HTML, which in the case of <
and >
would lead to the invalid tag. This seems like it is a bug.
Though there might be more causes to the "offset is longer than source length" error. I also got it when writing some LaTeX on a page, e.g.\{0,1,\ldots,9\}
. Sometimes I got the more specific error Expecting Unicode escape sequence \uXXXX
. Of course, LaTeX is invalid markup and an error isn't a bug here, but again the cryptic "offset is longer than source length" doesn't help much.
Both examples seems to highlight the same issue that invalid markup isn't reported properly as error, then it progresses down the chain until it eventually ends up being served by Vite which throws the cryptic "offset is longer than source length" error.
Could it be that remark fails to throw an error during parsing, or that MDsveX swallows it?
one problem is that vite throws these usless stack traces
Error: offset is longer than source length!
at numberToPos (node_modules/vite/dist/node/chunks/dep-972722fa.js:4247:15)
at formatError (node_modules/vite/dist/node/chunks/dep-972722fa.js:51169:24)
at TransformContext.error (node_modules/vite/dist/node/chunks/dep-972722fa.js:51149:19)
at Object.transform (node_modules/vite/dist/node/chunks/dep-972722fa.js:51357:25)
at async transformRequest (node_modules/vite/dist/node/chunks/dep-972722fa.js:67098:29)
at async instantiateModule (node_modules/vite/dist/node/chunks/dep-972722fa.js:73732:10)
with https://github.com/vitejs/vite/pull/4782 i get
formatError: error in numberToPos (Error: offset is longer than source length! offset 126 > length 48) while handling the error:
Expected valid tag name
4: </script>
5:
6: <p>Lorem ipsum < dolor sit</p>
^
ParseError: Expected valid tag name
at error (node_modules/svelte/compiler.mjs:17698:19)
at Parser$1.error (node_modules/svelte/compiler.mjs:17774:9)
[...]
digging deeper
output
mdsvex cjs 8727: found entity in lt. named <. type named
mdsvex cjs 8827: push reference <
where does this code come from? lets ask github - looks like parse-entities
some traces later, this is called by remark-parse.markdown
so the problem is ...
const toMDAST = unified()
.use(markdown) // HERE
simple but wrong:
escaping <
with &lt;
and >
with &gt;
would give false positives
for example
<
<div title="<"/>
would become
<
<div title="&lt;"/>
so either set options via .use(markdown, options)
or patch the markdown parser
quickfix
// cjs js line 8716
if (terminated) {
end++;
namedEntity = type === name$1 ? decodeEntity_1(characters) : false;
// QUICKFIX
if (namedEntity == '<') namedEntity = '<';
if (namedEntity == '>') namedEntity = '>';
if (namedEntity) {
entityCharacters = characters;
entity = namedEntity;
}
much faster than fixing all the dependencies : ) how to apply with patch-package
git clone --depth 1 https://github.com/milahu/sveltekit-bugs.git --branch quickfix-with-patch-package
cd sveltekit-bugs
pnpm install
npm run dev
Late coming to this but I don't think <
or >
are really supported in the underlying parsers, entities seem to be decoded when parsed.
I think there are two issues about this now, can you check to see if this issue is reporting the same problem?
can you check to see if this issue is reporting the same problem?
yes, its a duplicate
source here
Lorem ipsum < dolor sit
source there
A paragraph started with <h2>
with my quickfix, this is correctly rendered as
A paragraph started with <h2>
Thank you!
In v1, I have a plan to make the use of <
, >
character entities unnecessary, supporting <
and >
directly in text. But the entity situation will be resolve when mdsvex moves to a different parser.
This issue has more information, I'll close the other.
when mdsvex moves to a different parser.
in the mean time, you could use patch-package to patch the dependency remark-parse.markdown
so that your consumers dont need to patch mdsvex
@milahu Thanks you for digging into this and also for fixing the error message in Vite!
I'm thinking it may be a good idea to submit a PR upstream to remark-parse.markdown
. I have the feeling it will take some time for MDsveX to move to a new parser, so it's waiting either way. Fixing upstream would be the right thing to do even if we have to wait more. This way MDsveX doesn't need to maintain a custom patch and also other projects would benefit from it.
EDIT: Are you sure it's remark-parse
? It seems it just a wrapper around mdast-util-from-markdown
. Looking into mdast-util-from-markdown
, it has a single reference to parse-entities
at https://github.com/syntax-tree/mdast-util-from-markdown/blob/c79a430b8f6b0e3aa4c0981459510283ffe037a0/dev/lib/index.js#L960. Might that be it?
Might that be it?
yepp
$ cd MDsveX/packages/mdsvex
$ sed -i 's/sourcemap: false/sourcemap: true/' rollup.config.js
$ npm run build
$ npm i -g source-map-cli
$ source-map resolve dist/main.cjs.js.map 8722 0
Maps to node_modules/parse-entities/index.js:254:0
if (namedEntity) {
^
thats the problem with fixing upstream: the code is hidden in some dependency, and we would have to pass our option across all the interfaces
do you have so much time to waste? ; )
I'm not sure an upstream PR would be accepted anyway, I'm pretty sure this is by design from previous conversations I've had.
I'll Patch this dependency for the short term when I find the time (if anyone wants to submit a PR, I'll happily accept it).
Description
When using HTML character entities like
<
in a MDsveX page with SvelteKit, Vite throws anExpected valid tag name
error(Note, before Vite 2.5.4 and SvelteKit 1.0.0-next.165 the error was a cryptic
offset is longer than source length!
. Thanks to https://github.com/vitejs/vite/pull/4782 this is now fixed.)Reproduction
https://github.com/vwkd/sveltekit-bugs/tree/mdsvex-character-references
npm run dev
Expected valid tag name
System Info
Additional context
Maybe related to https://github.com/sveltejs/svelte/issues/6440 ?