Closed arvillion closed 1 year ago
Looks fine to me. The raw data has c2 a0 byte sequences, which are the UTF-8 encoding of U+00A0 (NO-BREAK SPACE). If you see something else, some process in your pipeline may be altering your data.
> xxd spec.json
...
00012190: 2020 7b0a 2020 2020 226d 6172 6b64 6f77    {.    "markdow
000121a0: 6e22 3a20 2260 c2a0 62c2 a060 5c6e 222c  n": "`..b..`\n",
000121b0: 0a20 2020 2022 6874 6d6c 223a 2022 3c70  .    "html": "<p
000121c0: 3e3c 636f 6465 3ec2 a062 c2a0 3c2f 636f  ><code>..b..</co
000121d0: 6465 3e3c 2f70 3e5c 6e22 2c0a 2020 2020  de></p>\n",.    
000121e0: 2265 7861 6d70 6c65 223a 2033 3333 2c0a  "example": 333,.
000121f0: 2020 2020 2273 7461 7274 5f6c 696e 6522      "start_line"
00012200: 3a20 3539 3337 2c0a 2020 2020 2265 6e64  : 5937,.    "end
00012210: 5f6c 696e 6522 3a20 3539 3431 2c0a 2020  _line": 5941,.  
00012220: 2020 2273 6563 7469 6f6e 223a 2022 436f    "section": "Co
00012230: 6465 2073 7061 6e73 220a 2020 7d2c 0a20  de spans".  },. 
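For anyone without xxd at hand, the same check can be sketched in Python. This is only an illustration: the `row` bytes are copied from the line at offset 000121a0 in the dump above, and the helper name `nbsp_offsets` is made up for this example.

```python
# Bytes copied from the dump line at offset 0x121a0.
row = bytes.fromhex("6e223a202260c2a062c2a0605c6e222c")

# U+00A0 (NO-BREAK SPACE) encodes to the two bytes c2 a0 in UTF-8.
NBSP_UTF8 = "\u00a0".encode("utf-8")

def nbsp_offsets(data):
    """Return every byte offset where the UTF-8 form of U+00A0 starts."""
    offsets = []
    start = data.find(NBSP_UTF8)
    while start != -1:
        offsets.append(start)
        start = data.find(NBSP_UTF8, start + 1)
    return offsets

print(NBSP_UTF8.hex())    # c2a0
print(nbsp_offsets(row))  # [6, 9]
```

The two offsets are the no-break spaces on either side of the `b`. A plain space would show up as the single byte 20 instead.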
Thanks. The raw data looks fine. I think the problem is that my browser renders the JSON so that a no-break space is displayed as a normal space.
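Since a no-break space and a normal space look the same on screen, one way to avoid being misled by a viewer is to print the characters in escaped form. A minimal sketch, assuming the string below matches the `markdown` field of example 333 (the helper name `describe` is made up for this example):

```python
import unicodedata

def describe(text):
    """Label each character with its code point and Unicode name."""
    # Control characters such as "\n" have no name, so supply a default.
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

markdown = "`\u00a0b\u00a0`\n"

# ascii() escapes every non-ASCII character, so U+00A0 shows up as \xa0.
print(ascii(markdown))  # '`\xa0b\xa0`\n'

for label in describe(markdown):
    print(label)
```

With this output, U+00A0 NO-BREAK SPACE and U+0020 SPACE are listed under different names, however the terminal or browser draws them.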
I believe there's an error in example 333 in https://spec.commonmark.org/0.30/spec.json
The character "b" should be surrounded by Unicode whitespace (U+00A0, code point 160), as shown in https://spec.commonmark.org/0.30/#example-333 However, in the JSON file, normal spaces (U+0020, code point 32) are used.
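A claim like this can be checked against the parsed JSON rather than its rendered view. The sketch below assumes spec.json is a top-level array of objects with the field names visible in the hex dump. It uses an inline one-entry sample built from those fields, not the real file, and the helper name `code_points` is made up for this example.

```python
import json

# One entry shaped like the record in the hex dump; a real run would
# read the contents of spec.json instead of this inline sample.
sample = json.dumps([{
    "markdown": "`\u00a0b\u00a0`\n",
    "html": "<p><code>\u00a0b\u00a0</code></p>\n",
    "example": 333,
    "start_line": 5937,
    "end_line": 5941,
    "section": "Code spans",
}], ensure_ascii=False)

def code_points(spec_text, example_number, field="markdown"):
    """Return the code points of one field of the requested example."""
    for entry in json.loads(spec_text):
        if entry["example"] == example_number:
            return [ord(ch) for ch in entry[field]]
    raise KeyError(example_number)

print(code_points(sample, 333))  # [96, 160, 98, 160, 96, 10]
```

A 160 on either side of 98 (`b`) means no-break spaces; a 32 there would confirm the report.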