mdevils / html-entities

Fastest HTML entities encode/decode library
MIT License

Add map file? #88

Open baseplate-admin opened 1 year ago

baseplate-admin commented 1 year ago

Hi,

Thanks for creating this awesome library. I was wondering whether it would be possible to add source map files to this package.

mdevils commented 11 months ago

Hello @baseplate-admin, could you please describe your use case for this?

baseplate-admin commented 9 months ago

> Hello @baseplate-admin, could you please describe your use case for this?

Hi, thanks for replying. The use case for source maps is that, when they are available, the original code shows up in the browser's dev tools (the inspect element tab).

Without a source map, the dev tools show minified code (even if I have source maps enabled in React + Vite).

Here's a blog post that dives deeper into this.

mdevils commented 6 months ago

@baseplate-admin but why do you need the original code of html-entities in your browser dev-tools? html-entities should be a black-box for your project unless you are the html-entities developer. You shouldn't need to waste browser resources to load and apply the source map. Is there something I'm missing?

baseplate-admin commented 6 months ago

Hi, sorry for not being clear. Let me try to clarify the purpose of source maps.

> why do you need the original code of html-entities in your browser dev-tools?

Let's consider a scenario like this: I get an error while using the html-entities library. It might be a bug in html-entities (or an edge case). If I look at the console, I only get a vague indication that the error originated in html-entities, and without a source map it is very hard to debug.

> You shouldn't need to waste browser resources to load and apply the source map

Please note that a source map is only loaded when I open the dev tools; otherwise it is ignored.

Also note that developers can choose to disable source maps in their own project, in which case html-entities's source map is disabled too. I am talking about the opposite use case: a developer wants source maps in their project, and html-entities still shows up as a jumbled mess.

I might not be able to explain everything properly (sorry for that).

Here are some resources that might explain the need for source maps:

mdevils commented 6 months ago

Hello @baseplate-admin, please check version 2.5.0. I've included source maps in that version.

baseplate-admin commented 6 months ago

Hi, please give me some time. I will look into it :)

tezhm commented 6 months ago

@mdevils hey, webpack keeps complaining that it can't find the .ts files referenced by the source maps in 2.5.0. I tried the latest @types/html-entities with no luck:

WARNING in ./node_modules/html-entities/lib/index.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/index.ts' file: Error: ENOENT: no such file or directory, open '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/index.ts'

WARNING in ./node_modules/html-entities/lib/named-references.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/named-references.ts' file: Error: ENOENT: no such file or directory, open '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/named-references.ts'

WARNING in ./node_modules/html-entities/lib/numeric-unicode-map.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/numeric-unicode-map.ts' file: Error: ENOENT: no such file or directory, open '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/numeric-unicode-map.ts'

WARNING in ./node_modules/html-entities/lib/surrogate-pairs.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/surrogate-pairs.ts' file: Error: ENOENT: no such file or directory, open '/home/terence/project/rgb-web-ui/node_modules/html-entities/src/surrogate-pairs.ts'

szmejk commented 6 months ago

@tezhm I've got those warnings via webpack as well. As a temporary workaround, I've managed to suppress them with overrides in package.json:

  "overrides": {
    "webpack-dev-server": {
      "html-entities": "2.4.0"
    }
  },
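
For projects that cannot pin the dependency, another way to quiet these warnings may be webpack's `ignoreWarnings` option, filtered on the message text shown above. The fragment below is only a sketch under assumptions: webpack 5 with source-map-loader already installed, and a config file named `webpack.config.ts`. The loader rule is illustrative and should be replaced with whatever rule the project already uses.

```typescript
// webpack.config.ts (fragment, hypothetical file name).
// A plain object keeps the sketch free of imports; in a real project it
// would typically be typed with webpack's `Configuration` type.
const config = {
    module: {
        rules: [
            {
                // Assumed rule: run source-map-loader over .js files before other loaders.
                test: /\.js$/,
                enforce: 'pre' as const,
                use: ['source-map-loader']
            }
        ]
    },
    // Hide only the warnings whose message matches this pattern,
    // i.e. the "Failed to parse source map" lines reported above.
    ignoreWarnings: [/Failed to parse source map/]
};

export default config;
```

This hides the symptom only: the maps still point at `.ts` files that are not in the installed package, so the original sources will not show up in dev tools.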

mdevils commented 6 months ago

Hello @tezhm, @szmejk, thank you for reporting the problem. I've published a fix, please check version 2.5.2.

baseplate-admin commented 6 months ago

Hi @mdevils,

I took a look around. It seems that the TypeScript -> JavaScript transpilation adds some noise to the generated map file. Could you please take a look at that?

mdevils commented 6 months ago

@baseplate-admin could you please be more specific?

baseplate-admin commented 6 months ago

Sure, let's start like this.

This is how the code looks in the browser:

index.js

```js
"use strict";var __assign=this&&this.__assign||function(){__assign=Object.assign||function(t){for(var s,i=1,n=arguments.length;i'"&]/g,nonAscii:/[<>'"&\u0080-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,nonAsciiPrintable:/[<>'"&\x01-\x08\x11-\x15\x17-\x1F\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,nonAsciiPrintableOnly:/[\x01-\x08\x11-\x15\x17-\x1F\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,extensive:/[\x01-\x0c\x0e-\x1f\x21-\x2c\x2e-\x2f\x3a-\x40\x5b-\x60\x7b-\x7d\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g};var defaultEncodeOptions={mode:"specialChars",level:"all",numeric:"decimal"};function encode(text,_a){var _b=_a===void 0?defaultEncodeOptions:_a,_c=_b.mode,mode=_c===void 0?"specialChars":_c,_d=_b.numeric,numeric=_d===void 0?"decimal":_d,_e=_b.level,level=_e===void 0?"all":_e;if(!text){return""}var encodeRegExp=encodeRegExps[mode];var references=allNamedReferences[level].characters;var isHex=numeric==="hexadecimal";return replaceUsingRegExp(text,encodeRegExp,(function(input){var result=references[input];if(!result){var code=input.length>1?surrogate_pairs_1.getCodePoint(input,0):input.charCodeAt(0);result=(isHex?"&#x"+code.toString(16):"&#"+code)+";"}return result}))}exports.encode=encode;var defaultDecodeOptions={scope:"body",level:"all"};var strict=/&(?:#\d+|#[xX][\da-fA-F]+|[0-9a-zA-Z]+);/g;var attribute=/&(?:#\d+|#[xX][\da-fA-F]+|[0-9a-zA-Z]+)[;=]?/g;var baseDecodeRegExps={xml:{strict:strict,attribute:attribute,body:named_references_1.bodyRegExps.xml},html4:{strict:strict,attribute:attribute,body:named_references_1.bodyRegExps.html4},html5:{strict:strict,attribute:attribute,body:named_references_1.bodyRegExps.html5}};var decodeRegExps=__assign(__assign({},baseDecodeRegExps),{all:baseDecodeRegExps.html5});var fromCharCode=String.fromCharCode;var outOfBoundsChar=fromCharCode(65533);var defaultDecodeEntityOptions={level:"all"};function getDecodedEntity(entity,references,isAttribute,isStrict){var decodeResult=entity;var decodeEntityLastChar=entity[entity.length-1];if(isAttribute&&decodeEntityLastChar==="="){decodeResult=entity}else if(isStrict&&decodeEntityLastChar!==";"){decodeResult=entity}else{var decodeResultByReference=references[entity];if(decodeResultByReference){decodeResult=decodeResultByReference}else if(entity[0]==="&"&&entity[1]==="#"){var decodeSecondChar=entity[2];var decodeCode=decodeSecondChar=="x"||decodeSecondChar=="X"?parseInt(entity.substr(3),16):parseInt(entity.substr(2));decodeResult=decodeCode>=1114111?outOfBoundsChar:decodeCode>65535?surrogate_pairs_1.fromCodePoint(decodeCode):fromCharCode(numeric_unicode_map_1.numericUnicodeMap[decodeCode]||decodeCode)}}return decodeResult}function decodeEntity(entity,_a){var _b=(_a===void 0?defaultDecodeEntityOptions:_a).level,level=_b===void 0?"all":_b;if(!entity){return""}return getDecodedEntity(entity,allNamedReferences[level].entities,false,false)}exports.decodeEntity=decodeEntity;function decode(text,_a){var _b=_a===void 0?defaultDecodeOptions:_a,_c=_b.level,level=_c===void 0?"all":_c,_d=_b.scope,scope=_d===void 0?level==="xml"?"strict":"body":_d;if(!text){return""}var decodeRegExp=decodeRegExps[level][scope];var references=allNamedReferences[level].entities;var isAttribute=scope==="attribute";var isStrict=scope==="strict";return replaceUsingRegExp(text,decodeRegExp,(function(entity){return getDecodedEntity(entity,references,isAttribute,isStrict)}))}exports.decode=decode;
```
index.ts

```ts
import {bodyRegExps, namedReferences} from './named-references';
import {numericUnicodeMap} from './numeric-unicode-map';
import {fromCodePoint, getCodePoint} from './surrogate-pairs';

const allNamedReferences = {
    ...namedReferences,
    all: namedReferences.html5
};

function replaceUsingRegExp(macroText: string, macroRegExp: RegExp, macroReplacer: (input: string) => string): string {
    macroRegExp.lastIndex = 0;
    let replaceMatch = macroRegExp.exec(macroText);
    let replaceResult;
    if (replaceMatch) {
        replaceResult = '';
        let replaceLastIndex = 0;
        do {
            if (replaceLastIndex !== replaceMatch.index) {
                replaceResult += macroText.substring(replaceLastIndex, replaceMatch.index);
            }
            const replaceInput = replaceMatch[0];
            replaceResult += macroReplacer(replaceInput);
            replaceLastIndex = replaceMatch.index + replaceInput.length;
        } while ((replaceMatch = macroRegExp.exec(macroText)));

        if (replaceLastIndex !== macroText.length) {
            replaceResult += macroText.substring(replaceLastIndex);
        }
    } else {
        replaceResult = macroText;
    }
    return replaceResult;
}

export type Level = 'xml' | 'html4' | 'html5' | 'all';

interface CommonOptions {
    level?: Level;
}

export type EncodeMode = 'specialChars' | 'nonAscii' | 'nonAsciiPrintable' | 'nonAsciiPrintableOnly' | 'extensive';

export interface EncodeOptions extends CommonOptions {
    mode?: EncodeMode;
    numeric?: 'decimal' | 'hexadecimal';
}

export type DecodeScope = 'strict' | 'body' | 'attribute';

export interface DecodeOptions extends CommonOptions {
    scope?: DecodeScope;
}

const encodeRegExps: Record<EncodeMode, RegExp> = {
    specialChars: /[<>'"&]/g,
    nonAscii: /[<>'"&\u0080-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,
    nonAsciiPrintable: /[<>'"&\x01-\x08\x11-\x15\x17-\x1F\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,
    nonAsciiPrintableOnly: /[\x01-\x08\x11-\x15\x17-\x1F\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g,
    extensive: /[\x01-\x0c\x0e-\x1f\x21-\x2c\x2e-\x2f\x3a-\x40\x5b-\x60\x7b-\x7d\x7f-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g
};

const defaultEncodeOptions: EncodeOptions = {
    mode: 'specialChars',
    level: 'all',
    numeric: 'decimal'
};

/** Encodes all the necessary (specified by `level`) characters in the text */
export function encode(
    text: string | undefined | null,
    {mode = 'specialChars', numeric = 'decimal', level = 'all'}: EncodeOptions = defaultEncodeOptions
) {
    if (!text) {
        return '';
    }

    const encodeRegExp = encodeRegExps[mode];
    const references = allNamedReferences[level].characters;
    const isHex = numeric === 'hexadecimal';

    return replaceUsingRegExp(text, encodeRegExp, (input) => {
        let result = references[input];
        if (!result) {
            const code = input.length > 1 ? getCodePoint(input, 0)! : input.charCodeAt(0);
            result = (isHex ? '&#x' + code.toString(16) : '&#' + code) + ';';
        }
        return result;
    });
}

const defaultDecodeOptions: DecodeOptions = {
    scope: 'body',
    level: 'all'
};

const strict = /&(?:#\d+|#[xX][\da-fA-F]+|[0-9a-zA-Z]+);/g;
const attribute = /&(?:#\d+|#[xX][\da-fA-F]+|[0-9a-zA-Z]+)[;=]?/g;

const baseDecodeRegExps: Record<Exclude<Level, 'all'>, Record<DecodeScope, RegExp>> = {
    xml: {
        strict,
        attribute,
        body: bodyRegExps.xml
    },
    html4: {
        strict,
        attribute,
        body: bodyRegExps.html4
    },
    html5: {
        strict,
        attribute,
        body: bodyRegExps.html5
    }
};

const decodeRegExps: Record<Level, Record<DecodeScope, RegExp>> = {
    ...baseDecodeRegExps,
    all: baseDecodeRegExps.html5
};

const fromCharCode = String.fromCharCode;
const outOfBoundsChar = fromCharCode(65533);

const defaultDecodeEntityOptions: CommonOptions = {
    level: 'all'
};

function getDecodedEntity(
    entity: string,
    references: Record<string, string>,
    isAttribute: boolean,
    isStrict: boolean
): string {
    let decodeResult = entity;
    const decodeEntityLastChar = entity[entity.length - 1];
    if (isAttribute && decodeEntityLastChar === '=') {
        decodeResult = entity;
    } else if (isStrict && decodeEntityLastChar !== ';') {
        decodeResult = entity;
    } else {
        const decodeResultByReference = references[entity];
        if (decodeResultByReference) {
            decodeResult = decodeResultByReference;
        } else if (entity[0] === '&' && entity[1] === '#') {
            const decodeSecondChar = entity[2];
            const decodeCode =
                decodeSecondChar == 'x' || decodeSecondChar == 'X'
                    ? parseInt(entity.substr(3), 16)
                    : parseInt(entity.substr(2));

            decodeResult =
                decodeCode >= 0x10ffff
                    ? outOfBoundsChar
                    : decodeCode > 65535
                    ? fromCodePoint(decodeCode)
                    : fromCharCode(numericUnicodeMap[decodeCode] || decodeCode);
        }
    }
    return decodeResult;
}

/** Decodes a single entity */
export function decodeEntity(
    entity: string | undefined | null,
    {level = 'all'}: CommonOptions = defaultDecodeEntityOptions
): string {
    if (!entity) {
        return '';
    }
    return getDecodedEntity(entity, allNamedReferences[level].entities, false, false);
}

/** Decodes all entities in the text */
export function decode(
    text: string | undefined | null,
    {level = 'all', scope = level === 'xml' ? 'strict' : 'body'}: DecodeOptions = defaultDecodeOptions
) {
    if (!text) {
        return '';
    }

    const decodeRegExp = decodeRegExps[level][scope];
    const references = allNamedReferences[level].entities;
    const isAttribute = scope === 'attribute';
    const isStrict = scope === 'strict';

    return replaceUsingRegExp(text, decodeRegExp, (entity) =>
        getDecodedEntity(entity, references, isAttribute, isStrict)
    );
}
```

This is how it should look in the browser.

So the problem is that when you run tsc to transpile the files from TS to JS, the resulting JS file doesn't contain a proper mapping back to the TS file.

The easiest solution, in my view, is to use something like esbuild (or, better, Vite) to transpile the files from TS to JS.

mdevils commented 3 months ago

@baseplate-admin According to the package contents, index.js contains the link to the source map: https://www.npmjs.com/package/html-entities?activeTab=code

And the source map has a link to the source file:

And the source file is included in the package:

Could it be that your build tool just ignores the source map and removes the source map link comment?
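
One way to check that chain (JS file, link comment, map, original sources) without a bundler in between is a small Node script. The following is a rough sketch, not part of html-entities or any tool: `checkSourceMap` and the report fields are hypothetical names, and it assumes the map is an external file that references its sources by relative file paths (inline `data:` maps and URL-based `sourceRoot` values are not handled).

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Only the Source Map v3 fields this check relies on.
interface RawSourceMap {
    sources?: string[];
    sourceRoot?: string;
    sourcesContent?: (string | null)[];
}

interface SourceMapReport {
    mapFile: string | null;    // resolved path of the linked map, or null if there is no usable link
    mapExists: boolean;        // whether the linked map file is present on disk
    missingSources: string[];  // sources neither present on disk nor embedded in the map
}

// Follows the chain: JS file -> sourceMappingURL comment -> map file -> original sources.
export function checkSourceMap(jsFile: string): SourceMapReport {
    const js = fs.readFileSync(jsFile, 'utf8');
    // The link comment is expected at the very end of the generated file.
    const match = /\/\/[#@]\s*sourceMappingURL=(\S+)\s*$/.exec(js);
    if (!match || match[1].indexOf('data:') === 0) {
        // No link comment, or an inline map, which this sketch does not decode.
        return {mapFile: null, mapExists: false, missingSources: []};
    }
    const mapFile = path.resolve(path.dirname(jsFile), match[1]);
    if (!fs.existsSync(mapFile)) {
        return {mapFile, mapExists: false, missingSources: []};
    }
    const map: RawSourceMap = JSON.parse(fs.readFileSync(mapFile, 'utf8'));
    const baseDir = path.resolve(path.dirname(mapFile), map.sourceRoot ?? '');
    const missingSources = (map.sources ?? []).filter((source, i) => {
        const embedded = typeof map.sourcesContent?.[i] === 'string';
        const onDisk = fs.existsSync(path.resolve(baseDir, source));
        return !embedded && !onDisk;
    });
    return {mapFile, mapExists: true, missingSources};
}
```

Running `checkSourceMap('node_modules/html-entities/lib/index.js')` against the installed package and against the bundler's output should show where the chain breaks: a `null` `mapFile` in the bundled file would point to the link comment being stripped, while a non-empty `missingSources` would point to the package itself.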

baseplate-admin commented 3 months ago

> Can it be that your build-tool just ignores the source-map and removes the source map link comment?

That could be the case. But I am using Vite as my build tool, and I have never heard of this issue, nor run into it with other packages that I use.

I will keep debugging and see what's causing this weird behavior.