dotnet / docfx

Static site generator for .NET API documentation.
https://dotnet.github.io/docfx/
MIT License

[Feature Request] Support for negative lookahead in regex / Use a more updated version of ECMAScript #10133

Open cjsha opened 4 months ago

cjsha commented 4 months ago

Is your feature request related to a problem? Please describe.

Docfx currently supports ECMAScript 5.1, according to the preprocessor section of the docfx documentation.

However, negative lookahead in regular expressions is only available in later versions of ECMAScript according to this StackOverflow thread.

Describe the solution you'd like

Ideally, docfx would support a later version of ECMAScript so that newer features, such as negative lookahead in regular expressions, can be used.

Describe alternatives you've considered

For now, I'm going to write more baroque code instead of a simple regex.

Additional context

This is what I'm trying to do: https://regex101.com/r/rAzo2K/1 and I believe it works there because it's using a later ECMAScript engine than the one docfx uses.

However, what happens in practice is that the replacement occurs on the 1st <p> element instead of the 2nd <p> element.

const re = new RegExp("(<p)(?!.*\1)");
originalString = "<p>\nIf defined, it will override automated voltage discovery and apply the specified voltage to the headstage.\nIf left blank, an automated headstage detection algorithm will attempt to communicate with the headstage and\napply an appropriate voltage for stable operation. Because ONIX allows any coaxial tether to be used, some of\nwhich are thin enough to result in a significant voltage drop, its may be required to manually specify the\nport voltage.\n</p>\n<p>\nWarning: this device requires 5.5V to 6.0V, measured at the headstage, for proper operation. Supplying higher\nvoltages may result in damage.\n</p>\n";

const newString = originalString.replace(re, '<p style="margin-bottom:0;"');
console.log(newString);
//newString = "<p style=\"margin-bottom:0;\">\nIf defined, it will override automated voltage discovery and apply the specified voltage to the headstage.\nIf left blank, an automated headstage detection algorithm will attempt to communicate with the headstage and\napply an appropriate voltage for stable operation. Because ONIX allows any coaxial tether to be used, some of\nwhich are thin enough to result in a significant voltage drop, its may be required to manually specify the\nport voltage.\n</p>\n<p>\nWarning: this device requires 5.5V to 6.0V, measured at the headstage, for proper operation. Supplying higher\nvoltages may result in damage.\n</p>\n"

Thank you so much for your help on this matter.

filzrev commented 4 months ago

docfx uses Jint to process JavaScript, so the latest version of docfx supports almost all ECMAScript features.

However, what happens in practice is that the replacement occurs on the 1st <p> element instead of the 2nd <p> element.

I thought this behavior difference was caused by the regex backend.

When running the following code in the Chrome developer console, the 1st <p> element is replaced.

const re = new RegExp("(<p)(?!.*\1)");
originalString = "<p>\nIf defined, it will override automated voltage discovery and apply the specified voltage to the headstage.\nIf left blank, an automated headstage detection algorithm will attempt to communicate with the headstage and\napply an appropriate voltage for stable operation. Because ONIX allows any coaxial tether to be used, some of\nwhich are thin enough to result in a significant voltage drop, its may be required to manually specify the\nport voltage.\n</p>\n<p>\nWarning: this device requires 5.5V to 6.0V, measured at the headstage, for proper operation. Supplying higher\nvoltages may result in damage.\n</p>\n";

const newString = originalString.replace(re, '<p style="margin-bottom:0;"');
console.log(newString);
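
One possible explanation for the result above, offered as a sketch rather than a confirmed diagnosis of docfx or Jint: inside a non-strict string literal, "\1" is a legacy octal escape for the character U+0001, so the pattern passed to RegExp never contains a backreference. In addition, "." does not match line breaks unless the s flag is set, so the lookahead only inspects the rest of the current line. The example below uses a shortened, hypothetical stand-in for the HTML in the report, and writes the pattern as a regex literal with [\s\S] in place of the dot. The names html, asReported, and lastOnly are illustrative only.

```javascript
// Shortened stand-in for the HTML in the report (hypothetical sample text).
const html = "<p>\nFirst paragraph.\n</p>\n<p>\nSecond paragraph.\n</p>\n";
const replacement = '<p style="margin-bottom:0;"';

// What the reported pattern effectively becomes: in a non-strict string
// literal "\1" evaluates to U+0001, written here explicitly as \u0001.
// "." also stops at line breaks, so the lookahead never sees the second <p>.
const asReported = new RegExp("(<p)(?!.*\u0001)");
const firstReplaced = html.replace(asReported, replacement);

// A regex literal keeps \1 as a backreference to group 1, and [\s\S]
// matches any character including newlines, without needing the s flag.
const lastOnly = /(<p)(?![\s\S]*\1)/;
const lastReplaced = html.replace(lastOnly, replacement);

console.log(firstReplaced); // style attribute lands on the first <p>
console.log(lastReplaced);  // style attribute lands on the last <p>
```

If the pattern has to be built from a string, doubling the backslashes ("(<p)(?![\\s\\S]*\\1)") should produce the same regex as the literal form.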