Open andrewcarver opened 2 years ago
P.S. On those screenshots, my themes-configuration was:
themes: [ "atom-light-ui" "one-dark-syntax" ]
P.P.S. I was relying on my screenshot filenames to identiy what was what -- not realizing the filenames would get changed!
So: The first screenshot is the view with the OLD regex; the next screenshot is the view with the NEW regex.
Hello,
as we have an issue with our CI, can you send the output of the tests launch?
Sure: I ran the tests on both of the versions I suggested above -- the one with the outer-wrapping capturing group, and the one without.
The results were the same for both, except that with the capturing group included, there was an extra "deprecated warning". Here are the full results, including that extra warning:
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\pathwatcher\build\Release\pathwatcher.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\oniguruma\build\Release\onig_scanner.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\superstring\build\Release\superstring.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\tree-sitter\build\Release\tree_sitter_runtime_binding.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\@atom\nsfw\build\Release\nsfw.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\git-utils\build\Release\git.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\scrollbar-style\build\Release\scrollbar-style-observer.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17976) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
..............................................................................................................................................................................................................................................................................................................................................................................
Finished in 20.741 seconds
366 tests, 2483 assertions, 0 failures, 0 skipped
The warning included here but NOT included when I tested without the extra capturing group, is that last one:
(node:17976) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
If we're not really wanting to capture the match inside this group, then the above warning seems more than enough reason to omit that extra capturing group. (We might even want to make ALL the remaining groups non-capturing.) Please let me know, if you'd like me to change my PR code to omit that group.
I'd like to add a tangential comment: My earlier screenshots had attribute-list-highlighting that was ... slightly incorrect. This was only because of the styling, not because of the markup: I was running, at the time, a custom grammar package of my own that 'include'd the main grammar files from your package (with just the regex tweaked); but I'd forgotten, as yet, to copy over your 'styles' folder!
For the above tests, though, I edited my installed copy of your original package (doing it the proper way -- by editing the quote-paragraph grammar-file and re-generating the main file). So, here's a better screenshot, which I took at that time, showing the correct attribute-list-coloring:
If you look close, you might notice that, within the block-attribute-list, the inline macro's opening bracket is marked up like the text surrounding it, but its closing bracket is not (separately) marked up at all, but appears as just text, like the opening and closing bracket of the block-attribute-list. I missed this anomaly at first: you have to look pretty close, even to spot it. So, I'm not worried about it -- it's a perfectly acceptable price to pay, IMHO (unless we can think of a neat fix for it).
P.S. One way to fix that little anomaly might be to use a recursive regex -- which Ruby started supporting in v. 2.0, it seems.
coffeescript is not ruby, it's not the same regex support.
Hmm. Well, as far as I'm aware, JavaScript (must less CoffeeScript?) supports neither recursive regex nor regex with subroutine calls, which would ba another way to approach this ...
Silly idea anyway, I guess: The problem is not with the regex we've fixed, but with the action downstream -- and specifically, I think, the "block-attribute-inner" macro, which our macro "include"s.
I've suggested two changes in a regex in the file "repositories/partials/block-attribute-inner-grammar.cson"
-- an alternate branch with a look-ahead, and a clear-typo fix.
Also, I removed that outer, capturing group that seemed extraneous in the earlier regex-fix.
WIth those regex-fixes in place, I got the following test-run:
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\pathwatcher\build\Release\pathwatcher.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\oniguruma\build\Release\onig_scanner.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\superstring\build\Release\superstring.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\tree-sitter\build\Release\tree_sitter_runtime_binding.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\@atom\nsfw\build\Release\nsfw.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\git-utils\build\Release\git.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:18632) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\scrollbar-style\build\Release\scrollbar-style-observer.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
..............................................................................................................................................................................................................................................................................................................................................................................
Finished in 17.457 seconds
366 tests, 2483 assertions, 0 failures, 0 skipped
I may be wrong, but it seems to me that these regex-fixes may solve the problem!
Hmm ... on second thought, that look-ahead is not QUITE robust enough: it could get fooled by an apostrophe (Fred\'s car
) that's not actually closing the attribute-value!
So maybe something like this would do it:
\\](?=[^',.#%]*'\\s*(,|\\]$))
I gave that a quick trial run, and it seems to work (on my limited sample):
Haven't yet run the test again, though.
Thoughts?
OK, one more suggestion for beefing up the look-ahead, then I shut up:
How about we also check, that any character just before the apostrophe (but, like the apostrophe, subsequent to the \\]
!) is NOT a backslash (\\\\
)?
\\](?=[^',.#%]*?[^\\\\]?'\\s*(,|\\]$))
On a quick test, this also seems to work...
So with that, the whole regex becomes: (?<=\\{|,|\\.|#|\"|'|%)((?:[^\\],.#%]|\\](?=[^',.#%]*?[^\\\\]?'\\s*(,|\\]$)))+)
Here's how Regex Buddy analyzes this regex (but please note that in Regex Buddy, on a Windows, CR/LF machine, just before any $
I have to put \r*
, to make the regex work):
(?<=\{|,|\.|#|\"|'|%)
\{
\{
,
,
\.
\.
#
#
\"
\"
'
'
%
%
((?:[^\],.#%]|\](?=[^',.#%]*?[^\\]?'\s*(,|\]\r*$)))+)
(?:[^\],.#%]|\](?=[^',.#%]*?[^\\]?'\s*(,|\]\r*$)))+
+
[^\],.#%]
[^\],.#%]
\]
,.#%
\](?=[^',.#%]*?[^\\]?'\s*(,|\]\r*$))
\]
(?=[^',.#%]*?[^\\]?'\s*(,|\]\r*$))
[^',.#%]*?
*?
[^\\]?
?
'
\s*
*
(,|\]\r*$)
,
,
\]\r*$
\]
\r*
*
$
Well, it didn't take me long to come up with an example breaking that.
But WAIT A SECOND -- maybe the reason I'm having to work so hard to "beef up" the lookahead, is that I'm "looking ahead" for a too-limited set of things:
Instead of checking to make sure my \\]
is followed by an apostrophe (of a certain class), don't I want to accept and match the \\]
if it's followed (on the same line) by ANYTHING that's not a space-character AND not the end of the line -- that is, if the \\]
is not the one closing the attribute-list??
If so, then this look-ahead should solve the problem:
(?=[ \\t\\f]*[^\\s])
-- which puts us here, for the complete regex:
(?<=\\{|,|\\.|#|\"|'|%)((?:[^\\],.#%]|\\](?=[ \\t\\f]*[^\\s]))+)
Test results for this last commit:
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\pathwatcher\build\Release\pathwatcher.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\oniguruma\build\Release\onig_scanner.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\superstring\build\Release\superstring.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\tree-sitter\build\Release\tree_sitter_runtime_binding.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\@atom\nsfw\build\Release\nsfw.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\git-utils\build\Release\git.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:17956) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\scrollbar-style\build\Release\scrollbar-style-observer.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
..............................................................................................................................................................................................................................................................................................................................................................................
Finished in 17.022 seconds
366 tests, 2483 assertions, 0 failures, 0 skipped
Screen-shot is same as before.
It might be worth mentioning, that the pattern "alternate-branch-with-lookahead" that I applied to the last regex (to allow it to match a contained, inline macro's closing bracket) affords us an alternative fix for the first regex, as well -- one which would avoid lazy quantification (with the back-tracking it introduces).
Here's the current state of the regex-replacement we've substituted into #quote-paragraph -- followed here by the alternate we could use, avoiding lazy quantification:
^\\[(quote|verse)(?:(?:,|#|\\.|%)([^,]+?))*\\]$
^\\[(quote|verse)(?:(?:,|#|\\.|%)([^,\\]]|\\](?=[ \\t\\f]*\\S))+)*\\]$
In trying both of them, my eyes didn't notice any difference in efficiency. But it MIGHT be that a lLITTLE less risk of runaway backtracking attaches to the latter regex ... ??
Just in case Regex Buddy's verbalization of that last regex would be helpful
^
\[
(quote|verse)
quote
quote
verse
verse
(?:(?:,|#|\.|%)(?:[^,\]]|\](?=[ \t\f]*\S))+)*
*
(?:,|#|\.|%)
,
,
#
#
\.
\.
%
%
(?:[^,\]]|\](?=[ \t\f]*\S))+
+
[^,\]]
[^,\]]
,
\]
\](?=[ \t\f]*\S)
\]
(?=[ \t\f]*\S)
[ \t\f]*
*
\t
\f
\S
\]
$
Created with RegexBuddy
I'm fine with giving this a try, and I appreciate the detailed analysis.
I have one question, though. What is \f
in there for? As far as I know, that's not a character AsciiDoc should ever be considering. The only valid space-like characters in AsciiDoc are space and tab (i.e., [ \\t]
).
as we have an issue with our CI, can you send the output of the tests launch?
@ldez we should probably migrate to GitHub Action, any objection?
I have one question, though. What is \f in there for? As far as I know, that's not a character AsciiDoc should ever be considering.
@mojavelinux Lol. Yes, it's a bit silly in this context, isn't it? I'd just noticed that it's included in the whitespace category \s and I threw it in for completeness. But I'm happy to omit it.
Just in case anyone's intrigued, at all, by my last suggestion:
... that the pattern "alternate-branch-with-lookahead" that I applied to the last regex ... affords us an alternative fix for the first regex, as well -- one which would avoid lazy quantification (with the back-tracking it introduces)
I ran the specs on it:
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\pathwatcher\build\Release\pathwatcher.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\oniguruma\build\Release\onig_scanner.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\superstring\build\Release\superstring.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\tree-sitter\build\Release\tree_sitter_runtime_binding.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\@atom\nsfw\build\Release\nsfw.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\git-utils\build\Release\git.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:19432) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\scrollbar-style\build\Release\scrollbar-style-observer.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
..............................................................................................................................................................................................................................................................................................................................................................................
Finished in 16.644 seconds
366 tests, 2483 assertions, 0 failures, 0 skipped
The main arguments I can see for it (without doing some kind of efficiency-analysis above my pay-grade) are that it makes the regex-code somewhat easier to understand:
I'm definitely intrigued. And that sounds like great news. Is there anything else you think needs to be solved before this PR is considered ready?
Nope. I'll go ahead and make that last, "intriguing" change, then -- in a little while ... unless you say "hold up" ...
It all looks ready, now, to me ...
Is there anything else you think needs to be solved before this PR is considered ready?
@mojavelinux I replied in the negative; but now I've got to back-pedal:
Someone please correct me, if I'm wrong: but I don't believe we need the capturing group that I let stand in the regex for block-attribute-inner-grammar.cson
(nor is there need to replace it with a non-capturing group). After all, the only match to which the code there pays attention is (I think) the overall match (match 0):
{
comment: "attributes"
name: "markup.meta.attribute-list.asciidoc"
match: "(?<=\\{|,|\\.|#|\"|'|%)((?:[^\\],.#%]|\\](?=[ \\t]*\\S))+)"
captures:
"0":
patterns: [
{
include: "#attribute-reference"
}
]
}
Any reason we shouldn't delete that capturing group -- for efficiency?
F.y.i., the spec-run on this deletion:
C:\Users\acarver\.atom\packages\language-asciidoc>atom --test spec
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\pathwatcher\build\Release\pathwatcher.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\oniguruma\build\Release\onig_scanner.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\superstring\build\Release\superstring.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\tree-sitter\build\Release\tree_sitter_runtime_binding.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\@atom\nsfw\build\Release\nsfw.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\git-utils\build\Release\git.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
(node:252) Electron: Loading non-context-aware native module in renderer: '\\?\C:\Users\acarver\AppData\Local\atom\app-1.57.0\resources\app.asar.unpacked\node_modules\scrollbar-style\build\Release\scrollbar-style-observer.node'. This is deprecated, see https://github.com/electron/electron/issues/18397.
..............................................................................................................................................................................................................................................................................................................................................................................
Finished in 21.714 seconds
366 tests, 2483 assertions, 0 failures, 0 skipped
I'm happy to proceed. However, I'm not the one who merges to this repo.
Description
The regex for the quote-paragraph rule is not as robust as it could be -- and also, either not as efficient as it could be, or else (if you're REALLY wanting to refer to captured groups) capturing one group incorrectly.
Here's the regex-segment that is in common between the regexes on lines 35 and 37 of the file ("quote-paragraph-grammar.cson"):
^\\[(quote|verse)((?:,|#|\\.|%)([^,\\]]+))*\\]$
The robustness problem is that as it stands, the regex will not match any block-attribute-list that contains, itself, an inline macro -- such as a URL macro (see screen-shot) or a bibliographic-citation macro (see the below syntax example). And no, it doesn't help, if you escape the closing bracket of the inline macro.
This problem is located in the character-group in the regex: In order to get the character-class not to run away with the rest of the quote-block, there was included in it the closing ']' of the block-attribute-list; but one could have achieved the same thing better by making the quantifier "lazy":
instead of
([^,\\]]+)
one could just write([^,]+?)
With that change, the regex matches also these inline-macro-containing quote-block-attribute-lists (see screen shots).
As for the other problem: If you don't really need to capture any groups (i.e., you're not gong to refer to them, as is not done inside the present macro), then the regex will run more efficiently if the capturing groups are made non-capturing groups.
On the other hand, if we are wanting to capture those groups, then we need to fix the second capturing group:
((?:,|#|\\.|%)([^,\\]]+))*
Putting a quantifier on a capturing group, like this, leads to the problem which "Regex Buddy" explains thus:
I took the first, safer route for this problem, in my code-fix, in case we do want to capture:
((?:(?:,|#|\\.|%)([^,]+?))*)
If capturing is superfluous, though, then we can just drop the outermost (the capturing) group -- like so:
(?:(?:,|#|\\.|%)([^,]+?))*
Syntax example
Screenshots