Currently, comments are end-of-line comments introduced by a semi-colon (;). They are filtered out at the lexer level: everything that follows a semi-colon, up to the end of the line, is ignored by the lexer. Semi-colons inside a string are not treated as comment markers.
The proposal is to extend this mechanism to cover inline comments and multi-line comments as well.
To that end, the lexer should handle semi-colons outside of a string in the following way:
a single semi-colon (;) should be treated as it is at present, commenting out the rest of the line;
a double semi-colon (;;) should comment out everything up to the next double semi-colon within the same line, or up to the end of the line if none is found (this ensures that most existing code that already uses double semi-colons is not affected by the proposal);
a triple semi-colon (;;;) should comment out everything up to the next triple semi-colon found in the document, or up to the end of the document if none is found. Note that the closing triple semi-colon must not be on the same line as the opening one, but on a later line. This clearly distinguishes multi-line comments from single-line comments;
markers of higher level (4 semi-colons ;;;;, 5 semi-colons ;;;;;, etc.), in case they are needed, should behave like the triple marker: each opens a multi-line comment that runs down to the next marker of the same level (4 semi-colons only match another 4 semi-colons), or down to the end of the document if no such marker is found. They can be used to override any comments of lower level that fall in between.
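The rules above can be sketched as a small comment stripper. This is only an illustration in Python, not the Red lexer: the names `find_run` and `strip_comments` are hypothetical, only double-quoted strings with the `^` escape are recognised (brace strings and char literals are left out), and a marker is assumed to be a maximal run of semi-colons whose level is the run length.

```python
def find_run(text: str, start: int, level: int) -> int:
    """Index of the first run of exactly `level` semi-colons at or after `start`, else -1."""
    i = start
    while i < len(text):
        if text[i] != ";":
            i += 1
            continue
        j = i
        while j < len(text) and text[j] == ";":
            j += 1
        if j - i == level:
            return i
        i = j  # a run of another length is not a match: skip it entirely
    return -1


def strip_comments(source: str) -> str:
    """Blank out comments following the proposed rules, keeping the line count."""
    out = []
    block_level = 0  # level (>= 3) of the open multi-line comment, 0 when none
    for line in source.split("\n"):
        i = 0
        if block_level:
            # Inside a multi-line comment only a marker of the same level matters.
            close = find_run(line, 0, block_level)
            if close < 0:
                out.append("")
                continue
            i = close + block_level
            block_level = 0
        kept = []
        in_string = False  # assumption: "..." strings never span lines
        while i < len(line):
            ch = line[i]
            if in_string:
                kept.append(ch)
                if ch == "^" and i + 1 < len(line):
                    kept.append(line[i + 1])  # escaped character, e.g. ^"
                    i += 2
                    continue
                if ch == '"':
                    in_string = False
                i += 1
            elif ch == '"':
                in_string = True
                kept.append(ch)
                i += 1
            elif ch != ";":
                kept.append(ch)
                i += 1
            else:
                j = i
                while j < len(line) and line[j] == ";":
                    j += 1
                level = j - i
                if level == 1:
                    break  # end-of-line comment, as today
                if level == 2:
                    close = find_run(line, j, 2)
                    if close < 0:
                        break  # unclosed ;; behaves like a single ;
                    i = close + 2
                    continue
                # Level 3 and above: the closing marker must be on a later
                # line, so the rest of this line is always commented out.
                block_level = level
                break
        out.append("".join(kept))
    return "\n".join(out)
```

Commented-out lines are emitted as empty lines so that line numbers stay stable; a real lexer would more likely skip the characters without producing any output.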
Here follows a script with an illustration of a possible use of these extended comments :
; Single line comments unchanged, be it at the beginning of a line
print "a" ; or at its end
print ;; "b1 commented" ;; "b2"
;; double semi-colon behaves like end-of-line comments when left unclosed
;;; a multi-line comment
print "c" ; first line
print "d" ; second line
;;; ; ends there - note that an additional ; is needed here to pacify whatever follows ;;;
; inline or multi-line comments work as well within a data structure
a: #{
first: [ val1 val2 ;; val3 ;; val4 val5 ] ; val3 removed
; commenting second, third, fourth but not fifth
;;; second: [ val1 val2 ]
third: [ val1 val2 ]
fourth: [ val1 val2 ]
;;; fifth: [ val1 val2 ]
}
; or with any dialect
view [
text "Hello" ;; bold ;; gray
;;;
button "Btn1"
button "Btn2"
;;;
button "Btn3"
]
;;;; multi-line comments of higher order, here just remove whatever follows
print "Not this"
;;;
print "Not that"
;;;
print "Nor this"
Comment markers do not have to be nested or balanced:
the regular semi-colon dismisses whatever follows it on the line, even if that is another comment marker: typically, if a double semi-colon or a longer marker follows a single semi-colon on the same line, it is ignored (this is the current behaviour);
the double semi-colon, if not balanced, behaves like the regular semi-colon and, when balanced, dismisses whatever lies between the two markers. Likewise, any other comment marker that falls in between is ignored;
the triple semi-colon, if not balanced, dismisses everything down to the end of the document and, if balanced, everything between the two markers.
Expected benefit: a means of inline or multi-line commenting that:
works straightaway, now and in the future, with any data structure and any dialect;
is as unintrusive as possible: you don't need to restructure the code to add a comment, just enclose whatever needs to be commented out between the appropriate markers;
does not consume any new punctuation or syntactic marker that might be useful to others or in the future.
What this proposal is not:
a new implementation of comments: it is intended to cause as few regressions as possible (see below);
an annotation or documentation mechanism for Red code or Red data structures. Currently, comments introduced by the semi-colon are discarded by the lexer and never reach the next (syntactic) level. That is the expected behaviour here as well. A more modern comment scheme, if one is needed, is outside the scope of this proposal.
Such a change, though highly conservative, might still affect existing code. This may happen in the following situations:
(1) ;;; is already used in an existing comment, for instance for presentation purposes;
(2) ;; is used twice in the same existing comment, in particular when commenting out another comment. The pattern is: ;; comment1 ;; comment2. In that situation the second comment would no longer be commented out;
finally, a problem may arise when using the block-comment feature of an IDE that is not well behaved: a badly behaved feature prepends a bare semi-colon to the line to be commented. If the line already starts with a comment marker, the resulting longer marker may have adverse effects. A well-behaved feature instead adds a semi-colon followed by a space, which keeps the current behaviour. This is how the feature is implemented in Visual Studio Code, for instance. This behaviour should be generalised.
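The block-comment issue can be made concrete with a short Python sketch. The helpers `leading_marker_level` and `comment_out` are hypothetical names used only for illustration; they model an editor that prepends a prefix to a line and report which marker level the line would then start with under the proposed rules.

```python
def leading_marker_level(line: str) -> int:
    """Number of semi-colons that open the line, ignoring indentation."""
    stripped = line.lstrip()
    return len(stripped) - len(stripped.lstrip(";"))


def comment_out(line: str, prefix: str) -> str:
    """Model an editor's block-comment action: prepend `prefix` to the line."""
    return prefix + line


already_commented = ";; an existing note"

# A bare ";" merges with the existing marker and yields ";;;", which the
# proposal would read as the opening of a multi-line comment.
bare = comment_out(already_commented, ";")

# "; " keeps the new marker separate: the line starts with a single ";"
# and remains an ordinary end-of-line comment.
spaced = comment_out(already_commented, "; ")

print(bare, "-> level", leading_marker_level(bare))
print(spaced, "-> level", leading_marker_level(spaced))
```

With the bare prefix the marker level changes from 2 to 3, while the spaced prefix always produces a level-1 marker whatever the line already contains.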
The table below shows, for various repos, the number of lines and files that would be affected by the proposal, that is, the lines where condition (1) or (2) is met. The same computation was made against the legacy Rebol scripts (the "Rebol script library" row), using the (.r) file filter instead of (.red -o .reds). That gives a feel for Rebol coding habits in general. This is the worst case, in which at most 0.12% of the code lines are affected by the proposal, and those lines can easily be corrected by adding "; " in front of them.
| Repo | Nb lines (a) | Case (1): lines with ;;; (b) | Case (2): lines with ;; (c) | Nb files (d) | Nb files affected (e) | % lines affected (f) | % files affected (g) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| red | 231,295 | 2 | 0 | 404 | 1 | 0.001% | 0.25% |
| code | 97,214 | 0 | 0 | 205 | 0 | 0% | 0% |
| community | 16,312 | 4 | 2 | 54 | 2 | 0.04% | 3.7% |
| VScode-extension | 4,867 | 0 | 0 | 9 | 0 | 0% | 0% |
| Rebol script library | 339,759 | 347 | 75 | 1,242 | 49 | 0.12% | 3.9% |
The values in the table are computed with Unix commands. Basically, each command collects all .red and .reds files in the hierarchy, then applies a grep to them that retrieves the lines that might be troublesome. Further down the pipe, wc counts how many such lines were detected.
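A comparable survey can also be sketched in Python. This is not the pipeline used to produce the table: the function names and the two regular expressions are assumptions about what conditions (1) and (2) look like on a line, semi-colons inside strings are not excluded, and a line containing ;;; is counted under case (1) only.

```python
import re
from pathlib import Path

# Assumed patterns: case (1) is any line containing ";;;", case (2) is a line
# containing two separate runs of exactly two semi-colons.
TRIPLE = re.compile(r";;;")
DOUBLE_TWICE = re.compile(r"(?<!;);;(?!;).*(?<!;);;(?!;)")


def classify_line(line: str) -> int:
    """Return 1 or 2 for the matching case, 0 when the line is not affected."""
    if TRIPLE.search(line):
        return 1
    if DOUBLE_TWICE.search(line):
        return 2
    return 0


def survey(root, suffixes=(".red", ".reds")):
    """Count lines and files under `root` that match case (1) or case (2)."""
    totals = {"lines": 0, "case1": 0, "case2": 0, "files": 0, "files_affected": 0}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        totals["files"] += 1
        affected = False
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            totals["lines"] += 1
            case = classify_line(line)
            if case:
                totals["case%d" % case] += 1
                affected = True
        if affected:
            totals["files_affected"] += 1
    return totals
```

Calling `survey(".")` from the root of a checkout returns the raw counts, from which the two percentages follow as (case1 + case2) / lines and files_affected / files; passing `suffixes=(".r",)` would cover Rebol scripts instead.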