Line numbering for the Random variables and Global variables fields

dbauer-ets commented 2 years ago

Hello Philipp,

I started working on a small script for line numbering for the Random variables and Global variables fields. So far it's been pretty easy, but with programming you never know and I could get stuck at any time. Anyway, here's what it might look like:

GitHub_20221114_2340

dbauer-ets commented 2 years ago

Hello Philipp,

Here is a standalone working JavaScript code for the Global variable field (the script for the Random variables field will be identical). It is not 100% but close and would be, in my opinion, good enough for production.

It is not integrated into the question code, because it would take me more time as I am still not that good with PHP and I may need your help.

It was fun to do since I believe you told me that it would be very hard to do. :-). In fact, it is not very complicated.

You can try the code by putting it in an HTML (now Text) block in the quiz.

<!--
© 2022 Dominique Bauer
CC0 1.0 Universal Public Domain Dedication
-->

<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script>

// $textareatext: Contents of the Global variable field
// $textareasplit: Array containing the contents of each line of the Global variable field
// n: line counter
// linenumbers: array containing the line numbers

var $textareatext = $("div#fitem_id_varsglobal textarea").text();
var $textareasplit = $textareatext.split("\n");
var n = 0;
const linenumbers = [];

$(document).ready(function() {
    $textareasplit.forEach(line_numbers_define);
    $("div#fitem_id_varsglobal textarea").css({"line-height":"22px","padding":"0 0 0 7px","font-size":"14px","margin":"0"});
    $("div#fitem_id_varsglobal div:eq(0)").removeClass('col-md-3').addClass('col-md-2');
    $("div#fitem_id_varsglobal div:eq(1)").css("padding-left","0");

    $('<div id="line_numbers" style="color: #aaa;border: 1px solid transparent; width:50px;text-align:right;font-size:14px;padding:0 7px 0 0; line-height:22px;overflow: hidden;margin:0;"></div>').insertAfter("div#fitem_id_varsglobal div:eq(0)");

    $("div#line_numbers").html(linenumbers);
    setInterval(dynlineheight, 10);

    var $divs = $('#id_varsglobal, #line_numbers');
    var sync = function(e){
        var $other = $divs.not(this).off('scroll'), other = $other.get(0);
        var percentage = this.scrollTop / (this.scrollHeight - this.offsetHeight);
        other.scrollTop = percentage * (other.scrollHeight - other.offsetHeight);
        setTimeout( function(){ $other.on('scroll', sync ); },10);
    }
    $divs.on( 'scroll', sync);
});

$("div#fitem_id_varsglobal textarea").keyup(function(){    
    linenumbers.splice(0, linenumbers.length); // Remove all array elements.
    $textareatext = $(this).val(); // Get contents of field on each keyup.
    $textareasplit = $textareatext.split("\n");
    n=0;  
    $textareasplit.forEach(line_numbers_define);
    $("div#line_numbers").html(linenumbers);
});

function dynlineheight() {
    // Make height of #line_numbers equal to that of #fitem_id_varsglobal textarea
    var linenumbersheight = $("div#fitem_id_varsglobal textarea").height();
    $("div#line_numbers").css("height",linenumbersheight+2)
}

function line_numbers_define(item, index) {
    var item = item.trim();  // Remove white spaces (beginning and end of string).
    var pattern = /^#/; // First character is #.
    var result = item.match(pattern);
    if ( item == "" || result == "#" ) { // If blank line or comment.
        linenumbers[index] = "<br>"
    } else {
        n = n + 1;
        linenumbers[index] = n+"<br>"
    }
}
</script>

PhilippImhof commented 2 years ago

Oh beautiful! This looks nice and I think it could be an interesting improvement for people who have large definitions of global and/or random variables.

There is one thing to keep in mind: As long as there is no semicolon, the parser considers everything as "one line", so if we have the following:

# bla
a=[1,
2,
3];
b=4+;

The error message will be Try evalution error! 2: Some expressions cannot be evaluated numerically.

I had an alternative approach in mind: After the validation (which will done instantly after leaving the field after the update from PR #47), we could simply highlight the part of the definition where we think the error has happened. That would be much more lightweight IMHO and less likely to break with layout changes, e.g. other themes.

What do you think? If you want to, I can take your script as a base and try to tweak it a bit so that it can deal with the problem mentioned above. Also, I would then rewrite it, because we are not supposed to depend on jQuery. (But that's not a big deal, I suppose.)

PhilippImhof commented 2 years ago

Ah and I forgot to mention the other case where one has a=1; b=2; c=xxx; In this case the message will say the error is on line 3.

PhilippImhof commented 2 years ago

I fixed both of these cases. Please let me know if you prefer to stick with this approach or with the idea of selecting the range where the error (most likely) is.

If we go ahead with this variant, we have to do so very carefully as to not break the existing layout, especially not with the default theme. And to have it working with all major browsers and versions. This will be the toughest part, especially because it is virtually impossible to have automated testing for it. (Automated testing is possible for the main functionality though.)

PhilippImhof commented 2 years ago

What do you think of this? (The right side is not good yet, but I'll find a solution.)

dbauer-ets commented 2 years ago

people who have large definitions of global and/or random variables.

What is 'large'? I have simple questions with over 100 lines of code. I'm sure other people may have much longer codes.

There is one thing to keep in mind: As long as there is no semicolon, the parser considers everything as "one line"...

You're right, I hadn't thought of this one.

a=1; b=2; c=xxx;

I didn't think of this one either. Thanks.

I fixed both of these cases.

How did you do, just replace "\n" with ";" or something a bit more complex?

I had an alternative approach in mind...

I don't think we have to choose one or the other. The two approaches are complementary.

Highlighting where there is an error is useful for debugging.
Line numbering also makes it easier to find your way around the code when writing it. Some people may also feel more confident if they see line numbers.

If we don't have line numbers, then mentioning the line number in the message e.g. "Try evalution error! 2:..." is still incomplete. We refer to the line number, but the user has to count them him/herself. Remove the reference from the error message? On the other hand, all programming editors have line numbers...

That would be much more lightweight...

The line numbering code might be a bit bigger, but hopefully not big enough to be a hindrance to using it.

we are not supposed to depend on jQuery

I know. In my opinion, this is nonsense.

But that's not a big deal, I suppose.

I don't think there is much jQuery in the code. It can be rewritten in plain JavaScript, but jQuery is so much easier!

... not break the existing layout

The code only inserts a div between two existing div's on the same line. I doubt it will break the layout.

especially not with the default theme

I've seen many issues with other themes. The code works with Boost and Classic, and I wouldn't worry about other themes.

working with all major browsers and versions...

It works with the latest version of Chrome, FireFox and Microsoft Edge. I don't have an Apple computer to check but it probably works with Safari too.

automated testing for it.

How many tests do we need? This is basically just formatting.

I would then rewrite it

Philipp, thank you for your help.

I just saw a small bug. When the window is narrow, the Variables field moves to the next line but not the line numbering. However, this shouldn't be too hard to fix.

dbauer-ets commented 2 years ago

What do you think of this?

I think it looks great. If possible, I would replace:

9
9
9

with:

9
.
.

PhilippImhof commented 2 years ago

What is 'large'? I have simple questions with over 100 lines of code. I'm sure other people may have much longer codes.

Wow... I hardly ever get over 25 lines with the questions I create. I would say everything above 10-15 lines is large enough for line numbers to be useful.

How did you do, just replace "\n" with ";" or something a bit more complex?

I added two things: one to not increase the counter if a line does not contain a semicolon and another one to count additional semicolons and add them to the increase step. There's however one case left that I haven't covered yet: semicolons can also appear in strings or comments. These should, of course, not be counted.

I don't think we have to choose one or the other. The two approaches are complementary.

You are absolutely right. And they don't necessarily have to be introduced at the same time. Line numbers are definitely one big step forward. As you write, I always found it strange to have "line numbers" (putting it in quotes, because it's not really the line in many cases) in the error message, but the users have to count by themselves.

The code only inserts a div between two existing div's on the same line. I doubt it will break the layout.

I meant this:

screenshot

The textarea is not aligned anymore. But that can be resolved; the screenshot in my other comment almost gets it straight.

How many tests do we need? This is basically just formatting.

IMHO every important aspect of every new functionality needs an automated test -- don't worry, I will write them. There will probably be certain things we cannot test, e.g. the correctness of the layout. But people are free to submit bug reports with screenshots. Also, the line numbers are not a crucial functionality; even if they don't show up as they should, users can still work with their questions.

I will probably not be able to finish this before the next release which planned in ~ 2 weeks. But I'll keep working on it and open a PR once I have written the initial version.

PhilippImhof commented 2 years ago

Just one more quick idea: What do you think if we just add the line numbers on-the-fly whenever there is a validation error, and remove them again, as soon as the problem is fixed? That has the following advantages:

if there is a layout problem, it is just temporary and thus less important
the line numbers do not change while the users write (we must have that change because of the way the parser counts the lines, but it might seem confusing to the user
there is less updating involved (line numbers, scrolling), so better performance

As the user will only need numbered lines for debugging, it won't hurt if they are not there all the time. And if we show them only when there is an error, we could mark the problematic line numbers in red color.

dbauer-ets commented 2 years ago

There's however one case left that I haven't covered yet: semicolons can also appear in strings or comments.

You're right again! Luckily, finding ';' within quotes shouldn't be too difficult with RegExp.

it's not really the line in many cases

What the error message refers to is, in reality, the "statement" number (https://www.w3schools.com/js/js_statements.asp), not the line number. If we keep the error message as it is, then

are actually 'statement' numbers. Imo this is fine, although usually editors show 'line' numbers.

It would be much simpler to show 'line' numbers, i.e. but just checking the "\n" characters as it is usually done, but then we would need to adjust the error message, which presently gives the statement number, to give the line number.

The textarea is not aligned anymore...

As long as both "Random variables" and "Global variables" field are aligned, I see no big problem. Note that the Random variables field should also be numbered, but the treatment is identical to that for the Global variables field.

I am more concerned about the statement numbers staying attached to the variables field on narrow screens.

But I'll keep working on it

Did you open a branch for this?

I will probably not be able to finish this before the next release...

Not important, we'll implement this when it is ready.

What do you think if we just add the line numbers on-the-fly whenever there is a validation error, and remove them again, as soon as the problem is fixed?

I don't see the advantages, imo:

there is no layout problem.
on the contrary, it is a good thing the statement numbers change while the uses write. Of course 'line' numbers, if we use them, would not change.
optimal performance while users are editing the questions is not of utmost importance.

we could mark the problematic line numbers in red color

Why not, if it's not too difficult.

Here's an excerpt from a recent post (https://moodle.org/mod/forum/discuss.php?d=440588) on the Moodle forum:

Previewing the output is also a hit-or-miss event. Feels weird to have 103 revisions on one question because you are trying to figure out the correct function to use in the Global Variables textarea. Btw, why is such an important component just a textarea? I think only a full online IDE will justify the importance of this step in the quiz authoring process!

This person is expecting a full IDE for the two Variables fields. Displaying statement numbers is far from an IDE, but at least a small improvement over what is displayed now.

If you want to do something for this coming version, you could simply highlight the portion of the code where the error is. But preferably it would have to be the correct location, not

where the error (most likely) is.

That will give us the time to decide if we want to show the line numbers or the statement numbers, or something else for example the Atto editor has line numbers in HTML mode. Maybe we could use a simplified (w/o formatting) Atto editor, although I don't see the advantage since adding 'line' numbers is very easy.

dbauer-ets commented 2 years ago

I think the following would be a good approach:

Show the line numbers, not the statement numbers, next to the variables field. This is what is usually done and it will also simplify the code.
Highlight the line where the statement with an error is located.
Remove the statement number in the error message, as it would be confusing if it doesn't match the line number, and also it would not be really necessary anymore.

Highlighting the line where the statement with an error is located implies that we can correlate statement numbers with line numbers. If you haven't already solve this, I think it would be relatively easy to do by counting the number of non-string ';' on each line (ending with '\n').

That's it. Problem solved.

PhilippImhof commented 2 years ago

it's not really the line in many cases

What the error message refers to is, in reality, the "statement" number

Yes, exactly. And in many cases the line number and the statement number are not the same. They are, however, until the occurrence of the first comment in cases where each statement begins on a new line and does not span several lines.

It would be much simpler to show 'line' numbers, i.e. but just checking the "\n" characters as it is usually done, but then we would need to adjust the error message, which presently gives the statement number, to give the line number.

IMHO it is not worthwhile, because the parser should be completely rewritten anyway.

As long as both "Random variables" and "Global variables" field are aligned, I see no big problem.

They also have to be aligned with all the other fields of the form, of course. :)

I am more concerned about the statement numbers staying attached to the variables field on narrow screens.

No worries, I have found a solution for that.

Did you open a branch for this?

Not yet. I am still experimenting and trying out several variants.

there is no layout problem.

optimal performance while users are editing the questions is not of utmost importance.

Hmmmm.... I think we agree to disagree for that point. :) Performance must be good enough to avoid any kind of lags during input. The line numbers' scrolling must be absolutely in sync with the text's. But those problems can be solved. They almost are, actually.

Here's an excerpt from a recent post (https://moodle.org/mod/forum/discuss.php?d=440588) on the Moodle forum:

Previewing the output is also a hit-or-miss event. Feels weird to have 103 revisions on one question because you are trying to figure out the correct function to use in the Global Variables textarea. Btw, why is such an important component just a textarea? I think only a full online IDE will justify the importance of this step in the quiz authoring process!

With PR #47 the fields will be validated immediately, so the user does not have to save the question anymore just to be able to see possible errors.

This person is expecting a full IDE for the two Variables fields. Displaying statement numbers is far from an IDE, but at least a small improvement over what is displayed now.

I'd say that a full IDE (whatever that is supposed to mean...) is a bit too much to ask. But they are of course free to submit a PR :)

That will give us the time to decide if we want to show the line numbers or the statement numbers, or something else for example the Atto editor has line numbers in HTML mode. Maybe we could use a simplified (w/o formatting) Atto editor, although I don't see the advantage since adding 'line' numbers is very easy.

Atto is going to disappear. Moodle 4.1 will make TinyMCE the default editor, although not with the first release. My last variant goes in the direction you mentioned. It simulates an editor with line numbers incorporated. That solves the problem of line numbers being detached and also fixes the lag in scrolling. I'm sure you are going to like it...

Show the line numbers, not the statement numbers, next to the variables field. This is what is usually done and it will also simplify the code.

I'm not quite sure I get your point. What is the use in showing line numbers if the parser does not work with lines and thus line numbers don't show up in the error message? When defining my variables, I could not care less about which line I am currently on. That only becomes important when there is a problem that I need to debug.

Remove the statement number in the error message, as it would be confusing if it doesn't match the line number, and also it would not be really necessary anymore.

So it would probably be better to indicate statement numbers, especially thinking of multi-line statements. The error might be on the third line of the statement. We have generally no way to find out where in the statement the error occurred.

Highlighting the line where the statement with an error is located implies that we can correlate statement numbers with line numbers. If you haven't already solve this, I think it would be relatively easy to do by counting the number of non-string ';' on each line (ending with '\n').

The parser code is complicated and hardly maintainable, to say the least. I'd rather not touch it, except for replacing it entirely. To outline it briefly, the parser detects an error while working on one single assignment. It then combines the error message (from the Exception) with the number of the assignment (and throws a new Exception). But at that point, it has absolutely no idea of the whole rest of the input. So in order to find the line range where the error happened, one has to go back and match the problematic expression with the entire input.

So, yes, we can probably correlate the statement and line numbers for many cases. But IMHO it's much easier to indicate statement numbers and leave the parser and its error messages as they are. I liked your idea of having just a dot for subsequent lines of the same statement; that looks nice.

PhilippImhof commented 2 years ago

So in order to find the line range where the error happened, one has to go back and match the problematic expression with the entire input.

Oh, and I forgot to mention that the expression has been altered in between, so when the error happens, we do not have the original expression anymore... To give two examples: e=[1,2,"A"]; becomes @3=[@1,@2,@0]; and e=[1,2,3]; f=e[4,5] becomes @5=[@0,@1,@2]; @6=@7[@3,@4]. So all these replacements would have to be taken back one by one...

dbauer-ets commented 2 years ago

Hello Philipp,

I think we could aim for something like shown below. This could be achieved very simply by just numbering the lines and highlighting the row where there is an error. No need to touch the question parser.

Of course, if you have something you think is better, then go for it. :-)

GitHub_20221117_1649

PhilippImhof commented 2 years ago

I suggest we do the following:

have statement numbers instead of line numbers in a "gutter" next to the editor; basically like you did in your first idea where you left out the comments
leave the parser and its error messages as they are; users can attribute the error easily to the problematic statement when they are numbered
additionally, use the error message (with its statement number) to match it back to the problematic part on the client side and highlight the numbers in the gutter

I still do not fully understand the use of line numbers, if no references are made to them in an error message or the like. What advantage do you see in line numbers vs. statement numbers, other than that IDEs generally use the former. (They do so, because they relate error and warning messages to lines rather than statements.) I only ever look at line numbers when I try to fix an error.

This could be achieved very simply by just numbering the lines and highlighting the row where there is an error.

I would like to relativise that a bit: since the <textarea> field does not provide a functionality to highlight rows as shown in your pictures, we have to "emulate" an editor for this to work. This can be done by creating a <div> with contenteditable=true. (That's what I did.) Such editors have the downside that there are different ways of accessing their content and they do not all work the same way in every browser. That means we can get different results (especially w.r.t. line breaks) depending on the browser. This can, of course, be sorted out. But it leads me to the conclusion that it is not a "very simple" task, because it basically boils down to writing a custom HTML component.

There are a few things left to do before the next release. But once that is done, I'll open a PR and show you some intermediate results.

dbauer-ets commented 2 years ago

Line numbers would have no use except being pretty. If your are comfortable with statement numbers, then let's go for them. Highlighting the statement numbers in the gutter is fine with me.

I did not actually try to highlight lines in the textarea. If I do, I would try to find an easy workaround... if there is one.

:-)

dbauer-ets commented 2 years ago

In fact, in addition to being pretty, line numbers also serve as a reference. For example, I can ask you to check line 53, whether it is a line of code or a comment. It is also more standard and therefore perhaps clearer for users. In my opinion, it would probably be better to use line numbering rather than statement numbering. However, if it is easier for the moment to use statement numbering, I have, as I said, no objection.

dbauer-ets commented 2 years ago

This can be done by creating a <div> with contenteditable=true. (That's what I did.)

Philipp,

Do you intend to actually use <div contenteditable="true">?

PhilippImhof commented 2 years ago

I haven't decided yet. It is among several possibilities I am currently evaluating. I will keep you posted, as soon as I start the development.

dbauer-ets commented 2 years ago

If you are using <div contenteditable="true">, here is an easy way to highlight any given line:

https://jsfiddle.net/d_bauer/sgw5yqd1/56/

The fiddle works in Chrome, Firefox, Edge.

dbauer-ets commented 2 years ago

The previous fiddle transforms the textarea into a div contenteditable. I think it works fine.

The following fiddle uses a div behind the textarea to highlight a given line. The content stays in the textarea.

The code is somewhat more complex but not very much. The drawback is that, as it is, the code requires the textarea to be of fixed size to match that of the div behind. This limitation could perhaps be removed by matching dynamically the sizes of the div and textarea.

https://jsfiddle.net/d_bauer/Ladj7rgt/51/

GitHub_20221119_0017

PhilippImhof commented 2 years ago

Yes, that is all pretty straightforward with divs; that's more or less what I had, as well. It's the big pro of this approach. The con, on the other hand, is -- as I mentioned above -- the lack of a reliable way to retrieve the plain text content. One way is innerHTML, but:

for line breaks, some browsers use <br>, some use <p> and others use <div> -- unless it is an empty line, in which case they use `
special characters are replaced by their HTML entities, this includes < and > or &, but also possibly other stuff people use in their comments
some space characters are replaced by  , e.g. all but the last one in a group of subsequent spaces

One can use textContent instead of innerHTML. But then, there are no line breaks, at least not in all browsers. Or one can use innerText. But then, empty lines will give double line breaks in some browsers. There is contenteditable="plaintext-only", but it's not yet widely available.

I don't know for others, but as a user, I would be fuming if the editor screwed up my line breaks or indentation.

I really like your optimism and I am confident that these are all problems that can be solved. But we must keep in mind that the folks behind TinyMCE, ACE, CodeMirror or even Atto have spent considerable amounts of time to circumvent all those limitations. So this is definitely not something we should take lightly at all. We cannot go live with a solution that only works for 95% of the users, especially because in academic institutions (you know that much better than me, probably...) people can not always choose what browser and what version of that browser to use.

So this really needs to be done carefully. I am looking forward to addressing all those issues and coming up with a suggestion soon.

dbauer-ets commented 2 years ago

In the second fiddle, the content stays in the textarea, just as it is now, so there is no problem parsing it.

A div is used behind the textarea only for coloring lines, thus looking like highlighting. Obviously this div doesn't need to be parsed. The code for picking the correct line to be colored is simple and works in all browsers.

So you have the choice. With fiddle 1, the content is transferred to a div and the problems you mentioned for parsing exist.

With fiddle 2, the content stays in the textarea and none of the problems for parsing exists. A div is used behind only for coloring. The resizing I mentioned can be fixed easily as in my very first code. This would be a realistic not just optimistic solution :-), and I wouldn't be surprised if this is how it is done in other editors.

PhilippImhof commented 2 years ago

How could I miss that... Probably seen too many fiddles in the last few days. That would be a solution without line numbers then. Certainly easier to do and to integrate.

PhilippImhof commented 1 year ago

With the progress made on replacing the entire parsing and evaluation engine of the Formulas Question, we will soon have better and more precise error reporting. Almost all error messages will have an indication of the line and column number. So here's my new idea for a very simple and thus very compatible way of solving this issue:

Bildschirmfoto 2023-04-19 um 11 54 34

I have prepared a quick JSFiddle for it. When an error occurs (error checking is done when the input field loses focus in recent versions) the cursor will be automatically placed at the position indicated in the error message.

IMHO, this solution avoids all kinds of problems with multi-browser support and alignment. What do you think about this, @dbauer-ets ?

dbauer-ets commented 1 year ago

I would ask the same questions you asked above on November 16th. Does this solution work if:

a statement is written on several lines (should this be prevented?),
multiple statements are written on the same line,
there are blank lines and comment lines?

PhilippImhof commented 1 year ago

It's a big yes to all questions. The new engine will know precisely where the error happened. Statements can span multiple lines without a problem, even strings will not have to be terminated on the same line anymore. One can have multiple statements on one line, there will even be a shortcut syntax for multiple assignments like a = b = c = 1. Blank lines and comments are allowed and will do no harm at all. As an example, let's take the following:

a = [1,2,3];
b = 1 
     + 3
# comment
     + a

This will throw an error, because we are trying to add numbers and an array. The error message will be 5:8:evaluation error: numeric value expected, got '[...]' and the cursor will be put right before the final a, because that's the "offending" token.

dbauer-ets commented 1 year ago

Wow! Awesome. This is such an improvement and, in my opinion, quite sufficient.

Other improvements are possible but far from essential:

Show line numbers
In addition to positioning the cursor where the error occurred, underline or highlight that position.
Place a flag near the error.

Congratulations for this excellent work!

FormulasQuestion / moodle-qtype_formulas

Line numbering for the Random variables and Global variables fields #52