ChrisSteinbach / mscgen

Automatically exported from code.google.com/p/mscgen
GNU General Public License v2.0

mscgen echoes the non-printable "\r" and unrecognised symbols. #14

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Severity: low

What steps will reproduce the problem?
Run mscgen on a file that uses "\r\n" (Windows/network) newlines and pipe
the output to a program that makes "\r" visible (e.g. less or hexdump). A
good sample file is testinput09.msc from the tests folder.

Running "fromdos" on the input file removes the problem, and "todos" can be
used to (re-)introduce it.

NB: This is not limited to "\r". Any symbol not covered by a rule in the
flex scanner is echoed as well, e.g. a file containing ".." (and not "...")
will also print the ".." to stdout. The parser does notice this as an
invalid symbol and fails, but the ".." is still printed (after the error
message), which leaves the shell prompt "indented".

What is the expected output? What do you see instead?
Expected: nothing whatsoever.
Seen: a lot of "\r" ("less" displays it as "^M", "hexdump" shows the byte
code "0d").

What version of the product are you using? On what operating system?
mscgen from trunk revision 32.

Please provide any additional information below, including sample input
file:
testinput09.msc

I believe the newline part can be fixed by changing the lex rule:
\n  lex_linenum++;

into this:
[\r\n]|[\r][\n]  lex_linenum++;
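
For illustration only, here is a minimal sketch in plain C (not flex, and not taken from the mscgen sources) of the counting behaviour the proposed rule is aiming for. Since flex always takes the longest match, a "\r\n" pair should hit the two-character alternative and bump lex_linenum once rather than twice. The function name count_line_terminators is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count line terminators in a NUL-terminated buffer,
 * treating "\r\n", a lone "\r" and a lone "\n" as one terminator each. */
size_t count_line_terminators(const char *text)
{
    size_t count = 0;

    assert(text != NULL);
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '\r') {
            /* Consume the '\n' of a "\r\n" pair so it is not counted twice. */
            if (p[1] == '\n')
                p++;
            count++;
        } else if (*p == '\n') {
            count++;
        }
    }
    return count;
}
```

With this helper, a buffer with DOS newlines and the same buffer after "fromdos" should give the same count.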

To avoid echoing other symbols I suggest a catch-all rule for them.

flex's "-s" switch may be useful here, since it suppresses the default rule
and makes the scanner abort on unmatched input.

~Niels

Original issue reported on code.google.com by NThykier@gmail.com on 3 Jul 2009 at 11:34

GoogleCodeExporter commented 9 years ago
On a related note: Unterminated "quoted strings" are also a concern.

The sample:

msc {
        a,b;
        a->b[label="my text];
}

will give the following error:
  Error detected at line 3: syntax error, unexpected 'string', expecting ',' or ']'.

But the true error is that I was missing a " to terminate the label text. What
happens here is that the scanner is forced to reject the "TOK_QSTRING" rule and,
since no other rule matches (starts with) a ", it falls back to the default rule
(printing the " to stdout). It then scans two strings ("my" and "text")
separately, which causes the parser to choke. Had the sample text been a single
word, parsing would have continued without any errors and just printed the " to
the screen.

This particular problem can be fixed by (among other things) inserting a rule:
\"(\\\"|[^\"])* {
    fprintf(stderr, "Unterminated quoted string at line %d.\n", lex_linenum);
    exit(EXIT_FAILURE);
}

(and add #include <stdlib.h> for exit() and the EXIT_FAILURE macro)
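
To make the intent of that pattern concrete, here is a rough sketch in plain C (again not flex, and not mscgen code) of the check it performs: an opening ", followed by any run of escaped quotes or non-quote characters, with no closing " before the input ends. The function name is_unterminated_quoted_string is hypothetical, and the sketch looks at a single NUL-terminated buffer rather than a scanner's input stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `text` begins with a double quote
 * that is never closed. An escaped quote (\") does not close the string. */
bool is_unterminated_quoted_string(const char *text)
{
    assert(text != NULL);
    if (text[0] != '"')
        return false;               /* not a quoted string at all */

    for (const char *p = text + 1; *p != '\0'; p++) {
        if (*p == '\\' && p[1] == '"') {
            p++;                    /* skip the escaped quote */
        } else if (*p == '"') {
            return false;           /* closing quote found */
        }
    }
    return true;                    /* reached end of input without a close */
}
```

Applied to the label from the sample above, starting at the opening quote of "my text];, the check would report an unterminated string, where the current scanner reports an unexpected 'string'.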

~Niels

Original comment by NThykier@gmail.com on 4 Jul 2009 at 4:35

GoogleCodeExporter commented 9 years ago
This issue was closed by r35.

Original comment by Michael....@gmail.com on 5 Jul 2009 at 8:19