sanchahar / speedcrunch

Automatically exported from code.google.com/p/speedcrunch
1 stars 0 forks source link

Expressions of type a!^b are invalid #74

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. (1+1)!^2
2.
3.

What is the expected output? What do you see instead?
expected 4 as the result, but the parser could not parse the expression

What version of the product are you using? On what operating system?
0.8-alpha, about 3 days old

Please provide any additional information below.

Original issue reported on code.google.com by ooka...@gmx.de on 17 Jul 2007 at 9:02

GoogleCodeExporter commented 9 years ago
I confirmed the behaviour. Can anyone take the time to try to fix the parser in 
time
for 0.8? If not, I'll have to put it in the 0.9 todo list. Could you try, Wolf?

Original comment by helder.p...@gmail.com on 17 Jul 2007 at 9:23

GoogleCodeExporter commented 9 years ago
Analysis shows, that the parsing problems are buried in the file 
"evaluator.cpp" in
function "Evaluator::compile". This function gets a list of tokens from the
tokenizer, and it takes them, one by one, matching it (together with earlier
encountered, so far not processed tokens) against built-in rules. If a rule 
applies,
it issues a sort of pseudo-code, and removes the processed tokens, if not, it 
appends
the token to a pending list, and defers processing until later tokens tell how 
to
deal with it.
If input grammar would be nice and regular, no such deferring would be 
necessary at
all, and parsers could be simple and straightforward. Unfortunately, mankind 
hasn't
been aware of parsing problems in past centuries, and developed a not so nice 
grammar
for mathematical terms.
However, until a few months ago, the only irregularity SpeedCrunch really had 
to deal
with, was the operator precedence. But new operators like '!' introduced another
class of tokens. They made matters even nastier, because postfix operators 
always
change the meaning of the tokens before. The parser has to defer any processing 
until
it definitely knows, no postfix operator is following. In order to meet these 
new
requirements, I think, the parser has to be redesigned to some extent.
This already tells us, that fixing the parser problem(s) is much more than a
one-liner, and I'm not sure, whether you should really rewrite the parser 
(rules) in
such a late state of release. Because fixing one problem might introduce others
somewhere else. And you cannot catch them without extensive testing.
In order to let expressions like "1!^2" execute properly, I have a makeshift 
patch
ready. It touches about 40 lines of code, though, and makes conceptional 
changes by
reordering and deleting existing rules. It does not eliminate the need of a 
redesign
and does not solve all problems with postfix operators (sqrt 4! will still be
evaluated as (sqrt 4)!, but this is currently broken anyway and worth a 
separate bug
entry). It might well break the parsing of other expressions, I don't know, but
maybe, you will get away with it.
You will find a patched version of evaluator.cpp in the floatnum brunch. I 
leave it
to you, whether you apply it or not.

Wolf Lammen

Original comment by ooka...@gmx.de on 18 Jul 2007 at 3:27

GoogleCodeExporter commented 9 years ago
I'm thinking of rewriting the parser. After all this attempt, I guess the 
better way
is to output AST first and THEN generated the reverse-polish version (i.e. the
opcodes) to be executed by the virtual machine, unlike the current table-based
approach to directly output the opcodes.

This rewrite hopefully will be available along with the RPN support, as they 
kind of
support each other.

Original comment by ariya.hi...@gmail.com on 20 Jul 2007 at 2:47

GoogleCodeExporter commented 9 years ago
Good idea. If user requests demand a quick (and dirty) fix, one can still try to
extend the current parser. This is not too difficult, but requires time for 
testing.

Wolf Lammen

Original comment by ooka...@gmx.de on 20 Jul 2007 at 4:06

GoogleCodeExporter commented 9 years ago
Ariya: could that new parser easily support definition of functions (most 
important,
like "myFunc(x;y)=average(x;y)*0.5" ) and lists for matrix definition and such 
(like
"myMat=[[1;2];[3;4]]", "det(myMat)" or "det([[1;2];[3;4]]") ?

Original comment by helder.p...@gmail.com on 21 Jul 2007 at 4:45

GoogleCodeExporter commented 9 years ago
Wouldn't it be easier to use an automatically generated parser, using 
lexx+bison?

Original comment by shambler...@gmail.com on 22 Jul 2007 at 12:15

GoogleCodeExporter commented 9 years ago
I'm not sure about the percent operator of Speedcrunch, which is treated 
specially.
All other expressions can be handled by bison. IMHO, if possible, one should 
switch
over to shamblers suggestion. Maybe I find some time next week to give it a try.

Wolf Lammen

Original comment by ooka...@gmx.de on 22 Jul 2007 at 9:09

GoogleCodeExporter commented 9 years ago
I'm making no more last hour changes on the parser for this release as it could 
break
something, so let's continue this discussion for 0.9 and I'm totally up for 
Yonatan's
(shambler) proposal (if someone knows how to deal with bison or similar tools).

Original comment by helder.p...@gmail.com on 22 Jul 2007 at 9:56

GoogleCodeExporter commented 9 years ago
Lex is not needed because we have already a working and fast tokenizer. Bison is
quite showing its age and not suitable for C++ (although there is C++ version 
of it).
Probably Lemon is the best possible choice since it's fantastically simple. Of 
course
there are tons of other parser generator.

In general math expression grammar is not too difficult, so it's even possible 
to
write the parser by hand, just like I have done for what we have now. The 
problem  is
because I did not write the parser taking the latest features possibilities in 
mind,
and that's why it's difficult to extend the current parser.

Personally, even with generated parser, I'd like to wait a bit until QLALR gets 
more
polishes and/or even included in Qt.

Original comment by ariya.hi...@gmail.com on 24 Jul 2007 at 5:48

GoogleCodeExporter commented 9 years ago
I already looked at flex and bison, and for the scanner part, I was not so 
impressed.
Essentially it looks for regular expressions. But all regular expressions 
SpeedCrunch
needs so far are very simple, so I don't think the overall effect justifies the
inclusion of the extra library package flex is built upon. 
That does not hold for bison. bison is definitely powerful enough to be of use. 
It's
right what Ariya said, bison parsers expect a C interface, so scanner and 
evaluator
have to be "downgraded" to work with this parser. But the goodies are 
impressive: For
instance, you can have a C++ parser right away. And in general, bison supports 
two
language classes: LALR(1) (is that what QLARL is derived one?) and a speculative
parsing algorithm that claims to deal with a vast subclass of deterministic
context-free (only) languages. That's more than SpeedCrunch needs in the 
foreseeable
future.
If you want to change your grammar, updating a configuration file, is all you 
need to
do. The problem that created this thread would have been fixed and tested in 
less
than an hour. 
So bison is an option (maybe not the best). There is a bison++ version that 
interacts
with C++, but I had no time to look at it.
Wolf Lammen

Original comment by ooka...@gmx.de on 24 Jul 2007 at 8:03

GoogleCodeExporter commented 9 years ago
As I wrote before, we can forget about (f)lex since we have a scanner already.

QLALR is a parser for LALR. It's nice, quite fast and nicely integrated with Qt 
(i.e.
use QString etc). More on QLALR: 
http://labs.trolltech.com/page/Projects/Compilers/QLALR

And if you like bison, 99% you'd like lemon even more.

I guess if somebody wants to continue this, better create a branch where the
replacement code is being worked on. Then we can have different branches for
different approach (hand-written, bison, QLALR, lemon). Decision on which 
back-end
will be used can be determined later after comparing how they work.

Original comment by ariya.hi...@gmail.com on 26 Jul 2007 at 1:24

GoogleCodeExporter commented 9 years ago
In one respect bison is superior to lemon or QLALR, because it can deal with
non-deterministic, non-LALR(1) languages. Unfortunately, C, C++, Pascal are 
examples
of such languages.
For now, and the near future (say a year) any of the above parser generators 
will do.
Wolf Lammen

Original comment by ooka...@gmx.de on 27 Jul 2007 at 6:49

GoogleCodeExporter commented 9 years ago
I think I'm going to give this bison++ a shot, just to create an appropriate 
grammar 
at first. I also think flex shouldn't be ignored since using it will improve 
maintainability by allowing easy introduction of new tokens.

Original comment by shambler...@gmail.com on 30 Jul 2007 at 11:45

GoogleCodeExporter commented 9 years ago
I already checked bison++ out. It is based on an old version of bison, it 
crashes
often, does not support all keywords of bison, lacks some documentation. It 
took me
some time (and guessing) to get a complete parser run.
Actually I don't see a single advantage of bison++ over a modern parser 
generator
like lemon. So I discourage using this program, I deleted that program from my 
hard
disk already.
Wolf Lammen

Original comment by ooka...@gmx.de on 30 Jul 2007 at 5:45

GoogleCodeExporter commented 9 years ago

Original comment by helder.p...@gmail.com on 3 Nov 2007 at 7:41

GoogleCodeExporter commented 9 years ago

Original comment by helder.p...@gmail.com on 6 Jan 2008 at 1:49

GoogleCodeExporter commented 9 years ago

Original comment by helder.p...@gmail.com on 20 Feb 2008 at 10:14

GoogleCodeExporter commented 9 years ago
implemented makeshift solution from July 2007. Still annoying: a function call 
takes
precedence over the postfix operator !, but that would be hard to fix in the 
old parser

Original comment by wolf.lam...@googlemail.com on 21 Mar 2008 at 4:01

GoogleCodeExporter commented 9 years ago

Original comment by helder.p...@gmail.com on 19 Jul 2008 at 6:23

GoogleCodeExporter commented 9 years ago
Extended related issues mentioned above will be handled separately.

Original comment by helder.p...@gmail.com on 12 Oct 2013 at 6:54