RussellEngland / php-sql-parser

Automatically exported from code.google.com/p/php-sql-parser
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Position for CLAUSES #151

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Parse any SQL query
2. Clauses that appear as KEYs in the array are not treated as NODES.
They neither have positions nor anything else. 

What is the expected output? What do you see instead?

I'm trying to mark all tokens in another string that simplifies the query, but 
major CLAUSES like SELECT and WHERE don't support position or other 
information. They should be treated like other nodes.

What version of the product are you using? On what operating system?
OS X Yosemite php-sql-parser-20140108

Please provide any additional information below.

Original issue reported on code.google.com by abi...@owasp.org on 25 Sep 2014 at 9:05

GoogleCodeExporter commented 8 years ago
You can try to use the existing positions and look forward/backward from the 
known position to find the keyword. I'll set this as enhancement, it is a 
larger change. I have to modify the array a lot (additional levels), which can 
have side-effects on existing client code.

Original comment by pho...@gmx.de on 26 Sep 2014 at 3:31

GoogleCodeExporter commented 8 years ago
There is an SQLChunkProcessor class, where you can find all the keywords, which 
are not part of the calculator. So the first thing you can do, is to enhance 
this class, and add a new level just below each of the keywords like

if (!empty($out['SELECT'])) {
            $processor = new SelectProcessor();
            $out['SELECT'] = array('expr_type'=>ExpressionType::RESERVED, 'base_expr'=>'', 'sub_tree'=>$processor->process($out['SELECT']));
        }

You always need an expr_type, which is used in the PositionCalculator to differ 
between type with/without backtracking. The base_expr contains the original 
string of the statement. So maybe it is better to enhance the processors 
directly (you can have 'select' or 'SeLeCt' or 'SELECT'). In the sub_tree 
section, you should add the rest of the parsed statement part.

But there are side-effects. In example the top-level SELECT key is used in 
other Processors, so you have change these things to access the right new 
sub-level ([SELECT][0][sub_tree] instead of [SELECT]).

I think, there is a lot to do, maybe you should think about the other solution:

1. hold the current position
2. if you get a keyword index, make a strpos(strtoupper($index), $curPos) to 
find the next matching position
3. in all cases use the [position] and store it always in the $curPos

Original comment by pho...@gmx.de on 26 Sep 2014 at 5:49

GoogleCodeExporter commented 8 years ago
2. must be strpos(strtoupper($statement), $index, $curPos)

The $index should always be in uppercases.

Original comment by pho...@gmx.de on 26 Sep 2014 at 5:58

GoogleCodeExporter commented 8 years ago
It can be necessary to add the strlen([base_expr]) to the $curPos, just before 
you search the index name. The [position] gives you always the start of the 
[base_expr], so if you store the top-level position, you have to add the length 
of the statement part associated with it. This allows you to start the search 
right after [base_expr] and there should be only some withespace before the 
index keyword starts.

Original comment by pho...@gmx.de on 26 Sep 2014 at 6:08

GoogleCodeExporter commented 8 years ago
You got me totally confused with these comments.
All I need is to have base_expr and position entries in the high-level keywords 
too, just like the rest of the nodes. 
-A

Original comment by abi...@owasp.org on 26 Sep 2014 at 6:48

GoogleCodeExporter commented 8 years ago
But to be conform with the rest of the parser output, i think you should insert 
a new level below SELECT. 

This has side effects to the position calculator and some other parts of the 
parser. So i think that it could be simpler to calculate the positions of the 
keywords like WHERE or UPDATE with the already existing positions. 

In example: you get the position of the whole FROM clause. The next clause is 
the WHERE clause. Now you have two possibilities. You can use the position of 
the FROM clause, add the length of the FROM base_expr and start from this 
position the search for "Where". Or you can get the first position under the 
WHERE key and looks backward for "Where".

Original comment by pho...@gmx.de on 26 Sep 2014 at 8:51

GoogleCodeExporter commented 8 years ago
I don�t need it to conform to the rest of the parser as long as it satisfies my 
need :D Can you let me know where I should change to achieve that?
-A

Original comment by abi...@owasp.org on 26 Sep 2014 at 8:57

GoogleCodeExporter commented 8 years ago
I tried a lot of traversing, it doesn't seem to work. For example, in the case 
of a UNION, the first subtree is before it and the second after it! It made my 
code real nasty and I bet it would stop working at a lot of points.

So now I am planning to modify PositionCalculator.php to add 'position' to all 
entries, not just those that already have it, but I can't seem to figure out 
what its planning. I don't see where it skips SELECT and UNION and similar 
clauses, so that I can add it there.

Thanks for the help,
-A

Original comment by abiusx on 7 Oct 2014 at 6:55