greenlion / PHP-SQL-Parser

A pure PHP SQL (non validating) parser w/ focus on MySQL dialect of SQL
BSD 3-Clause "New" or "Revised" License
603 stars 156 forks source link

A bug with the Ref Type builder for USING clause; Easy Fix #270

Closed infinite-system closed 1 year ago

infinite-system commented 6 years ago

Spotted a bug with the Ref Type builder for USING clause `class RefTypeBuilder {

public function build($parsed) {
    if ($parsed === false) {
        return "";
    }
    if ($parsed === 'ON') {
        return " ON ";
    }
    if ($parsed === 'USING') {
        return " USING ";
    }
    // TODO: add more
    throw new UnsupportedFeatureException($parsed);
}

}`

return " USING "; should not have space at the end, it should be return " USING";

Because the parser cannot parse USING (Column) it can only do USING(Column)

infinite-system commented 6 years ago

Actually, I just realized that the mistake is with the parser, it should be able to parse USING (Column) but it fails, because I was building using the Builder and then parsing, it failed, so fixing the builder fixed the parsing, but the mistake is not in builder, because SQL syntax allows USING (Column) with space after USING, so the parser should account for possible space after USING, but it does not.

infinite-system commented 6 years ago

For sample query : SELECT activity_log.ActivityId,activity_log.PageId,activity_log.Action,activity_log.Location,activity_log.Ip,activity_log.OccuredTimeStamp FROM activity_log JOIN pages USING (PageId)

Parser gives error: cannot calculate position of within activity_log JOIN pages USING (PageId) and then it cannot parse PageId into ref clause, that ref clause is empty.

infinite-system commented 6 years ago

Ok I found the parser mistake and made a fix: in File /processors/FromProcessor.php UPDATE: i use another solution now, see below comment

`protected function processFromExpression(&$parseInfo) { $res = array();

    // exchange the join types (join_type is save now, saved_join_type holds the next one)
    $parseInfo['join_type'] = $parseInfo['saved_join_type']; // initialized with JOIN
    $parseInfo['saved_join_type'] = ($parseInfo['next_join_type'] ? $parseInfo['next_join_type'] : 'JOIN');

    // we have a reg_expr, so we have to parse it
    if ($parseInfo['ref_expr'] !== false) {
        // *************** space after USING fix  *********************
        // trim the space after USING before bracket
        $trimmed_ref_expr = trim($parseInfo['ref_expr']);
        $unparsed = $this->splitSQLIntoTokens($trimmed_ref_expr);

        // here we can get a comma separated list
        foreach ($unparsed as $k => $v) {
            if ($this->isCommaToken($v)) {
                $unparsed[$k] = "";
            }
        }
        if ($parseInfo['ref_type'] === 'USING') {
            // unparsed has only one entry, the column list
            $ref = $this->processColumnList($this->removeParenthesisFromStart($unparsed[0]));

            $ref = array(array('expr_type' => ExpressionType::COLUMN_LIST, 'base_expr' => $unparsed[0], 'sub_tree' => $ref));
        } else {
            $ref = $this->processExpressionList($unparsed);
        }
        $parseInfo['ref_expr'] = (empty($ref) ? false : $ref);
    }

    // there is an expression, we have to parse it
    if (substr(trim($parseInfo['table']), 0, 1) == '(') {
        $parseInfo['expression'] = $this->removeParenthesisFromStart($parseInfo['table']);

        if (preg_match("/^\\s*select/i", $parseInfo['expression'])) {
            $parseInfo['sub_tree'] = $this->processSQLDefault($parseInfo['expression']);
            $res['expr_type'] = ExpressionType::SUBQUERY;
        } else {
            $tmp = $this->splitSQLIntoTokens($parseInfo['expression']);
            $unionProcessor = new UnionProcessor();
            $unionQueries = $unionProcessor->process($tmp);

            // If there was no UNION or UNION ALL in the query, then the query is
            // stored at $queries[0].
            if (!empty($unionQueries) && !UnionProcessor::isUnion($unionQueries)) {
                $sub_tree = $this->process($unionQueries[0]);
            }
            else {
                $sub_tree = $unionQueries;
            }
            $parseInfo['sub_tree'] = $sub_tree;
            $res['expr_type'] = ExpressionType::TABLE_EXPRESSION;
        }
    } else {
        $res['expr_type'] = ExpressionType::TABLE;
        $res['table'] = $parseInfo['table'];
        $res['no_quotes'] = $this->revokeQuotation($parseInfo['table']);
    }

    $res['alias'] = $parseInfo['alias'];
    $res['hints'] = $parseInfo['hints'];
    $res['join_type'] = $parseInfo['join_type'];
    $res['ref_type'] = $parseInfo['ref_type'];
    $res['ref_clause'] = $parseInfo['ref_expr'];
    $res['base_expr'] = trim($parseInfo['expression']);
    $res['sub_tree'] = $parseInfo['sub_tree'];
    return $res;
}`
infinite-system commented 6 years ago

Another solution that i am using now to keep the rest of parser intact is not to trim, but change the `if ($parseInfo['ref_type'] === 'USING') { // unparsed has only one entry, the column list $ref = $this->processColumnList($this->removeParenthesisFromStart($unparsed[0]));

        $ref = array(array('expr_type' => ExpressionType::COLUMN_LIST, 'base_expr' => $unparsed[0], 'sub_tree' => $ref));
    } else {
        $ref = $this->processExpressionList($unparsed);
    }`

to this

` if ($parseInfo['ref_type'] === 'USING') {

            $ref = $this->processColumnList($this->removeParenthesisFromStart($parseInfo['ref_expr']));
            $ref = array(array('expr_type' => ExpressionType::COLUMN_LIST, 'base_expr' => $parseInfo['ref_expr'], 'sub_tree' => $ref));
        } else {
            $ref = $this->processExpressionList($unparsed);
        }

`

Because USING expects from comment: // unparsed has only 1 entry, the column list we can pass the whole ref_expr without tokenizing it and then passing unparsed[0] we pass the whole $parseInfo['ref_expr']

this could be a cleaner solution as i do not know all implications of trimming before tokenizing

What do you think? @witchi