Open whooperlove opened 10 months ago
not
is a reserved word for Zephir and PHQL. As such, the parser does not know if this is part of your filed name or an actual word used in the framework.
Change your findFirst
like this:
$notif = Notif::findFirst(['[notif_code] = 0']);
Enclosing the field in square brackets will tell the parser that this is the name of a field and not part of the statement.
Thanks for your feedback. Normally, keywords are separated by spaces. For instance, in cases like 'notice' where 'not' and 'ice' are together without spaces, there shouldn't be an issue. Also, in Phalcon 4, there wasn't such a problem as it is now.
Even though it is a reserved word, it is very troublesome if it is determined by partial match. I also think it's an incorrect implementation.
I don't disagree. PHQL was written 10 years ago and at the time the implementation had difficulties distinguishing between keywords and names of variables/fields. The solution was the square brackets.
These limitations are the reason we (slowly) started introducing the Data Mapper pattern for the database. Still a lot of work left though.
I feel very sorry for everyone working on this, but I would like you to agree that it is incomplete, no matter how many years ago it is.
And, "now" we need to make it known that all column names need to be escaped with [
and ]
, and I think the reason why this issue was raised is because there wasn't enough awareness.
You mentioned that the problem has been present since the implementation of PHQL 10 years ago, but I find two peculiar points. First, when accessing notif_code in the format of a magic method, such as 'Notif::findFirstByNotifCode(0),' no issues seem to arise. Perhaps it internally handles the square brackets [] differently? Second, as mentioned in the previous comment, there were no problems with the same code in Phalcon version 4. My project was created before this issue emerged, and it only occurred after applying Phalcon version 5, making me aware of the need for reserved word escaping.
The reason these bugs arise or if you like difference in behavior, is the changes in PHP and their internal engine. We all had to adjust to new functionality and adjusted our methods. This one appeared with PHP 8 it seems.
Note that you don't need to enclose all your field names with square brackets. The ones prefixed with not
are the most common ones (note
, notes
, notifications
etc.)
I don't know C so I cannot dive in and diagnose and potentially fix this. If anyone does and can help by all means, we welcome such help. In the meantime the brackets are the workaround or "feature" :)
Note that you don't need to enclose all your field names with square brackets.
No, as a workaround today, isn't it sure to take measures without omission?
I haven't tried it, but wouldn't it target all words that follow normal SQL reserved words such as or
, and
, select
, delete
?
Also, I don't think there will be a problem if the column name that seems to be a problem this time is passed directly by PDO, so maybe this is also a problem with Zephir?
The problem is in the generated code that builds the extension. Whether that is Zephir or the pure C code that handles the PHQL queries I do not know.
I have not seen any submissions for fields starting with delete
say deletedTime
, only the not
ones.
I took a look at scanner.c, but to be honest, I don't really understand it. Why is the processing different for "AS" and "NOT", and why is there a difference between "X" and "x" even though I thought it was case insensitive...
@s-ohnishi I don't know about phalcon internals. but I think you don't have to read scanner.c file. It seems scanner.c file was auto-generated from scanner.re file in the same directory. scanner.re file is readable code.
@s-ohnishi I just found that scanner.c was auto-generated from scanner.re file by using re2c program. The following text is about re2c.
===== RE2C is a lexer generator designed to process regular expressions and generate C code for recognizing and handling tokens. However, it does not provide an explicit feature for optimizing regular expressions. Instead, the main goal of RE2C is to generate simple and efficient C code.
Key features of RE2C include:
Performance: The code generated by RE2C is highly efficient and fast, providing high-performance recognition of regular expressions.
Flexibility: RE2C supports various regular expression syntaxes and can be used in multiple programming languages.
Ease of use: Defining lexer code using a simple regular expression syntax and a few special commands is straightforward.
While RE2C does not offer explicit options for optimizing regular expressions, it is designed to automatically generate optimized code. If specific optimization for a particular regular expression is needed, developers typically modify or fine-tune the RE2C code directly.
@whooperlove Thank you very much.
I understand that re2c
converts regular expression rules into concrete code.
I think that case sensitivity is specified by the option --case-insensitive
, but why is it that some of the generated code is not so?
I'm not sure if scanner.re
is wrong or if re2c
is having the problem.
And my doubts increased.
Why is there a difference between BETWEEN
and NOT BETWEEN
?
Even though PHQL_T_NOTIN
is defined for NOT IN
, why is it not written in scanner.re
?
I think NOT EXISTS
is used in a similar way, but why is it completely ignored?
If NOT IN
and NOT EXISTS
do not need to be distinguished as tokens, shouldn't NOT BETWEEN
be the same?
@s-ohnishi for your information, it seems that scanner.re and scanner.c are unrelated to the problem. (I modified scanner.re by removing the 'NOT' block and regenerated the scanner.c file by running 're2c -o scanner.c scanner.re'. After building and installing the new phalcon.so, when I run the test code that previously resulted in the tif_code error, the same error occurs with the new phalcon.so.)
@whooperlove That's an interesting result. It's a shame that I can't experiment with it at hand.
For example, even if I delete everything such as NOT BETWEEN
and NOT
, there will be no problem as long as there is no NOT
in the SQL, so I would like to try various things.
the same error occurs with the new phalcon.so.
Maybe I'm thinking the wrong way.
Even though the token separation is working fine, something may be happening elsewhere.
If that's the case, the mysterious parts such as scanner.re
and scanner.h
are probably irrelevant.
regenerated the scanner.c file by running 're2c -o scanner.c scanner.re'
I'm also interested in how much the generated scanner.c
has changed.
@whooperlove I tried it too.
I created a debian:bookworm container with Docker and compiled it from the v5.6.0 source.
I tested simple-mvc with a little modification.
When I add a notice column to the Products table and execute $prod = Products::find('notice=1');
,
Column 'ice' doesn't belong to any of the selected models (1), when preparing: SELECT [Products].* FROM [Products] WHERE notice=1
was displayed.
After that, I modified ext/phalcon/mvc/model/query/scanner.re as follows.
Removed NOT BETWEEN
clause.
Although this is not the case, the regular expression for hexadecimal integers is strange, so I changed it to the following.
HINTEGER = [0][x][0-9A-Fa-f]+;
HINTEGER {
token->value = estrndup(q, YYCURSOR - q);
token->len = YYCURSOR - q;
token->opcode = PHQL_T_HINTEGER;
q = YYCURSOR;
return 0;
}
(In [x0-9A-Fa-f]+
, 0xxxx is also considered a hexadecimal integer, right? However, I am not sure if this description is appropriate.)
After that, I generated scanner.c with re2c, compiled it again, and tried it, and it ran without any problems.
I don't know why there was no change in your test, but I think it's because the NOT BETWEEN
clause is getting in the way, or because re2c is up to date (the one I used was re2c3.2).
@niden Could you please try removing the NOT BETWEEN
clause from scanner.re (by changing the HINTEGER regular expression if possible)?
I think that, due to our lack of knowledge in solving this issue, it might be better to include a warning message in the official documentation, specifying certain reserved keywords including 'not'. This warning would advise using square brackets for escaping when accessing columns with names containing the reserved keywords. for example, at https://docs.phalcon.io/latest/db-models/#findfirst, "Please escape column name with square brackets if it contains reserved keywords like 'not', ..."
@niden @whooperlove
I agree that users use [ ] as a precaution, but I disagree that the system cannot function without [ ].
This is clearly different from the original behavior of SQL.
And there is no reason why NOT BETWEEN
should be treated specially as a token, so scanner.re
should be modified.
(HINTEGER
should also be fixed)
Just run into this issue and wish to bump its importance. +1 to fixing it asap.
Migrating from Phalcon 3 to Phalcon 5 with queries like this:
'conditions' => 'publisherId = :publisherId: AND noteId = :noteId:',
And hilarious errors:
Column 'eId' doesn't belong to any of the selected models (1)
What? At least I laughed a bit when found out what's going on. But newcomers will be lost.
At least I laughed a bit when found out what's going on.
I really think so.
In my test, I was able to solve the problem by modifying scanner.re
, so I would like someone to verify this.
+1 Anyone who has used Phalcon for an extended period has likely encountered this issue at least once. It might be beneficial to include an example of this specific problem in the documentation or resolve the issue and incorporate this scenario into the unit tests.
It looks like the developers are focusing more on v6 than v5. Since I can't prepare a v5 development environment, I can't issue a specific PR for files under ext/ or .zep files. So I know I have to wait until they have enough time.
I've presented a proposal that I think will solve the problem, and I hope the developers will start working on it soon.
I think there is a consensus that the strange specification that requires escaping of all column names that may be mistaken for reserved words should be fixed.
Since Zephir has also been updated, I hope that scanner.re
will also be fixed and reflected in the fixed version of v5.7.0 (which contains incomplete parts).
Questions? Forum: https://phalcon.io/forum or Discord: https://phalcon.io/discord
Describe the bug I encountered an error when passing a search query with the string 'notif_code' as an argument to the findFirst method of the Phalcon\Mvc\Model object. The error message is as follows:
"Column 'if_code' doesn't belong to any of the selected models (1), when preparing: SELECT [Notif].* FROM [Notif] WHERE notif_code = 5 AND account_id = :account_id: AND other_account_id = :other_account_id: LIMIT :APL0:"
Since the actual column 'if_code' does not exist in the table, it seems that within the implementation of the findFirst method, there might be a bug where the string 'notif_code' is modified, possibly removing the 'not' keyword for processing in SQL. As a result, 'notif_code' may have been transformed into 'if_code,' leading to the passing of a nonexistent column name 'if_code' in the generated low-level SQL statement.
To Reproduce If you add a column containing the word 'not' to any table and specify it as a search condition in the findFirst method, you will notice an attempt to access the column name with the 'not' removed. For example, the column name 'notification' would be recognized as 'ification' within the findFirst method.
Expected behavior The error, 'Column 'if_code' doesn't belong to any of the selected models,' should not occur.
Details