Open WangYihang opened 2 months ago
Additionally, I believe it would improve debugging if all non-printable characters were consistently marked for escaping.
If there is agreement, I would be happy to submit a pull request to implement this improvement.
I'd say go ahead and submit a PR :)
It seems there’s already a related pull request [1-2] that attempted to address this issue. However, using php_addcslashes to quote all non-printable characters isn’t ideal, as it requires a long string to specify the characters to be quoted.
I propose two potential solutions:
Solution I: Create a new function. Develop a dedicated function similar to Python’s repr, specifically designed to handle all non-printable characters. While there is a var_representation [3] function that appears to do something similar, its implementation is not present in the php-src codebase, so we cannot reuse it.
Solution II: Extend php_addcslashes_str. Modify the php_addcslashes_str function to make the flags variable more flexible, allowing it to automatically handle non-printable characters. This approach minimizes changes to the existing codebase while enhancing functionality.
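To make Solution I concrete, here is a minimal standalone sketch of a repr-style escaper in plain C. It is not the code from any php-src pull request: the name escape_non_printable, the fixed-size output buffer, and the choice to emit every byte outside printable ASCII as \xNN (including bytes >= 0x80) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of php-src): writes an escaped copy of
 * src[0..len) into out, which has room for cap bytes including the
 * trailing NUL. Returns the number of bytes written (excluding the NUL),
 * or (size_t)-1 if out is too small. */
size_t escape_non_printable(const char *src, size_t len, char *out, size_t cap)
{
    static const char hex[] = "0123456789abcdef";
    size_t o = 0;

    assert(src != NULL && out != NULL);
    if (cap == 0) {
        return (size_t)-1;
    }

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        char buf[4];
        size_t n;

        switch (c) {
        case '\n': buf[0] = '\\'; buf[1] = 'n';  n = 2; break;
        case '\r': buf[0] = '\\'; buf[1] = 'r';  n = 2; break;
        case '\t': buf[0] = '\\'; buf[1] = 't';  n = 2; break;
        case '\\': buf[0] = '\\'; buf[1] = '\\'; n = 2; break;
        case '"':  buf[0] = '\\'; buf[1] = '"';  n = 2; break;
        default:
            if (c >= 0x20 && c < 0x7f) {
                /* Printable ASCII is copied through unchanged. */
                buf[0] = (char)c;
                n = 1;
            } else {
                /* Assumption: everything else becomes \xNN. */
                buf[0] = '\\';
                buf[1] = 'x';
                buf[2] = hex[c >> 4];
                buf[3] = hex[c & 0x0f];
                n = 4;
            }
            break;
        }

        /* Keep one byte free for the trailing NUL. */
        if (o + n >= cap) {
            return (size_t)-1;
        }
        memcpy(out + o, buf, n);
        o += n;
    }

    out[o] = '\0';
    return o;
}
```

With this sketch, no character list has to be passed in, which is the drawback of the php_addcslashes approach mentioned above. A real implementation inside php-src would more likely build on the engine's own string types rather than a caller-supplied buffer.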
@devnexen Which of these approaches do you think would be more suitable, or is there another direction you would recommend for implementing this functionality? @SerafimArts @dstogov
Implemented and passed tests in https://github.com/php/php-src/pull/15730 (Using Solution I), please kindly review, thanks for your time.
Which of these approaches do you think would be more suitable, or is there another direction you would recommend for implementing this functionality?
@WangYihang My pull request was aimed at providing the ability to programmatically read such output without parsing mistakes/errors.
I was implementing a language server plugin that shows the bytecode PHP generates next to the code of functions and classes, which lets you optimize time-critical sections of code by trial and error right in the IDE (since the opcode result is immediately visible, including optimization steps). And I simply couldn't find any simple solution other than parsing this output :)
That is, at that time it was not critical to me whether there were line breaks or not, since this does not interfere with the programmatic parsing of the opcode output. The main thing is that it starts with a quotation mark and ends with one.
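As an illustration of why line breaks do not get in the way of such a parser, here is a minimal C sketch that locates the end of a quoted literal. It assumes the dump escapes the double quote and the backslash with a backslash; find_closing_quote is a hypothetical name, not something taken from the plugin or from php-src.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: s[start] must be the opening '"' of a literal.
 * Returns the index of the matching closing quote, or -1 if start does
 * not point at a quote or the literal is unterminated. */
long find_closing_quote(const char *s, size_t len, size_t start)
{
    assert(s != NULL);
    if (start >= len || s[start] != '"') {
        return -1;
    }

    for (size_t i = start + 1; i < len; i++) {
        if (s[i] == '\\') {
            i++;            /* skip the byte that follows a backslash */
            continue;
        }
        if (s[i] == '"') {
            return (long)i; /* first unescaped quote ends the literal */
        }
        /* Any other byte, including a raw newline, is literal content. */
    }
    return -1;
}
```

The scanner only cares about quotes and backslashes, so a raw newline or tab inside the literal is passed over like any other byte.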
Description
While working with the zend_dump_op_array function in PHP's Zend Engine, I noticed that non-printable characters in string literals are not always represented in their escaped (readable) form. Currently, the function escapes only a limited set of characters by calling:
This can result in the direct output of characters such as newlines, tabs, and null bytes, which might make the dump less clear and harder to debug.
For example, for the following php code:
Current Output:
Expected Output:
I believe that modifying the function to escape all non-printable characters would enhance the clarity of the output. This could be achieved by updating the call to:
With this change, the output might look like:
Proposal:
I propose a small enhancement to the zend_dump_op_array function to ensure that all non-printable characters are properly represented in the output. This adjustment would improve readability and make debugging easier, without significantly altering the existing functionality.
Benefits:
Enhanced Readability: Ensures that non-printable characters are clearly represented, making the output easier to understand.
Improved Debugging: Developers can quickly identify and interpret all characters within string literals.
Minor Change: The proposed adjustment is a minimal change to the existing codebase but offers significant benefits.
I would greatly appreciate feedback from the community on this suggestion.
References