Closed michdingpayc closed 2 years ago
@michdingpayc Can you explain what you feel should be the behavior here? For the case of a null value, should all encoding types result in an empty string, or should they throw exceptions?
With that question answered, we can triage and determine if/how we will proceed.
The reason why null
still works as input us BC: if it were possible, I would totally enforce strictly requiring string
as input 🤔
Meanwhile, are you able to write a test case that highlights the issue with different encodings?
Thank you both for the stupendously quick responses.
It seems my project unconsciously relied on the behavior in which it returned an empty string for ISO-8859-1. Now, migrating the project for UTF-8 encoding, the inconsistency introduces a regression. With this in mind, my feeling is that the desired behavior is for a null value to escape to an empty string because this matches my previous unconscious assumption.
After some further digging, it seems the string
type is strictly enforced on the escaping methods, as of 2.8.0. This renders the issues fixed as of release 2.8.0.
You will please forgive me for asking before finding this information. However, it may be good to document this backwards incompatible, introduced in a commit here.
To get an idea of the inconsistency, you may add the following snippet to EscaperTest.php
from the 2.7.0 release:
/**
* I am simply recycling the provider used above.
* This should succeed at least once, for when we compare UTF-8 against UTF-8.
*
* Note, as of release 2.8.0, these methods require strict typing of string.
* ref - https://github.com/laminas/laminas-escaper/commit/b877686c78c836ecef905c943595bc30878774fe
*/
public function testNullReturnsSameValueForUTF8AndOtherSupportedEncodings()
{
foreach ($this->supportedEncodings as $encoding) {
$utf8Escaper = $this->escaper;
$nonUtf8Escaper = new Escaper($encoding); // in one data case, this will also be a UTF-8 escaper
$this->assertEquals($utf8Escaper->escapeCss(null), $nonUtf8Escaper->escapeCss(null));
$this->assertEquals($utf8Escaper->escapeHtml(null), $nonUtf8Escaper->escapeHtml(null));
$this->assertEquals($utf8Escaper->escapeHtmlAttr(null), $nonUtf8Escaper->escapeHtmlAttr(null));
$this->assertEquals($utf8Escaper->escapeJs(null), $nonUtf8Escaper->escapeJs(null));
$this->assertEquals($utf8Escaper->escapeUrl(null), $nonUtf8Escaper->escapeUrl(null));
}
}
@michdingpayc thanks for investigating nonetheless! 👍
Dear Laminas Team,
When passing a null value to the escaper with its encoding set to UTF-8, the user will see an exception thrown. When passing a null value to the escaper with its encoding set to ISO-8859-1, the user will receive an empty string.
I suspect this has to do with
Escaper::toUtf8()
method.It converts to UTF-8 if the given encoding is not UTF-8. If the given encoding is not UTF-8, it does not conversion. Then after any conversion, it checks if the result is valid UTF-8. It does not consider null to be UTF-8 and will throw an exception if the result is null.
However, it uses the
Escaper::convertEncoding()
method to perform the conversion if the given encoding is not UTF-8. This method relies onmb_convert_encoding
which (when givennull
and"UTF-8"
and"ISO-8859-1"
) will convert the null value to an empty string.I suspect similar behavior for other encodings being converted to UTF-8 by
mb_convert_encoding
.Will you please investigate and resolve the inconsistency?