Closed fmkaiser closed 1 year ago
@fmkaiser thanks for the report. Not sure if error handling is really needed, we can eventually just set the encoding back to default (after retrieving again the node content) in case \iconv
returns false
because of wrongly detected encoding.
So something like this after https://github.com/owncloud/files_texteditor/blob/master/controller/filehandlingcontroller.php#L158:
// safety net in case \iconv returns false because of wrongly detected encoding
if ($fileContents === false) {
$fileContents = $node->getContent();
$fileContents = \iconv("ISO-8859-15", "UTF-8", $fileContents);
}
This seems to make files_texteditor to correctly deal with viele_gruesse_ansi.txt
@jvillafanez @jnweiger what do you think?
As far as I can see, that is a bug in php7 mp_detect_encoding(). I've made a suggestion in https://github.com/owncloud/files_texteditor/pull/406#issuecomment-1803839122
@fmkaiser https://github.com/owncloud/files_texteditor/pull/406 has been just merged. It will be included in next files_texteditor app version.
One of our users reported that when opening some files saved under windows, "false" is returned as the content in the text editor:
It appears that if the encoding was not detected correctly before, the
iconv()
call incontroller/filehandlingcontroller.php
can fail, returning false, which is currently passed as the file content to the text editor.This is especially problematic when users then save the file, overwriting the original content.
With
PHP 7.4.33
this can be reproduced using a file with content "Viele Grüße" in WINDOWS-1252 (aka ANSI) encoding generated like this:viele_gruesse_ansi.txt
mb_detect_encoding
wrongly classifies the encoding asEUC-CN
, for which the iconv() call then fails withThe wrongly detected encoding may be a PHP issue, for comparison under PHP 8.1.2 (Ubuntu 22.04) the encoding is detected correctly as Windows-1252.
But given that ownCloud still requires PHP 7 it might be prudent to introduce better error handling e.g.
It might also be worth trying to decode again with the above defined ISO-8859-15 default encoding first.
If you think that makes sense, I am happy to submit a pull request.
Edit: Forgot to mention, this affects at least versions 2.5.2 and 2.6.0 of files_texteditor.
The problem was likely introduced or excaberated when the additional encodings were added in https://github.com/owncloud/files_texteditor/commit/d70f6c5d9d559b4f2cda8dbc95c74ecacaee8184