Closed pako81 closed 1 year ago
I don't think this is the right approach. What if the file has a different encoding? The second iconv
will still fail and we'll have the same problem.
Maybe we should send the raw content (without the iconv
), consider the file as read-only (to prevent issues with saving the file with a different encoding) and show an error about an unexpected encoding.
I've tried sending the raw file: The texteditor window remains blank, and owncloud hangs. Multiple decode attempts would make more sense to me.
The situation is, that mb_detect_encoding() returns something wrong.
Thus the subsequent iconf is likely to fail then(). Instead of a hard fallback to ISO-8859-15, we could at least try all the encodings one by one, that we have in the list for mb_detect_encoding() the first one that does not return false, is probably right.
This reduces the chance of permanent failure. If all attempts return false, I'd run a final \iconv("ISO-8859-15", "UTF-8//TRANSLIT", $fileContents);
Yes, passing the raw file also did not work for me.
Instead of a hard fallback to ISO-8859-15, we could at least try all the encodings one by one, that we have in the list for mb_detect_encoding() the first one that does not return false, is probably right.
Is \mb_detect_encoding
not already going through the whole list of possible encodings passed at https://github.com/owncloud/files_texteditor/blob/master/controller/filehandlingcontroller.php#L153 ?
Quoting https://www.php.net/manual/en/function.mb-detect-encoding.php: A list of character encodings to try, in order. The list may be specified as an array of strings, or a single string separated by commas.
Maybe rather setting strict
to false
so that the closest encoding will be returned? Not sure about any side effects however.
Or we just throw an error message when trying to open a file which could not be encoded, something like: return new DataResponse(['message' => (string)$this->l->t('File encoding could not be detected.')], Http::STATUS_BAD_REQUEST);
From my point of view, the best option is:
@jvillafanez implemented your suggestion. Not sure this is exactly what you meant.
Yes, that's the idea.
This should do the same, and it's more compact.
if ($encoding !== 'UTF-8') {
$writeable = false;
$fileContents = \iconv($encoding, "UTF-8", $fileContents);
}
if ($fileContents !== false) {
return new DataResponse([
'filecontents' => $fileContents,
'writeable' => $writable,
'locked' => $activePersistentLock ? $activePersistentLock->getOwner() : null,
'mime' => $mime,
'mtime' => $mTime
], Http::STATUS_OK);
} else {
return new DataResponse(['message' => (string)$this->l->t('Cannot convert the encoding to UTF-8.')], Http::STATUS_BAD_REQUEST);
}
@jvillafanez thx, done.
Kudos, SonarCloud Quality Gate passed!
Yes, passing the raw file also did not work for me.
Instead of a hard fallback to ISO-8859-15, we could at least try all the encodings one by one, that we have in the list for mb_detect_encoding() the first one that does not return false, is probably right.
Is
\mb_detect_encoding
not already going through the whole list of possible encodings passed at https://github.com/owncloud/files_texteditor/blob/master/controller/filehandlingcontroller.php#L153 ?Quoting https://www.php.net/manual/en/function.mb-detect-encoding.php:
A list of character encodings to try, in order. The list may be specified as an array of strings, or a single string separated by commas.
Yes, mb-detect-encoding() is documented to do that. But in 7.4 it is broken. That Is why I suggest to brute force all the encodings from the list in a loop with iconv().
The suggestion from @jvillafanez a) does not make any further attempts, when the first iconv fails. b) suffers from #411
you are missing unit tests .... people please .... https://www.reddit.com/r/learnprogramming/comments/tg9yog/why_write_unit_tests/
In some cases,
\iconv
may returnfalse
because of wrongly detected encoding. This messes up with the file content. We now set the encoding back to default ISO-8859-15 (after retrieving again the node content) in case of encoding being wrongly detected.Related issue: https://github.com/owncloud/files_texteditor/issues/405