Closed buuug7 closed 4 years ago
Duplicate of #2603 but this is a better issue report so any discussion will be here instead
@buuug7 while your suggestion of the URL generator is appreciated, I want to add extra dependencies only as a last resort. How does Laravel handle unicode file name issues? What can we do with the dependencies that we already have available to us? What are the concerns with saving non-latin characters to the filesystem?
Some of (not so recent) browsers on very popular OS still have trouble displaying images with spaces in URIs so it would be great to at least slugify filenames for uploaded Media Manager files - Laravel's built-in Str::slug works fine for that purpose with all Latin languages.
@Eoler agree
@Eoler @buuug7 what about for non-latin characters? I'm fine with the slugifying, would prefer a solution that keeps the unicode characters present if possible without adding extra dependencies
Don't know how are non-latin characters translated to slugs, maybe @buuug7 can elaborate. Also, not sure I really wan't my CMS users to upload media files with Unicode names...
@LukeTowers @Eoler , Unicode contains more characters than ASCII and Latin, it represent future. For example, Utf8 is the most widely used on the Internet a Unicode implementation approach, which is designed for transmission codes, and makes the code without borders, so you can show that the all cultural characters in the world.
@LukeTowers Laravel Str::slug only works with Latin laguage. For brevity, it does not provide additional Unicode support. if a developer needed Unicde support unless he find a solution. this keep laravel framework lighter weight.
@LukeTowers There is no need to implementation with october. there is no need reinvent the wheel. Besides, it takes extra effort to maintain this feature, and I think it's necessary to use the third-party packages.
@buuug7 Str::ascii()
exists in Laravel, can we use that instead?
@LukeTowers tested, seems it not work.
Str::ascii('Hello Wörld!'); // "Hello World!"
Str::ascii('Καλημέρα'); // "Kalhmera"
Str::ascii('фильм'); // "film"
Str::ascii('富士山'); // ""
Str::ascii('富士山', 'zh-CN'); // ""
Str::ascii('國語'); // ""
Str::ascii('國語','zh-TW'); // ""
@buuug7 is the dependency you suggested going to introduce new system requirements for October? https://github.com/ausi/slug-generator/blob/master/composer.json? They require:
"ext-intl": "*",
"ext-mbstring": "*",
"ext-pcre": "*"
Also, would it make more sense to load this dependency into the October Library and then utilize it within October's Str::slug method?
@LukeTowers yes, load this denpendency into October Library and packaged with Str::slug
method is very good. i try it.
In Backend\Widgets\MediaManager.php file I have change regex with desired Unicode range. And it does work.
protected function validateFileName($name)
{
// added our language's unicode in regex
if (!preg_match('/^[\p{Bengali}0-9a-z@\.\s_\-]+$/i', $name)) {
return false;
}
if (strpos($name, '..') !== false) {
return false;
}
return true;
}
@SadiqUltra you shouldn't change core files, those changes will be removed next time you update.
@LukeTowers yes, but for now its only option that works, I guess.
@buuug7 did you fix it?
Any update on this? Some other plugins rely on the paths being a-zA-Z0-9...
@maartenmachiels nope. Feel free to propose a change to the regex if you wish, just keep in mind that it won't be accepted unless it solves the original problem (illegal characters allowing for XSS attacks or directory traversal) and works for all of October's user base.
This issue will be closed and archived in 3 days, as there has been no activity in the last 30 days. If this issue is still relevant or you would like to see action on it, please respond and we will get the ball rolling.
I just tested uploading jpg files with UTF-8 encoded filenames and it works fine on my setup.
Here's the Media Library after the upload
I suspect the problem might be with the Characterset used by the user's Browser that may lead to the problem.
I used a different browser on my system and was able to replicate the issue... so browser configuration definitely has a role... more data to come
With Firefox, this setting makes it work:
intl.fallbackCharsetList.ISO-8859-1 = windows-1252
With Chrome, this extension is needed:
https://www.deep-watch.net/en/blog/bring-character-encoding-menu-back-to-chrome-with-extension
Actual link for the extension:
https://chrome.google.com/webstore/detail/set-character-encoding/bpojelgakakmcfmjfilgdlmhefphglae
@mjauvin is there anything that can be done serverside to fix this? Detect the browser's encoding? Tell the browser what encoding to use? What sort of options do we have?
That's what I'm trying to find out, so far only client-side changes do the trick.
AFAIK, we already tell the browser which encoding to use in the form tag using:
accept-charset="UTF-8"
(ref. vendor/october/rain/src/Html/FormBuilder.php
)
And also the <meta charset="utf-8">
in the meta tags (ref. modules/backend/layouts/_head.htm
)
But there is something I am missing
What's really odd is that I cannot reproduce at will. I don't know if there is some caching involved or not, but even without changing client-side charset encoding, I SOMETIME get the empty file result.
This issue will be closed and archived in 3 days, as there has been no activity in the last 30 days.
This issue will be closed and archived in 3 days, as there has been no activity in the last 30 days.
Description
When upload filename with Unicode characters, MediaManager worked very bizarre. For example, we upload a file with
波比.jpg
name, the MediaMange prompt upload success, but we could not found it in the MediaManage. actually, it upload波比.jpg
and rename it with.jpg
name, only contain the the dot and file extension.Screenshot
suggestion
use the slug-generator library for generate the filename with URL Slug, the library will try to convert characters to ASCII, and then save to the filesystem. this will solve mutiple laguage named filename uploaded problems.
my solution
Modified the
Backend\Widgets\MediaManager
checkUploadPostback()
method as below:result