wifang / mollify

Automatically exported from code.google.com/p/mollify

Upload errors #338

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm getting some weird upload errors when I upload files whose names contain 
the letters "Ä, Ü, Õ, Ö", and the same bug probably happens with other 
"unsupported" symbols as well. The upload just stops and says "Error: -300, 
Message: IO Error, File: file_name_here". Perhaps this has something to do with 
Unicode support?

Original issue reported on code.google.com by Logic...@gmail.com on 25 Dec 2011 at 3:49

GoogleCodeExporter commented 9 years ago
Ok, just a little update. The problem is elsewhere as well.
If I try to add a folder via Mollify and it's named "Jõulud", which in my 
language means "Christmas", then it adds it, but sometimes it appears as some 
sort of file. Either way, on the filesystem it puts weird symbols in place of 
the letter "õ". I can't use the folder either: when I try to open it in 
Mollify, it says "unknown error". The same happens if I want to delete it.

Original comment by Logic...@gmail.com on 25 Dec 2011 at 9:19

GoogleCodeExporter commented 9 years ago
Is this with the Plupload uploader? I just tried myself, and all I got is that 
it removed all non-ASCII letters. Does it work when you don't have any of these 
letters?

However, if filesystem shows weird letters, it sounds like your server 
filesystem does not operate with UTF-8. If you create a file with such name 
directly in your server (for example via ssh), and not via Mollify, how does it 
appear in Mollify?

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 10:58

GoogleCodeExporter commented 9 years ago
I'm using Pluploader yes. (I'm using apache and it's running under windows)
Yes, i made the same file twice, but from one i removed "ö, ä, ü, õ" 
letters, that one uploaded normally and without problems, the one which had 
those letters, threw in the -300 error while uploading.

When i create the "jõulud" folder directly on the filesystem, then when i try 
to navigate to the "jõulud" folder using mollify, mollify automatically throws 
in "unknown error" and i can't even see the folder there. If i remove the 
folder from the filesystem then i can navigate to the path normally. Then again 
randomly some other times, i can see the "jõulud" name, but mollify shows it 
as a file, not a folder and i can't delete it, i get unknown error, so the only 
way to delete it is to directly delete it under filesystem.

I'm pretty sure my windows filesystem supports UTF-8, i can make whatever 
symbols i like. Just under Mollify it has those weird bugs. Mollify, for some 
reason, changes the "jõulud" folder name to some weird letters on the "õ". 
This is what it shows under filesystem: "jƵulud".

I tried using the "convert_filenames" and used both to set it to either "TRUE" 
or to "CP1252". Still nothing.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 11:37

GoogleCodeExporter commented 9 years ago
Ok, small update, i managed to somehow make it work.

Here is what I did, I only added: "convert_filenames" => FALSE,
As a result I can upload files which have ö, ä, ü, õ in their names, for 
example, and when the file arrives on the server it removes the letters it 
"doesn't like".
Also i can create the folder "jõulud" for example, the real name under 
filesystem is still shown as "jƵulud" tho, but it seems it's working 
(sometimes it still throws some errors in).

What is weird, though: HOW does setting that option to "FALSE" result in 
this kind of behaviour?

Original comment by Logic...@gmail.com on 26 Dec 2011 at 12:24

GoogleCodeExporter commented 9 years ago
Ok, while i can create "jõulud" folder, i can't operate with it, for example 
"move" files in it, it shows "Application configuration is invalid" and after 
that eventually "unknown error". So basically some sort of problem still 
remains here.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 12:33

GoogleCodeExporter commented 9 years ago
Yes, this is exactly what happens when server filesystem is not UTF-8. When the 
charset is different, PHP cannot read the filenames, or it reads them 
incorrectly (which eventually leads into error when it tries to locate the 
badly read file).

What you need to do is determine which charset your server is using. The 
"CP1252" was just an example; don't use it unless you know it's correct, since 
the wrong charset in conversion won't help any more than no conversion at all. 
The conversion setting has only these options: "FALSE" = no conversion, "TRUE" = 
try to guess, or a real charset code.
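
To illustrate why a missing or wrong conversion produces "weird symbols", here is a minimal sketch. It is not Mollify's actual code: the helper names are made up for the example, and it assumes the application holds names as UTF-8, the filesystem expects CP1252, and PHP's iconv extension is available.

```php
<?php
// Hypothetical helpers, not part of Mollify: convert a name between the
// application's UTF-8 and an assumed filesystem charset.
function to_filesystem(string $utf8Name, string $fsCharset): string
{
    // iconv() returns false when the input cannot be converted.
    $converted = iconv('UTF-8', $fsCharset, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Cannot represent name in $fsCharset");
    }
    return $converted;
}

function from_filesystem(string $rawName, string $fsCharset): string
{
    $converted = iconv($fsCharset, 'UTF-8', $rawName);
    if ($converted === false) {
        throw new RuntimeException("Name is not valid $fsCharset");
    }
    return $converted;
}

$name = "j\xC3\xB5ulud";                         // "jõulud" as UTF-8 (õ = C3 B5)
$onDisk = to_filesystem($name, 'CP1252');        // õ becomes the single byte F5
$roundTrip = from_filesystem($onDisk, 'CP1252'); // back to the original UTF-8

// If the UTF-8 bytes are written unconverted, each of the two bytes of "õ"
// is later read as its own CP1252 character.
$misread = from_filesystem($name, 'CP1252');

echo bin2hex($onDisk), "\n";   // 6af5756c7564
echo $roundTrip, "\n";         // jõulud
echo $misread, "\n";           // jÃµulud
```

The last line shows the general pattern: one accented letter turns into two unrelated symbols when the bytes are read with the wrong charset. Which symbols appear depends on the charsets actually involved.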

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 12:43

GoogleCodeExporter commented 9 years ago
And for the chars removed in upload, see 
https://groups.google.com/group/mollified/browse_thread/thread/b93ca60839edee05?hl=en

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 12:44

GoogleCodeExporter commented 9 years ago
The "CP1252" should be the correct one under windows because my Apache, PHP and 
MySQL are all set to use charsets up to UTF-8.
I'm currently using: "convert_filenames" => "CP1252",
but the problem still remains.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 1:19

GoogleCodeExporter commented 9 years ago
I've tried so many charsets and a friend of mine says under windows i should 
use UTF-16, but when i set this into motion, i can't login anymore. When i try 
to log in, it says "folder does not exist" after pressing the login button.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 2:19

GoogleCodeExporter commented 9 years ago
I should be more precise, I suppose. Apache, MySQL or PHP charset is not 
relevant, they usually are configured to UTF-8. What matters here is the OS 
charset, because that's where the files are.

Unix systems are usually UTF-8 itself, so are Mac OS X systems. I guess Windows 
is the most difficult OS in here as well. I don't have Windows server, so I 
really can't test this.

Google is your best friend here, it tells me, among other things, that many 
russian windows OSes have charset "windows-1251", but hardly ever UTF-16.

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 3:24

GoogleCodeExporter commented 9 years ago
I have literally tried nearly all the possible charsets I found on Google 
since the beginning of this discussion. None of them works; ALL of them 
"translate" õ to those weird symbols. I called a friend of mine who also 
works in a field like yours. After we talked for nearly an hour and I explained 
the situation, he immediately said this is not an OS charset issue. I personally 
am also 99% sure this is Mollify's bug.

If it would be OS'es charset issue, then i wouldn't be able to write "õ" in my 
OS as well. If i manually add the folder name containing "õ", then mollify 
either says unknown error, or from time to time, it shows the folder and i can 
see the "õ" in there. The problem occurs when i add the folder using mollify. 
If i remember correctly now, back in older version and using the same server 
and configuration, i did not have problems with that, i could upload files 
normally.
I also have other scripts running on my server and none of them has 
such a problem.

Perhaps it really is a charset issue, which i find hard to believe, but could 
you perhaps guide me to the correct charset table which i could try or give 
some options? I have depleted all my possible sources atm.

Thanks in advance!

Original comment by Logic...@gmail.com on 26 Dec 2011 at 4:26

GoogleCodeExporter commented 9 years ago
Confirmed, everything works fine with the mollify older version, the "õ" is 
translated to a weird symbol, but still, i can fully use everything without 
getting unknown errors. This really is mollify's bug.

I think the version i tested with and had on my drive was the 1.8.5.1 or 
1.8.5.0, something like that.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 4:33

GoogleCodeExporter commented 9 years ago
First of all, let's forget uploading for now. You have to first get file 
listing to show chars properly before you even think about uploading via 
browser.

Second, this is 100% OS charset issue, that's for sure. I have tested with 
several Linux and Mac OS X systems which have UTF-8 filesystem, and there is 
absolutely no problems if I have files with finnish, japanese, chinese, korean, 
you name it. I can upload you screenshot if you don't believe, you can even 
post me chars you want to see there.

But at the same time, I agree the CONVERSION might not work as it should. Point 
is, the systems _that don't require conversion_, work as should. But since your 
filesystem is not UTF8, it requires conversion. However, I never had any server 
that did not operate with UTF-8, so current Mollify conversion mechanism is 
totally based on a theory.

I'm pretty sure I know why you did not get unknown errors with older version, 
it's the way how filesystem items are identified in the latest version. All 
items now get unique id that is stored in db, and this does not tolerate 
charset mismatch, db uses utf8 as well (at least it should). The bottom line 
is, you said you had conversion problems with older version as well ("õ" 
translated to a weird symbol). This is the problem you have to solve, it does 
not matter if you got less error messages with older version, the errors just 
were not that imminent.

I wrote a test script (attached), it's totally separate from Mollify so you can 
copy it anywhere on your server. When you open it, enter a path to the folder 
where you have the files with these problematic letters. Take a screenshot and 
send it to me (yes, a picture, because I want to see how it shows on your 
screen).

This script simply reads the filesystem, and tries to detect each files charset 
(for the record, mine says here UTF-8). Column "trial" shows how PHP guesses it 
should encode UTF8 (same as Mollify conversion setting "TRUE"). If you do enter 
charset into the second input box, you'll see how the encoding goes (same as 
Mollify conversion setting with charset code).
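
The attached script is not reproduced in this thread, so the following is only a rough stand-in for the kind of check described above. The function names are hypothetical, and it tests whether a name is valid UTF-8 instead of attempting real charset detection; the optional second argument plays the role of the charset input box.

```php
<?php
// Hypothetical stand-in for the attached test script. Reports, for one raw
// filename, whether it is already valid UTF-8 and what an explicit
// conversion from $charset to UTF-8 would produce.
function inspect_name(string $raw, string $charset = ''): array
{
    // preg_match() with the /u modifier fails on byte sequences that are
    // not valid UTF-8, so it doubles as a cheap validity check.
    $isUtf8 = preg_match('//u', $raw) === 1;

    $converted = null;
    if ($charset !== '') {
        $result = @iconv($charset, 'UTF-8', $raw);
        $converted = ($result === false) ? null : $result;
    }

    return [
        'raw_hex'    => bin2hex($raw),
        'valid_utf8' => $isUtf8,
        'converted'  => $converted,
    ];
}

// Inspect every entry of a folder; returns an empty list if it is unreadable.
function inspect_folder(string $path, string $charset = ''): array
{
    $entries = @scandir($path);
    if ($entries === false) {
        return [];
    }
    $rows = [];
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $rows[] = inspect_name($entry, $charset);
    }
    return $rows;
}
```

Calling `inspect_folder()` on the problem folder, with and without a charset such as "CP1252", gives the raw bytes of each name next to the converted form, which is easier to compare than the rendered characters alone.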

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 9:15

Attachments:

GoogleCodeExporter commented 9 years ago
Alright, I used your script and attached a screenshot; here you can see what's 
going on. Hopefully it helps.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 9:37

Attachments:

GoogleCodeExporter commented 9 years ago
Which lines are not working with mollify? Charset detection does not seem to 
work since it shows you utf8 as well.

But one thought. As I said, id's are stored in db. You should empty table 
"item_id" (or at least remove rows that are not converted properly) whenever 
you change the conversion setting.

Original comment by samuli.j...@gmail.com on 26 Dec 2011 at 10:15

GoogleCodeExporter commented 9 years ago
The lines not working are the lines nr 1, 4 and 5, so the folder names: 
"jõulud", "not nõrmal" and "oaou õäöü".

About the conversion setting, i have tried so many charsets, it would be quite 
a hassle to delete and add items. Also i don't see how emptying the DB would 
help with the conversion settings, since every time i changed the conversion, i 
took a new path under my user and added a differently named folder with one of 
the "bad" letters in it.

I'm not really sure it would help if i try ALL of those charsets again and this 
time i would need to empty the DB as well.

Original comment by Logic...@gmail.com on 26 Dec 2011 at 11:10

GoogleCodeExporter commented 9 years ago
The thing here why i think it is still Mollify's issue is that INSIDE the 
database, it adds the folder named "jõulud" correctly with the right 
characters, but on the filesystem it creates it as "jƵulud". So it is Mollify 
that creates the folder with the wrong name. Under the OS, I can literally 
create the folder "jõulud" correctly without problems.

Perhaps you could make a workaround on this problem under Mollify somehow?

Original comment by Logic...@gmail.com on 26 Dec 2011 at 11:17

GoogleCodeExporter commented 9 years ago
I don't know how many times I need to say this, it is _conversion issue_, your 
filesystem does not understand utf8 and that's why chars corrupt. I really 
would make a fix, but it's impossible when I don't know what should be done 
differently, this is what I'm trying to figure out here. Like I said, 
everything works when you have everything in utf8, and you don't. However, I 
don't have such environment for me to test in so this is like poking in the 
dark. I hope you finally understand this.

You say Mollify creates the folder "jƵulud"? Like I said earlier, first 
thing to do is to make _listing_ work, don't upload or create new things from 
Mollify until that works.

So remove any corrupted characters directly in the filesystem, and then take 
screenshots from a) filesystem (ie Windows) and b) Mollify, both from same 
folder. I just want to see how different they look like.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 6:54

GoogleCodeExporter commented 9 years ago
To rephrase, I'd like to know that if you create file with problem letters in 
filesystem directly, how does it look like in Mollify, and does changing the 
conversion setting change this in any way.

Also, could you tell me what kind of system you have (OS, version, web server 
IIS or Apache etc)? I'm installing Windows virtual machine to see if this is 
universal windows issue.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 7:53

GoogleCodeExporter commented 9 years ago
It seems all Windows systems suffer from this. I managed to reproduce similar 
issues, and found a few things. One very odd thing is that MySQL on Windows 
does not use UTF8 even when the database is UTF8.

So try this:
1) Copy attached file over your version (located in "/include/filesystem")
2) add conversion charset to "cp1252" or "cp1251" (whatever your charset is, my 
windows had "cp1252")
3) add following line into your configuration.php where you have other db 
variables:
$DB_CHARSET = "utf8";

For me this helped to list correct chars, and do some basic file operations. I 
think there may be some places that need more attention, but let's see if this 
makes any difference.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 9:45

Attachments:

GoogleCodeExporter commented 9 years ago
Alright, the folders i make are shown exactly in the script you gave me before. 
I can make folder fine when they don't contain those "bad" characters via 
mollify or create them directly under the filesystem, they show and work fine 
both ways.

If I make a folder named "jõulud" using Mollify, then Mollify shows 
"jõulud" but on the filesystem it's shown as "jƵulud". (I can't use the 
folder in most cases; unknown errors occur.)
If i make the folder "jõulud" directly under filesystem then mollify also 
shows it correctly ("jõulud"), but i can't use the folder in most cases, i.e 
copying files in it, trying to delete the folder etc.

"So remove any corrupted characters directly in the filesystem, and then take 
screenshots from a) filesystem (ie Windows) and b) Mollify, both from same 
folder. I just want to see how different they look like."
RE: There is no point to do that, because if i test it with the folder named 
"jõulud" and since it created the folder "jƵulud" under filesystem and if i 
remove the "Ƶ" it is then named "julud" and everything works fine. If i 
don't remove those bad letters, it does not work anymore.

"To rephrase, I'd like to know that if you create file with problem letters in 
filesystem directly, how does it look like in Mollify, and does changing the 
conversion setting change this in any way."
RE: As explained above, under mollify, it then shows the correct name, but the 
folder would be "bugged", getting unknown errors etc.

I'm using Windows Vista Business Edition, English is the default language it 
came with, Using Apache for windows (latest version), MySQL (latest version), 
PHP (latest version) and Mollify script (latest version).

I used your method you described and used the custom 
"LocalFilesystem.class.php". Now if i make folder "jõulud" under mollify, it 
shows the folder "jƵulud" under mollify and it creates the same folder 
"jƵulud" under the filesystem. File operations seem to be working as well, i 
can copy, delete and move files normally so far. Also deleting the folder is 
now possible. I used " "convert_filenames" => "CP1252", ".

Conclusion: The custom php file and your guidelines helped to make the mollify 
file operations work, it does not show the correct characters though.

Hope this helps.

Original comment by Logic...@gmail.com on 27 Dec 2011 at 1:35

GoogleCodeExporter commented 9 years ago
Little more progress. The conversion is a mess, I wish all OSes operated on 
utf8 so this wouldn't need to be done, but I guess MS has to do everything 
little bit more complicated.

With this version, I was able to create new folders with any letters, copy, 
rename, move and upload.

You should clean up the filesystem to correct any corrupt chars, and empty the 
item_id table, it is filled with incorrect paths.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 4:22

Attachments:

GoogleCodeExporter commented 9 years ago
Yes, it seems that it is working now. The correct folder name appears in the 
filesystem as well when created via mollify. File actions also seem to be 
working fine.

Although now i can't upload files which have symbols like "õ" in them, i get 
error -300. Before it was working when i had " "convert_filenames" => FALSE, ", 
but now even if i set it like that, it's not working anymore. Before it removed 
the characters, but not now, just the error. Can this be fixed as well?

Another question: when Mollify gets an update, do I need to configure 
something else as well or get the custom class.php again, or does the new 
update include all the necessary settings?

Also, wouldn't it be better if Mollify itself deleted invalid or missing folders 
from the database? Otherwise, with many users, the database gets a bit 
messy.

Original comment by Logic...@gmail.com on 27 Dec 2011 at 5:11

GoogleCodeExporter commented 9 years ago
I was testing with basic uploader, this problem sounds like Plupload. I'll 
check this at some point as well.

I'm gonna include all these in next update, so others with non-UTF8 can also 
use it.

Mollify deletes all ids when the items are removed via Mollify. I've considered 
cleanup for items that are removed outside Mollify, but haven't yet decided how 
to do it. After all, it requires scanning all folders recursively, not 
something I can do too often.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 5:26

GoogleCodeExporter commented 9 years ago
Alright, that's good news about the update.

About the Mollify deleting items from database. When i create a folder 
somewhere under my user and if i delete it, it's gone both in filesystem and in 
mollify, but in the item_id, inside the database, the folder is still there and 
is not deleted.

Original comment by Logic...@gmail.com on 27 Dec 2011 at 5:37

GoogleCodeExporter commented 9 years ago
Another headache from MS. Since windows uses different directory separator than 
Unix and Mac OS X (\ vs /), all the regex queries are broken (since that slash 
has special meaning in the regex, it needs to be escaped). That's why deleting 
the items from db don't work, and who knows what else. Working on that as 
well...
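
The actual queries are not shown in this thread, and which regex engine they run on is an assumption. As a minimal sketch of the same escaping problem using PHP's own PCRE functions, a path can be passed through `preg_quote()` before it is embedded in a pattern; the helper name is hypothetical.

```php
<?php
// Hypothetical helper: does $itemPath live under $folderPath?
function is_under_folder(string $itemPath, string $folderPath): bool
{
    // preg_quote() escapes regex metacharacters, including the backslash
    // used as the Windows directory separator, plus the "/" delimiter.
    $pattern = '/^' . preg_quote($folderPath, '/') . '/';
    return preg_match($pattern, $itemPath) === 1;
}

$windowsFolder = 'C:\\data\\files\\';   // the string C:\data\files\
$unixFolder = '/var/data/files/';

var_dump(is_under_folder('C:\\data\\files\\report.txt', $windowsFolder)); // bool(true)
var_dump(is_under_folder('/var/data/files/report.txt', $unixFolder));     // bool(true)
var_dump(is_under_folder('C:\\data\\other\\report.txt', $windowsFolder)); // bool(false)
```

Without the quoting step, sequences such as `\d` or `\f` in a Windows path would be read as regex escapes instead of as a separator followed by a letter.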

But I couldn't reproduce any upload errors with plupload. Could you get debug 
log for that upload to see what's wrong. BTW, I recommend changing 
PluploadHandler.class.php line 41 into following:

$fileName = preg_replace('/[\x47\x92\x34\x39]+/', '', $fileName);

The original Plupload cleanup code was too strict, this strips out only slashes 
and quotes.

Original comment by samuli.j...@gmail.com on 27 Dec 2011 at 8:24

GoogleCodeExporter commented 9 years ago
Ok, at first i couldn't get the debug logs from anywhere, which is weird (yes i 
enabled the debugging from mollify etc), but nevertheless, i added the line you 
recommended to plupload and everything started to work fine, i can now upload 
those "bad letter" files without problems and i can also operate with them. So 
that's a problem solved as far as it looks like.

You told that "...That's why deleting the items from db don't work, and who 
knows what else." Well I'm kind of curious about the "what else" part, since my 
mollify setup takes quite a while to load the popup menus and folders etc 
(sometimes even seconds). It's not the bandwidth issue, I'm fairly sure (tried 
it on localhost as well). Could this be related with the database thing? Since 
on the demo site, everything works and loads super fast.

Hope you get the database issue fixed though, wish you all the best in your 
work and thanks for all the help!

Original comment by Logic...@gmail.com on 27 Dec 2011 at 8:47

GoogleCodeExporter commented 9 years ago
There's new version 1.8.6 in the download section (not yet updated to released 
version)

Original comment by samuli.j...@gmail.com on 4 Jan 2012 at 6:45

GoogleCodeExporter commented 9 years ago
I updated my current server with the new 1.8.6 update.
So far it's working great. Everything is much faster, and a few other things 
that were not working as intended before now do, so I can see how they should 
work.
I can create folders with unicode letters, they are working/showing fine so far 
under filesystem and in mollify. The file uploading is now working fine even 
with those "bad letters" (nothing is removed from the file name when uploading).

I used the "/backend/update" but it said no update needed, so i guess from the 
database point of view, everything is still fine?

Anyway i will report any bugs i find here. And really, great job with the fast 
new update!

Original comment by Logic...@gmail.com on 4 Jan 2012 at 4:15

GoogleCodeExporter commented 9 years ago
Great to hear!

Original comment by samuli.j...@gmail.com on 5 Jan 2012 at 8:41

GoogleCodeExporter commented 9 years ago
Ok, i think i found something.
It seems that if people upload a picture file, for example, which is named 
"picture.PNG" (this is an example, it could also be JPG or whatever), then when 
it arrives on the server, or some time in between, it is renamed to "picture.PN".
If people upload it in lowercase, "picture.png", then it arrives normally 
with no change to the extension. When it's in capital letters, one letter 
seems to get lost from the file extension.

Original comment by Logic...@gmail.com on 5 Jan 2012 at 9:17

GoogleCodeExporter commented 9 years ago
Yes, this is quite funny mistake. It's again about the plupload filename 
cleaning regex, the format used had character hex codes when I meant octal 
codes, and the codes given had totally different meaning (and it did in fact 
remove capital G for example).

The more correct version is PluploadHandler.class.php line 42:

$fileName = preg_replace("/[\47\92\34\39]+/", "", $fileName);
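
For reference, 47, 92, 34 and 39 are the decimal ASCII codes of "/", "\", the double quote and the single quote, and 92 and 39 cannot be octal numbers because octal has no digit 9. How the numeric escapes above are read depends on how PHP and PCRE each parse them, so the sketch below sidesteps the question by writing the same four characters as hex escapes. The function name is hypothetical and this is not necessarily what shipped in Mollify.

```php
<?php
// The earlier pattern: PCRE reads \xHH as hexadecimal, so this class holds
// "G" (0x47), the byte 0x92, "4" (0x34) and "9" (0x39).
$hexAsPosted = '/[\x47\x92\x34\x39]+/';
echo preg_replace($hexAsPosted, '', 'picture.PNG'), "\n";   // picture.PN

// Hypothetical cleanup helper: strip only slashes and quotes.
// 0x2F = "/", 0x5C = "\", 0x22 = double quote, 0x27 = single quote.
function clean_upload_name(string $fileName): string
{
    return preg_replace('/[\x2F\x5C\x22\x27]+/', '', $fileName);
}

echo clean_upload_name('picture.PNG'), "\n";                // picture.PNG
echo clean_upload_name("..\\dir/it's \"x\".PNG"), "\n";     // ..dirits x.PNG
```

The first call reproduces the lost "G" reported above; the helper leaves letters and digits alone and removes only the four characters named in the earlier comment.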

Original comment by samuli.j...@gmail.com on 6 Jan 2012 at 4:55