hoangduit / openmeetings

Automatically exported from code.google.com/p/openmeetings

Titles of downloaded documents: incorrect national characters #728

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When a document is downloaded from the server, national (non-ASCII) characters 
in its file name are replaced with underscores. The problem can be reproduced 
with, for example, Russian characters.

Which version of OpenMeetings are you running?
0.7rc2 and later

Original issue reported on code.google.com by e.rovin...@gmail.com on 12 May 2009 at 1:03

GoogleCodeExporter commented 9 years ago
Hello!

I made a patch for international symbols in the titles of documents.

One of the difficulties in creating the patch was the OEM encoding (ibm866 in my
case) that Windows uses for national symbols in batch files. Linux uses the same
encoding for both the console and the graphical user interface. I added a
database setting for the batch file encoding. The setting is stored in the
"Configuration" table and is called "batch.encoding". If the setting is not set,
the value of the Java system property "file.encoding" is used by default.
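
To illustrate why the batch file encoding matters, here is a minimal sketch (not part of the patch) that encodes the same Cyrillic letter for several target encodings. The class and method names are hypothetical; only standard JDK calls are used.

```java
import java.nio.charset.Charset;

final class BatchBytesSketch {
    // Hypothetical helper: encode batch-file text with an explicitly chosen
    // charset instead of relying on the platform default.
    static byte[] encodeForBatch(String text, String encoding) {
        return text.getBytes(Charset.forName(encoding));
    }

    public static void main(String[] args) {
        String letter = "\u043E"; // Cyrillic small letter "o"
        for (String enc : new String[] {"IBM866", "windows-1251", "UTF-8"}) {
            StringBuilder hex = new StringBuilder();
            for (byte b : encodeForBatch(letter, enc)) {
                hex.append(String.format("%02X ", b & 0xFF));
            }
            // Each encoding yields a different byte sequence for the same letter.
            System.out.println(enc + ": " + hex.toString().trim());
        }
    }
}
```

A batch file written in one of these encodings and read by a console that expects another shows the wrong characters, so the configured value has to match the console's OEM code page.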

While writing the patch I ran into a problem: the header of library.xml declares
UTF-8, but the attribute values are stored in iso8859-1. So I had to replace the
byte streams with character streams. I also changed iso8859-1 to utf-8 in the
web.xml file.

The biggest surprise for me was that the HttpServletRequest class interprets URL
parameters incorrectly. Therefore, I process the values of the input parameters
myself.
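
The misinterpretation of URL parameters described above can be sketched as follows. This is an illustration rather than code from the patch: it assumes the container decoded UTF-8 bytes as ISO-8859-1, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class ParameterRedecodeSketch {
    // Assumption: 'misdecoded' was produced by reading UTF-8 bytes as ISO-8859-1.
    // ISO-8859-1 maps every byte to one char, so the original bytes can be
    // recovered and decoded again as UTF-8.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u043E\u0442\u0447\u0435\u0442.swf"; // a Cyrillic file name
        // Simulate the wrong decoding step.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(redecode(misdecoded).equals(original));
    }
}
```

The same round trip appears later in this thread as `new String(fileSystemName.getBytes("ISO-8859-1"), "UTF-8")`. It only works when the wrong decoding step really was ISO-8859-1.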

Original comment by CTpaH...@gmail.com on 12 May 2009 at 1:15

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the nice patch! I have a few questions:

1.
String batchEncoding =
    Configurationmanagement.getInstance().getConfKey(3,"batch_encoding").getConf_value();
if (batchEncoding.equals("")) {
  batchEncoding = System.getProperty("file.encoding");
}

I see this logic is repeated several times. Could it be fixed on the
Configurationmanagement level? For example, by using something like
Configurationmanagement.getInstance().getBatchEncoding()

2. Could you please use java coding conventions in the new code?
-               Document document = reader.read(filePath);
+               Document document = reader.read( new FileInputStream(filePath) );

http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html

Ctrl+F in Eclipse formats the code automatically.

3.
The following recipe seems to be an easier way to fix setCharacterEncoding("UTF-8"):
http://threebit.net/mail-archive/tomcat-users/msg01985.html
Does it work?
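
The helper suggested in point 1 could look roughly like the sketch below. It is deliberately detached from Configurationmanagement: the class and method names are hypothetical, and the configured value is passed in as a plain string.

```java
final class BatchEncodingSketch {
    // Hypothetical central helper: returns the configured batch encoding,
    // or the JVM's file.encoding when the setting is missing or blank.
    static String resolveBatchEncoding(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return System.getProperty("file.encoding");
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        System.out.println(resolveBatchEncoding("ibm866")); // configured value wins
        System.out.println(resolveBatchEncoding(""));       // falls back to file.encoding
    }
}
```

Keeping the fallback in one place means callers no longer repeat the empty-string check.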

Original comment by alexei.f...@gmail.com on 12 May 2009 at 6:10

GoogleCodeExporter commented 9 years ago
1. This function is called in many places, and I cannot change its logic because
that could break other code. Perhaps I will do it another time, as part of work
on improving the quality of the code.

2. Thank you. But I do not use Eclipse. In the near future I will try to set up
the magic Ctrl-F combination in my Vim.

3. I carefully read your link and agree that I have to change the default value
of the useBodyEncodingForURI property. But I do not understand where this
property is set.

I made some changes and made a new patch.

Original comment by CTpaH...@gmail.com on 15 May 2009 at 3:24

Attachments:

GoogleCodeExporter commented 9 years ago
Attached a beautified version of the patch (work in progress).

Original comment by alexei.f...@gmail.com on 15 May 2009 at 7:55

Attachments:

GoogleCodeExporter commented 9 years ago
I had another chance to study the patch. I've noticed that the file 
UploadHandler.java contains only this fix:

 ServletMultipartRequest upload = new ServletMultipartRequest(httpServletRequest,
     104857600, "utf-8"); // max 100 mb
 InputStream is = upload.getFileContents("Filedata");

 //trim whitespace
-String fileSystemName = StringUtils.deleteWhitespace(upload.getFileSystemName("Filedata"));
+String fileSystemName = upload.getFileSystemName("Filedata");
+fileSystemName = new String(fileSystemName.getBytes("ISO-8859-1"), "UTF-8");
+StringUtils.deleteWhitespace(fileSystemName);

Isn't it better to specify a proper encoding in the constructor of 
ServletMultipartRequest? The usage without an encoding is deprecated, and with 
an encoding we get a proper file name automatically. Here is a code snippet 
which works for me:

ServletMultipartRequest upload = new ServletMultipartRequest(httpServletRequest,
    104857600, "utf-8"); // max 100 mb
InputStream is = upload.getFileContents("Filedata");

String fileSystemName = upload.getBaseFilename("Filedata");

Thoughts?

As for the changes to DownloadHandler, could you please suggest a test case in 
which a UTF-8 encoded URI arrives, so that your own parameter parser is needed? 
In the cases I've tried, parentPath is empty.

Original comment by alexei.f...@gmail.com on 26 May 2009 at 6:26

GoogleCodeExporter commented 9 years ago
>> Isn't it better to specify a proper encoding in the constructor of 
ServletMultipartRequest? The usage without encoding is deprecated, and we get a 
proper file name automatically.
Yes. Indeed, it looks better.

>> As for the changes to DownloadHandler, could you please suggest a test case 
in which a UTF-8 encoded URI arrives, so that your own parameter parser is 
needed? In the cases I've tried, parentPath is empty.

Try clicking "Refresh", or click a document page to load it onto the whiteboard. 
It seems to me that this is exactly the case you are looking for.
--
WBR,
CTpaHHoe

Original comment by CTpaH...@gmail.com on 26 May 2009 at 7:07

GoogleCodeExporter commented 9 years ago
Hi!

I solved the problem more cleanly by fixing the behavior of the 
HttpServletRequest class. To use my code, I made a new filter. To enable it, 
add a description of the filter to web.xml. 

I attach a patch for review. If it proves useful, I will update it for the 
current revision.

Original comment by CTpaH...@gmail.com on 29 May 2009 at 12:34

Attachments:

GoogleCodeExporter commented 9 years ago
It seems the old patch was attached: it still uses HashMap<String, String>
requestParams. 

Original comment by alexei.f...@gmail.com on 29 May 2009 at 12:46

GoogleCodeExporter commented 9 years ago
I'm sorry, I attached the wrong files. Here are the correct ones. 

Original comment by CTpaH...@gmail.com on 29 May 2009 at 12:53

Attachments:

GoogleCodeExporter commented 9 years ago
I have updated the patch to the current revision (rev2045). The files were 
renamed to more appropriate names.

Original comment by CTpaH...@gmail.com on 3 Jun 2009 at 10:23

Attachments:

GoogleCodeExporter commented 9 years ago
[deleted comment]

GoogleCodeExporter commented 9 years ago
Thanks to my Apache friend, I found a utility class for parsing encoded URIs: 
org.apache.http.client.utils.URLEncodedUtils from HttpComponents (we already 
have it as a part of Axis).
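
For readers without HttpComponents at hand, the idea of parsing an encoded query string with an explicit charset can be sketched with the JDK alone. This is not URLEncodedUtils; it is a stand-in built on java.net.URLDecoder, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryParamsSketch {
    // Splits a raw (still percent-encoded) query string and decodes each
    // name and value with the given charset, for example "UTF-8".
    static Map<String, String> parse(String rawQuery, String encoding) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            try {
                params.put(URLDecoder.decode(name, encoding),
                           URLDecoder.decode(value, encoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // Percent-encoded UTF-8 bytes of a Cyrillic file name.
        String query = "fileName=%D0%BE%D1%82%D1%87%D0%B5%D1%82.swf&parentPath=&room_id=1";
        System.out.println(parse(query, "UTF-8").get("fileName")
                .equals("\u043E\u0442\u0447\u0435\u0442.swf"));
    }
}
```

The sketch only helps when the browser actually sends UTF-8 percent-encoding. If it sends another encoding, the charset argument has to match.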

Original comment by alexei.f...@gmail.com on 5 Jun 2009 at 3:57

GoogleCodeExporter commented 9 years ago
[deleted comment]

GoogleCodeExporter commented 9 years ago
I cannot understand that. CT., could you please write up a short description of
how file names travel between the client and the server? 

If we manage to fix .doc and .ppt uploads, that will be enough for this release.

Original comment by alexei.f...@gmail.com on 5 Jun 2009 at 7:17

GoogleCodeExporter commented 9 years ago
>If we manage to fix .doc and .ppt uploads, that will be enough for this release.

The .xls files would also be good to have, if that doesn't make the job too hard.

Original comment by e.rovin...@gmail.com on 6 Jun 2009 at 7:28

GoogleCodeExporter commented 9 years ago
Issue 726 has been merged into this issue.

Original comment by alexei.f...@gmail.com on 9 Jun 2009 at 4:36

GoogleCodeExporter commented 9 years ago
What is the status from the point of view of rc3? Has any part of the fix been 
included in rc3? 
I see no changes in rc3, even for uploaded file names.

Original comment by e.rovin...@gmail.com on 9 Jun 2009 at 7:57

GoogleCodeExporter commented 9 years ago
The fix does not work for me in either Chrome or IE, and I have failed to debug
the problem with my configuration.

The overall thing CT. discovered is that browsers behave differently when they
send national characters to a server. If we cannot overcome that easily, we
should recall Sebastian's proposal of using IDs instead of file names.

Original comment by alexei.f...@gmail.com on 9 Jun 2009 at 8:18

GoogleCodeExporter commented 9 years ago
Thank you, Alexei.
I believe we should target the fix for rc4 or later.

Original comment by e.rovin...@gmail.com on 9 Jun 2009 at 8:34

GoogleCodeExporter commented 9 years ago
r2175:
I've done some work on the patch provided by CTPaHHoe, because it did not work 
correctly with the installer script and did not apply to the current code 
version. I also tested it with different browsers: the current patch version 
works with Opera, Firefox, IE and Chrome. Files with localized names are now 
uploaded, shown and downloaded properly. Please verify, and check how it works 
with other browsers.

Original comment by volkov.r...@gmail.com on 16 Jul 2009 at 10:19

GoogleCodeExporter commented 9 years ago
Works fine for Opera 9.27, IE6, Chrome 2.
Thank you!

Original comment by e.rovin...@gmail.com on 16 Jul 2009 at 1:48

GoogleCodeExporter commented 9 years ago
This problem persists.
I observe it with 0.9rc2 (r2227).

When I upload a file with a Russian title, I see a perfect preview, but when I 
try to load it onto the whiteboard, I see "DELETED" instead of the file.

In the log I see "?" symbols in place of the Russian characters:

DEBUG 08-27 19:48:45.114 WhiteBoardService.java 3904219 289 org.openmeetings.app.remote.WhiteBoardService [pool-4-thread-16] - sending :org.openmeetings.app.hibernate.beans.recording.RoomClient@10a46e5
DEBUG 08-27 19:48:45.114 WhiteBoardService.java 3904219 291 org.openmeetings.app.remote.WhiteBoardService [pool-4-thread-16] - sendObjectSyncFlag :org.openmeetings.app.hibernate.beans.recording.RoomClient@10a46e5
DEBUG 08-27 19:48:45.380 DownloadHandler.java 3904485 44 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - query = fileName=îò÷åò_Ðîâèíñêèé3105.swf&moduleName=videoconf1&parentPath=/îò÷åò_Ðîâèíñêèé3105_1&room_id=1&sid=bcce07322fa8fcf36e96b2108ca9daf7
DEBUG 08-27 19:48:45.380 DownloadHandler.java 3904485 45 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - fileName = �����_���������3105.swf
DEBUG 08-27 19:48:45.381 DownloadHandler.java 3904486 46 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - parentPath = /�����_���������3105_1
DEBUG 08-27 19:48:45.381 DownloadHandler.java 3904486 58 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - sid: bcce07322fa8fcf36e96b2108ca9daf7
DEBUG 08-27 19:48:45.384 Sessionmanagement.java 3904489 185 org.openmeetings.app.data.basic.Sessionmanagement [http-5080-2] - checkSession USER_ID: 1
DEBUG 08-27 19:48:45.395 DownloadHandler.java 3904500 212 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - requestedFile: �����_���������3105.swf current_dir: /home/video/apps/openmeetings-r2227/upload/1//�����_�����������3105_1/
DEBUG 08-27 19:48:45.396 DownloadHandler.java 3904501 222 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - LOG DownloadHandler: The request file is not readable
DEBUG 08-27 19:48:45.396 DownloadHandler.java 3904501 226 org.openmeetings.servlet.outputhandler.DownloadHandler [http-5080-2] - LOG ERROR requestedFile: �����_���������3105.swf
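
One possible reading of this log, offered as an assumption rather than a confirmed diagnosis: the text in the "query" line ("îò÷åò...") looks like windows-1251 bytes displayed as ISO-8859-1, and the replacement characters in "fileName" are what decoding those bytes as UTF-8 would produce. The sketch below checks that reading for the first word. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

final class MojibakeCheckSketch {
    // Re-encodes 'shown' with the charset it was displayed in, then decodes
    // the resulting bytes with the charset we suspect they were written in.
    static String reinterpret(String shown, String shownAs, String actual) {
        byte[] raw = shown.getBytes(Charset.forName(shownAs));
        return new String(raw, Charset.forName(actual));
    }

    public static void main(String[] args) {
        // First word of the file name exactly as it appears in the query line,
        // written with escapes to keep this source file ASCII-only.
        String logged = "\u00EE\u00F2\u00F7\u00E5\u00F2";
        // Read as windows-1251, the bytes form a Russian word.
        System.out.println(reinterpret(logged, "ISO-8859-1", "windows-1251"));
        // Read as UTF-8, the same bytes are malformed and become U+FFFD.
        System.out.println(reinterpret(logged, "ISO-8859-1", "UTF-8"));
    }
}
```

If this reading is right, the first line printed is "отчет", and the request in question carried the file name in windows-1251 rather than UTF-8.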

Original comment by e.rovin...@gmail.com on 27 Aug 2009 at 3:54

GoogleCodeExporter commented 9 years ago
Just a small modification, but can you retest with the attached file?

Original comment by seba.wag...@gmail.com on 27 Aug 2009 at 4:07

Attachments:

GoogleCodeExporter commented 9 years ago
OK, I'll give it a try.

Original comment by e.rovin...@gmail.com on 27 Aug 2009 at 4:10

GoogleCodeExporter commented 9 years ago
No, it did not help. (Clearing the cache did not help either.)

Original comment by e.rovin...@gmail.com on 27 Aug 2009 at 4:41