google-code-export / ords

Automatically exported from code.google.com/p/ords

Encoding of Greek Characters #617

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
With reference to this RT ticket[1], it seems that we've got an encoding
issue with ORDS.  I tried uploading the file containing the Greek
characters (after fixing the CSV delimiters) to a database in my local
development environment, and found that the characters displayed
correctly.  However, I then repeated the process on dev.ords and found
that they didn't; see the attached screenshot for comparison.

My immediate thought is that the JVM or Tomcat (or even Postgres?) on
the sysdev server might be configured to use the wrong encoding.  Could
you investigate and see if you can find the root of the problem?  I've
also attached the fixed CSV file in case it's useful.

Cheers
Mark

[1] https://rt.oucs.ox.ac.uk/Ticket/Display.html?id=2669952

Original issue reported on code.google.com by thest...@gmail.com on 10 Feb 2015 at 9:11
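
For readers unfamiliar with this class of problem, here is a minimal sketch of how a mismatched default encoding could produce the symptom described above. It is written in Python for brevity and is not taken from ORDS. The sample text, the helper name read_upload, and the choice of ISO-8859-1 as the mismatched default are all assumptions made for illustration; the actual configuration of dev.ords is not stated in this thread.

```python
# Illustrative only: SAMPLE, read_upload and the ISO-8859-1 default are
# assumptions for this sketch, not details of the dev.ords configuration.
SAMPLE = "\u03b1\u03b2\u03b3"  # the Greek letters alpha, beta, gamma


def read_upload(raw: bytes, assumed_encoding: str) -> str:
    """Decode uploaded bytes as a server would under the given default."""
    return raw.decode(assumed_encoding)


uploaded = SAMPLE.encode("utf-8")          # the file as saved by the user
good = read_upload(uploaded, "utf-8")      # default matches the file
bad = read_upload(uploaded, "iso-8859-1")  # default does not match the file

print(good == SAMPLE)  # True: the characters survive
print(len(bad))        # 6: each 2-byte character became two wrong ones
```

The same bytes give correct text on one machine and garbled text on another purely because of the decoder's default, which would be consistent with the problem appearing on one server but not in a local environment.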

GoogleCodeExporter commented 9 years ago

Original comment by thest...@gmail.com on 10 Feb 2015 at 9:57

GoogleCodeExporter commented 9 years ago

Original comment by jajwil...@gmail.com on 16 Feb 2015 at 10:29

GoogleCodeExporter commented 9 years ago
This was the correspondence prior to the issue being raised...

Hi

Kristian assigned this to me since he doesn't think it is a configuration
issue. Are you ok if we raise it on Google Code? I plan to work on ORDS issues
again soon and that is where I look for them ;-)
________________________________________
From: Kristian Kocher [kristian.kocher@it.ox.ac.uk]
Sent: 09 February 2015 14:42
To: David Paine
Subject: Fwd: Non-latin character encoding

-------- Forwarded Message --------
Subject:        Non-latin character encoding
Date:   Wed, 04 Feb 2015 10:42:46 +0000
From:   Mark Johnson <mark.johnson@it.ox.ac.uk>
To:     Kristian Kocher <kristian.kocher@it.ox.ac.uk>
CC:     James Wilson <james.wilson@it.ox.ac.uk>

Hi Kristian,
With reference to this RT ticket[1], it seems that we've got an encoding
issue with ORDS.  I tried uploading the file containing the Greek
characters (after fixing the CSV delimiters) to a database in my local
development environment, and found that the characters displayed
correctly.  However, I then repeated the process on dev.ords and found
that they didn't; see the attached screenshot for comparison.

My immediate thought is that the JVM or Tomcat (or even Postgres?) on
the sysdev server might be configured to use the wrong encoding.  Could
you investigate and see if you can find the root of the problem?  I've
also attached the fixed CSV file in case it's useful.

Cheers
Mark

[1] https://rt.oucs.ox.ac.uk/Ticket/Display.html?id=2669952
--
Mark Johnson
Development Manager
OSS Watch http://oss-watch.ac.uk

Original comment by jajwil...@gmail.com on 17 Feb 2015 at 11:10

GoogleCodeExporter commented 9 years ago
I was unable to reproduce this on my dev system, and Mark was also unable to
reproduce it on his. I also tried uploading a file to Mark's dev system and still
could not reproduce the problem. However, it was present on Dev, suggesting a
configuration problem. I have now fixed this problem in code. However, I have a
feeling this is a can of worms waiting to spill out.

Please find some Chinese CSV data to test with. It may be that we shall need to
offer the user a means of specifying their own encoding.

Original comment by thest...@gmail.com on 23 Feb 2015 at 11:30
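
As a rough illustration of what letting the user specify their own encoding could look like, here is a minimal Python sketch. It is not the fix that was applied to ORDS, which this thread does not describe. The function name parse_csv, the UTF-8 default, and the sample rows are all hypothetical.

```python
import csv
import io
from typing import List


def parse_csv(raw: bytes, encoding: str = "utf-8") -> List[List[str]]:
    """Parse CSV bytes using an explicit, caller-supplied encoding.

    Decoding is strict, so bytes that are invalid in the chosen encoding
    raise UnicodeDecodeError instead of being silently garbled.
    """
    text = raw.decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical upload: Chinese CSV data saved as GB18030 rather than UTF-8.
data = "姓名,城市\n李雷,北京\n".encode("gb18030")
rows = parse_csv(data, encoding="gb18030")
print(rows)
```

Passing the encoding explicitly means the result no longer depends on the server's default, at the cost of asking the user (or some detection step) to supply it.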

GoogleCodeExporter commented 9 years ago
No Chinese to hand, but it definitely now seems to be working with other Unicode
characters (such as em-dashes) that weren't working before. Will keep an eye
out for more interesting test cases.

Original comment by jajwil...@gmail.com on 23 Feb 2015 at 2:02