The MySQL layer hasn't changed for UTF-8 treatment for quite some time; all
string fields are created with CHARACTER SET utf8.
If you are able, please use a browser (the one on the phone should work) and
browse to ODK Aggregate's data upload page (in RC1, it is on the FormsList
subtab; in RC2, it is on the SubmissionAdmin subtab) to upload a submission.
This would help identify whether it is a regression in ODK Collect 1.1.7 (are
you using RC2 or RC1?) or a regression in Aggregate.
Original comment by mitchellsundt@gmail.com
on 6 Oct 2011 at 6:15
Original comment by mitchellsundt@gmail.com
on 6 Oct 2011 at 6:16
I confirmed that uploading using the website's upload seems to preserve UTF-8
characters. Looks like an ODK Collect regression.
Also, when I tried to upload a file to my ODK Aggregate, I got this exception
and a completely incorrect error code (500) reported back to the user:
/System.err( 2319): java.lang.NullPointerException
/System.err( 2319): at org.odk.collect.android.tasks.InstanceUploaderTask.doInBackground(InstanceUploaderTask.java:212)
/System.err( 2319): at org.odk.collect.android.tasks.InstanceUploaderTask.doInBackground(InstanceUploaderTask.java:1)
/System.err( 2319): at android.os.AsyncTask$2.call(AsyncTask.java:185)
/System.err( 2319): at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:306)
Original comment by mitchellsundt@gmail.com
on 6 Oct 2011 at 6:55
Adding Aggregate tag, as it is still unclear how to reproduce the actual issue.
Upload to opendatakit.appspot.com works.
Original comment by mitchellsundt@gmail.com
on 6 Oct 2011 at 6:58
Alex -- please also confirm that the file saved on the Android shows UTF-8
characters when you view it in a UTF-8 savvy editor.
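If no UTF-8 aware editor is at hand, the same check can be roughly approximated in a few lines of Python. This is only a sketch: describe_utf8 is a hypothetical helper name, and it assumes the submission file has been copied off the device and read in as raw bytes.

```python
def describe_utf8(data: bytes) -> str:
    """Classify raw file bytes as 'invalid', 'ascii-only' or 'utf8-non-ascii'."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8 at all.
        return "invalid"
    if all(ord(ch) < 128 for ch in text):
        # Valid, but nothing beyond ASCII: any Amharic or Cyrillic text
        # was already lost (or never present) before the file was saved.
        return "ascii-only"
    return "utf8-non-ascii"
```

A result of "ascii-only" for a submission that should contain Amharic or Cyrillic text would suggest the characters were lost on the device, before any upload took place.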
Original comment by mitchellsundt@gmail.com
on 6 Oct 2011 at 6:59
Hi,
Just been having another look at this. I created fresh installs of Aggregate
RC2 on both appspot and a tomcat/mysql version.
When I submit a form with either Amharic or Cyrillic data in the form fields,
on the appspot version these display fine when I view in ODK Aggregate. However
with the tomcat/mysql version I still only see the question marks.
I made a dump of the mysql database and this does show that the fields/tables
etc are created with utf8 character encoding.
But it seems that it's not an issue with ODK Collect, as I was using the same
install of ODK Collect to submit to each of the Aggregate instances.
Hope that helps you recreate the issue. Please let me know if there is
something else I should check.
Cheers,
Alex
Original comment by AlextLit...@gmail.com
on 10 Oct 2011 at 2:39
I also tried submitting the form through a manual submission... see the
attached form and 2 submission files (one Amharic and one Cyrillic)... on my
appspot these both upload fine, but I still get the question marks with the
mysql/tomcat version.
Cheers,
Alex
Original comment by AlextLit...@gmail.com
on 10 Oct 2011 at 2:46
Original comment by carlhart...@gmail.com
on 10 Oct 2011 at 4:01
I don't have trouble displaying these on a local instance of RC2 with MySQL
(I'm seeing somewhat Cyrillic text in one post and a username beginning with an
upward-trending w in a script-like font in the second).
Can you try other browsers (Firefox, Safari, IE) and see if it occurs on only
some browsers?
If this does occur on Firefox, can you install the HttpFox add-in, start it,
and browse to the Submissions page?
Then look at the .../aggregateui/submissionservice POST request displayed in
HttpFox. Verify that it has a header for
Content-Type: text/x-gwt-rpc; charset=utf-8
then verify that the POST Data response type is
text/x-gwt-rpc; charset=utf-8
and then verify that the Content shows the Amharic characters (or not). At
least one of these should be wrong.
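The header check described above can also be done on a copied header value rather than by eye. A minimal sketch in Python, using only the standard library; charset_of is a hypothetical helper name, and the header string is assumed to have been copied out of HttpFox by hand.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns None when
    # the header carries no charset parameter.
    return msg.get_content_charset()
```

For the value expected here, charset_of("text/x-gwt-rpc; charset=utf-8") gives "utf-8"; a result of None means the charset parameter is missing from the header.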
Original comment by mitchellsundt@gmail.com
on 10 Oct 2011 at 5:53
Also, with your second update, you said you looked inside the MySQL database
table and saw it was created with UTF-8 character set. Does that mean you saw
the Amharic characters in that table, or is it still an issue of the upload not
inserting the Amharic into the database?
Also please let me know what browser version you're using; I'm using Firefox
5.0.
Original comment by mitchellsundt@gmail.com
on 10 Oct 2011 at 6:17
Thanks for all your help, I think I've got this sorted out now. The cause was
my.cnf not having the correct settings. I needed to add the
following to /etc/mysql/my.cnf (under the [mysqld] section):
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci
and under [mysql]:
default-character-set=utf8
After I'd added these settings and restarted mysql the submissions are now
displaying the Amharic and Cyrillic characters fine.
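A plausible mechanism for the question marks, assuming the connection was falling back to latin1 (the thread does not confirm which character set was actually in effect): when text is converted to a character set that cannot represent a character, that character is replaced with "?". The Python sketch below only mimics that substitution; through_latin1 is a hypothetical helper and no MySQL code is involved.

```python
def through_latin1(text: str) -> str:
    """Round-trip text through latin-1, replacing unrepresentable characters."""
    # errors="replace" substitutes '?' for every character that has
    # no latin-1 equivalent, which is the lossy step being illustrated.
    return text.encode("latin-1", errors="replace").decode("latin-1")
```

Under this assumption, through_latin1("Привет") yields six question marks while plain ASCII text passes through unchanged, which would match forms whose Latin-script fields looked fine while Amharic and Cyrillic fields did not.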
For info I'm running this on the default install of mysql on Ubuntu 10.10 so
not quite sure why mysql on this OS doesn't have utf8 as the default. Anyway
thanks for the help and comments - hope this message will help anyone else
having similar issues.
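For anyone checking their own my.cnf, a rough way to list the server character set options is sketched below in Python. It is not MySQL's own option parser: mysqld_charset_settings is a hypothetical helper, it reads the file contents from a string, and it does not follow directives such as !includedir.

```python
import configparser


def mysqld_charset_settings(cnf_text: str) -> dict:
    """Return the charset-related options found under [mysqld]."""
    parser = configparser.ConfigParser(
        strict=False,         # my.cnf may repeat an option such as init_connect
        allow_no_value=True,  # my.cnf allows bare flags with no value
        interpolation=None,   # values may contain '%'
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return {}
    wanted = {"character-set-server", "collation-server"}
    found = {}
    for key, value in parser.items("mysqld"):
        name = key.replace("_", "-")  # MySQL accepts either separator
        if name in wanted:
            found[name] = value
    return found
```

An empty result means neither option is set under [mysqld], in which case the server falls back to its compiled-in defaults.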
The issue isn't an ODK Aggregate or Collect error, so can be closed now.
Cheers,
Alex
Original comment by a...@alexlittle.net
on 10 Oct 2011 at 10:11
Should document this as a potential issue on Linux deployments.
Original comment by mitchellsundt@gmail.com
on 19 Oct 2011 at 10:20
Original comment by mitchellsundt@gmail.com
on 19 Oct 2011 at 10:31
Updated Tomcat Install documentation to include mention of setting the default
character set and collation of the database for UTF-8.
Original comment by mitchellsundt@gmail.com
on 19 Apr 2012 at 10:55
Original issue reported on code.google.com by
a...@alexlittle.net
on 5 Oct 2011 at 5:24