Open GoogleCodeExporter opened 9 years ago
Hmm, might be an encoding problem.
Even though the encoding has been correctly defined in the mysqlimport command,
there could still be a problem somewhere else.
If utf8 is not the standard encoding on your system, you might have to run the
DataMachine with the -Dfile.encoding=utf8 parameter (also see:
http://code.google.com/p/jwpl/wiki/DataMachine).
Also, you should check whether you created the database using the command:
CREATE DATABASE [DB_NAME] DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
If this does not help, please extract the lines from the data file which cause
these warnings and post them here.
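Pulling individual lines out of a data file this large can be awkward in an editor, so here is a minimal Python sketch of one way to do it. It is not part of JWPL. The function name `extract_lines` and the `max_chars` cut-off are made up for illustration, and it assumes that each record sits on exactly one newline-terminated line. It reads the file as raw bytes, so it also reports whether each requested line is valid UTF-8.

```python
def extract_lines(path, line_numbers, max_chars=200):
    """Return {line_number: (is_valid_utf8, tail_of_line)} for 1-based line numbers.

    Hypothetical helper: only the last `max_chars` characters of each line are
    kept, because a single line may hold a whole article.
    """
    wanted = set(line_numbers)
    result = {}
    if not wanted:
        return result
    last = max(wanted)
    # Binary mode, so that invalid byte sequences are detected instead of
    # being silently replaced while reading.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            if number in wanted:
                try:
                    text = raw.decode("utf-8")
                    valid = True
                except UnicodeDecodeError:
                    # Keep the line readable, but flag it as not valid UTF-8.
                    text = raw.decode("utf-8", errors="replace")
                    valid = False
                result[number] = (valid, text.rstrip("\r\n")[-max_chars:])
            if number >= last:
                break  # no need to scan the rest of the file
    return result
```

For example, `extract_lines("page.txt", [1, 2])` would return the tails of the first two lines together with a flag that shows whether each one decoded cleanly as UTF-8.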
Original comment by oliver.ferschke
on 16 May 2012 at 10:16
Thanks, but I have already set the character encoding explicitly, and I can
confirm that I used the "create" statement as you said.
I extracted the line that caused the error (Error 1406... above) and attached
it as a screenshot. It is extremely long, since each line in "page.txt" stores
a single Wikipedia article. The screenshot shows the tail of the first line,
and I have highlighted the boundary with the second line in red.
I'm not sure how useful this is, since the last field, "isDisambiguation", is a
"bit" datatype, and it doesn't seem to display properly, as you can see.
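Since a "bit" value is usually not a printable character, a hex dump of the last field can be easier to read than a screenshot. The following is a minimal Python sketch, not JWPL code. The helper name `last_field_bytes` is made up, and the tab delimiter is an assumption (it is the mysqlimport default), so adjust it if the file was written with a different field separator.

```python
def last_field_bytes(raw_line, delimiter=b"\t"):
    """Return the last field of a raw record as (bytes, hex string).

    Hypothetical helper: `raw_line` is one line read in binary mode, and
    `delimiter` is assumed to be the field separator used in the data file.
    """
    stripped = raw_line.rstrip(b"\r\n")
    # Split once from the right so the long article text is never scanned twice.
    field = stripped.rsplit(delimiter, 1)[-1]
    return field, field.hex()


def first_line_last_field(path, delimiter=b"\t"):
    """Convenience wrapper: inspect the last field of the file's first line."""
    with open(path, "rb") as handle:
        return last_field_bytes(handle.readline(), delimiter)
```

Calling `first_line_last_field("page.txt")` would show exactly which bytes make up the final field of the first record, which may help to tell whether the value itself or only its display is the problem.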
Original comment by ziqizhan...@googlemail.com
on 16 May 2012 at 10:55
Original issue reported on code.google.com by
ziqizhan...@googlemail.com
on 15 May 2012 at 3:47