Open phinjensen opened 6 years ago
Hello, I have a database wherein all char columns are UTF-8, for exmaple this (used later in the Perl codes as well):
printf "select d02name::bytea from d02ben where d02bnr = '00001048313' ;\n" | psql -Usisis -dpos71 -t
\x50c3a46461676f67697363686520486f6368736368756c65205765696e67617274656e2020202020
^^^^ German Umlaut ä in UTF-8
The used Perl code is:
#!/usr/local/bin/perl
my $PGDB = 'dbi:Pg:dbname=pos71;host=127.0.0.1';
my $PGDB_USER = 'sisis';
my $PGDB_PASS = 'sisis123';
use DBD::Pg;
sub onevaluesql($) {
my ($command) = @_;
my $sth=$dbh->prepare($command);
$sth->execute();
my ($value) = $sth->fetchrow_array();
$sth->finish();
return $value;
}
$dbh = DBI->connect($PGDB, $PGDB_USER, $PGDB_PASS,
{ pg_utf8_strings => 1,
AutoCommit => 0,
RaiseError => 0,
PrintError => 0,
}
)
|| die "Couldn't connect to $PGDB as user $PGDB_USER: $DBI::errstr\n";
$dbh->do("SET client_encoding TO UTF8");
print "DBI is version $DBI::VERSION, DBD::Pg is version $DBD::Pg::VERSION\n";
print "client_encoding=" . onevaluesql("show client_encoding;") . ", server_encoding=" . onevaluesql("show server_encoding;") . "\n";
$sth=$dbh->prepare(
"select d02name from d02ben where d02bnr = '00001048313'")
or die "parse error\n".${DBI::errstr}."\n";
$sth->execute
or die "exec error\n".${DBI::errstr}."\n";
while ( (my @row = $sth->fetchrow_array) ) {
my $HexStr = unpack("H*", $row[0]);
print "HexStr: " . $HexStr . "\n";
print "$row[0]\n";
}
$sth->finish;
And when I run this, the above column comes out coded in ISO-8859-1, note the hex string 50e4... :
./utf8-01.pl
DBI is version 1.642, DBD::Pg is version 3.8.0
client_encoding=UTF8, server_encoding=UTF8
HexStr: 50e46461676f67697363686520486f6368736368756c65205765696e67617274656e2020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020
Pagogische Hochschule Weingarten
I watched with the Linux strace the communicatio between the client (DBD::Pg) and the PostgreSQL server 11.4, and the server delivers the string in question as UTF-8 to the client:
...
recvfrom(3, "T\0\0\0 \0\1d02name\0\0\1\313\237\0\3\0\0\4\22\377\377\0\0\0|\0\0D\0\0\0\203\0\1\0\0\0yP\303\244dagogische Hochschule Weingarten C\0\0\0\rSELECT 1\0Z\0\0\0\5T", 16384, 0, NULL, NULL) = 185
write(1, "HexStr: 50e46461676f67697363686520486f6368736368756c65205765696e67617274656e2020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020\n", 249) = 249
write(1, "P\344dagogische Hochschule Weingarten \n", 121) = 121
What I do wrong? Thanks.
Comments for https://www.endpointdev.com/blog/2011/06/dbdpg-utf-8-for-postgresql/ By Greg Sabino Mullane
To enter a comment: