Closed p5pRT closed 17 years ago
perldoc perluniintro currently says:

· How Do I Detect Data That’s Not Valid In a Particular Encoding?

Use the "Encode" package to try converting it. For example,

    use Encode 'decode_utf8';
    if (decode_utf8($string_of_bytes_that_I_think_is_utf8)) {
        # valid
    } else {
        # invalid
    }
This does not match my tests or the Encode documentation, which states that malformed characters are replaced with a substitution character; decode_utf8 does not return true or false depending on validity.
    % perl -e '$n="\x{c5}";use Encode;print decode_utf8($n)?"valid":"invalid";'
    valid
So you need to use a CHECK function other than the default.
    % perl -e '$n="\x{c5}";use Encode;eval {decode_utf8($n, Encode::FB_CROAK)}; print $@?"invalid":"valid";'
    invalid
    % perl -e '$n="\x{c3}\x{85}";use Encode;eval {decode_utf8($n, Encode::FB_CROAK)}; print $@?"invalid":"valid";'
    valid
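For use in a script rather than a one-liner, the same check could be wrapped in a small helper. This is only a sketch: is_valid_utf8 is a hypothetical name of my own, not something Encode provides, and it assumes the input is a plain byte string like the ones in the tests above.

```perl
use strict;
use warnings;
use Encode;

# Hypothetical helper (not part of Encode): true if $bytes decodes
# cleanly as UTF-8, false otherwise.
sub is_valid_utf8 {
    # With a CHECK value set, Encode may modify its input in place,
    # so work on a copy of the caller's string.
    my ($bytes) = @_;

    # FB_CROAK makes decode_utf8 die on malformed input; eval traps
    # the exception and yields undef in that case.
    return defined eval { decode_utf8($bytes, Encode::FB_CROAK()) };
}

# The same two byte strings as in the one-liners above.
for my $bytes ("\xc5", "\xc3\x85") {
    printf "%vX: %s\n", $bytes, is_valid_utf8($bytes) ? 'valid' : 'invalid';
}
```

Testing the result of eval with defined, rather than looking at $@ afterwards, avoids relying on $@ staying untouched between the eval and the check.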
I don't think it is relevant for a documentation bug. :) But just in case, here is my perlbug -d output:
Site configuration information for perl v5.8.8:
Configured by Debian Project at Wed Dec 6 23:17:41 UTC 2006.
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.18.3, archname=i486-linux-gnu-thread-multi
    uname='linux saens 2.6.18.3 #1 smp sat nov 25 13:39:52 est 2006 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20061115 (prerelease) (Debian 4.1.1-20)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
Locally applied patches:
@INC for perl v5.8.8:
    /etc/perl
    /usr/local/lib/perl/5.8.8
    /usr/local/share/perl/5.8.8
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    .
Environment for perl v5.8.8:
    HOME=/home/dkr
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/sbin:/sbin:/usr/bin:/bin:/home/dkr/bin:/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/tcsh
--
_.,-*~`^'~*-,._ Danny Rathjens _.,-*~`^'~*-,._
FireCast: Rock solid kiosk software: http://www.wirespring.com/
Thanks, I've reworked the docs according to your suggestion, as change #31462 to bleadperl.
The RT System itself - Status changed from 'new' to 'open'
@rgs - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#43287 (status was 'resolved')
Searchable as RT43287