As I discussed a few weeks ago with Father Chrysostomos, I volunteered to submit a patch converting Latin-1 files in the Perl source tree to UTF-8. I've just finished the patch (attached).
For a lot of the files, this was straightforward; most of the changes involved either copyright symbols or Lord of the Rings quotations in header comments.
I do have doubts about some of the changes I made, and I encourage anyone who's more familiar with this than I am to review them. In particular, I'm not sure about the files under the cpan/ directory; I suppose there needs to be some coordination with the corresponding sources in CPAN itself, and I don't know how that works. Perhaps it would be easier to leave those changes out.
There are a number of files that have encodings other than UTF-8 that I didn't touch, mostly because there seem to be specific requirements to use those encodings. The files are listed in the attached file "skipped.txt".
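For readers who want to see what such a conversion amounts to, here is a minimal sketch in Python. It is not the script used to produce the attached patch; the function name `latin1_to_utf8` is hypothetical. It assumes that any input which is not already valid UTF-8 is Latin-1, which is exactly why files in other encodings (EUC-JP, Shift-JIS, KOI8-R and so on) have to be excluded by an explicit skip list rather than detected automatically.

```python
from typing import Optional


def latin1_to_utf8(raw: bytes) -> Optional[bytes]:
    """Return UTF-8 bytes for Latin-1 input, or None if nothing needs to change.

    Assumption: input that fails to decode as UTF-8 is Latin-1. Files in any
    other legacy encoding must be kept away from this function by the caller.
    """
    try:
        raw.decode("utf-8")
        return None  # pure ASCII or already valid UTF-8: leave untouched
    except UnicodeDecodeError:
        pass
    # Every byte value is a valid Latin-1 code point, so this decode cannot fail.
    return raw.decode("latin-1").encode("utf-8")


# Example: the byte 0xED is "í" in Latin-1 and becomes the two bytes C3 AD in UTF-8.
converted = latin1_to_utf8(b'"The Palant\xedr"')
```

Because Latin-1 decoding never fails, the function cannot tell Latin-1 apart from other 8-bit encodings; the result should always be reviewed by eye, as was done for this patch.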
This patch did cause one test failure, in porting/cmp_version. Running the test by itself shows the following:
# diff --git a/ext/attributes/attributes.xs b/ext/attributes/attributes.xs
# index 24f5f61..3900c36 100644
# --- a/ext/attributes/attributes.xs
# +++ b/ext/attributes/attributes.xs
# @@ -12,7 +12,7 @@
# * 'Perilous to us all are the devices of an art deeper than we possess
# * ourselves.' --Gandalf
# *
# - * [p.597 of _The Lord of the Rings_, III/xi: "The Palant�r"]
# + * [p.597 of _The Lord of the Rings_, III/xi: "The Palantír"]
# */
#
# #define PERL_NO_GET_CONTEXT
not ok 25 - ext/attributes/attributes.pm
It's complaining about the change in the "attributes/attributes.xs" file, which should be resolved if this patch is committed.
-- Keith Thompson <Keith.S.Thompson@gmail.com>
README.cn
README.jp
README.ko
cpan/CGI/t/html.t
cpan/CGI/t/upload_post_text.txt
cpan/Encode/lib/Encode/CJKConstants.pm
cpan/Encode/lib/Encode/JP/H2Z.pm
cpan/Encode/t/Mod_EUCJP.pm
cpan/Encode/t/at-cn.t
cpan/Encode/t/at-tw.t
cpan/Encode/t/big5-eten.enc
cpan/Encode/t/big5-hkscs.enc
cpan/Encode/t/enc_data.t
cpan/Encode/t/enc_module.enc
cpan/Encode/t/enc_module.t
cpan/Encode/t/gb2312.enc
cpan/Encode/t/jisx0201.enc
cpan/Encode/t/jisx0208.enc
cpan/Encode/t/jisx0212.enc
cpan/Encode/t/jperl.t
cpan/Encode/t/ksc5601.enc
cpan/Encode/t/mime_header_iso2022jp.t
cpan/PerlIO-via-QuotedPrint/t/QuotedPrint.t
cpan/Pod-Parser/lib/Pod/Checker.pm
cpan/Pod-Simple/t/corpus/8859_7.pod
cpan/Pod-Simple/t/corpus/cp1256.txt
cpan/Pod-Simple/t/corpus/fet_cont.txt
cpan/Pod-Simple/t/corpus/fet_dup.txt
cpan/Pod-Simple/t/corpus/iso6.txt
cpan/Pod-Simple/t/corpus/koi8r.txt
cpan/Pod-Simple/t/corpus/laozi38.txt
cpan/Pod-Simple/t/corpus/laozi38b.txt
cpan/Pod-Simple/t/corpus/laozi38p.pod
cpan/Pod-Simple/t/corpus/lat1fr.txt
cpan/Pod-Simple/t/corpus/lat1frim.txt
cpan/Pod-Simple/t/corpus/pasternak_cp1251.txt
cpan/Pod-Simple/t/corpus/s2763_sjis.txt
cpan/Pod-Simple/t/corpus/thai_iso11.txt
cpan/Pod-Simple/t/corpus2/fiqhakbar_iso6.txt
cpan/Pod-Simple/t/encod02.t
cpan/Pod-Simple/t/pulltitl.t
cpan/Pod-Simple/t/testlib1/Zonk/Pronk.pm
cpan/Sys-Syslog/win32/PerlLog.mc
cpan/Unicode-Collate/t/loc_test.t
cpan/podlators/t/man.t
dist/Storable/t/utf8hash.t
lib/utf8.t
t/io/utf8.t
t/lib/locale/latin1
t/lib/warnings/utf8
t/op/lc.t
t/op/utfhash.t
t/uni/greek.t
t/uni/latin2.t
t/uni/tr_eucjp.t
t/uni/tr_sjis.t
On Wed Sep 07 19:11:35 2011, keithsthompson@gmail.com wrote:
As I discussed a few weeks ago with Father Chrysostomos, I volunteered to submit a patch converting Latin-1 files in the Perl source tree to UTF-8. I've just finished the patch (attached).
For a lot of the files, this was straightforward; most of the changes involved either copyright symbols or Lord of the Rings quotations in header comments.
I do have doubts about some of the changes I made, and I encourage anyone who's more familiar with this than I am to review them. In particular, I'm not sure about the files under the cpan/ directory; I suppose there needs to be some coordination with the corresponding sources in CPAN itself, and I don't know how that works. Perhaps it would be easier to leave those changes out.
For those files\, CPAN is upstream. They usually remain untouched between upgrades to newer versions. Any changes have to be made to the CPAN distributions first. We sometimes make exceptions for test failures or modules used to bootstrap perl itself.
There are a number of files that have encodings other than UTF-8 that I didn't touch, mostly because there seem to be specific requirements to use those encodings. The files are listed in the attached file "skipped.txt".
I agree that those should be skipped.
This patch did cause one test failure, in porting/cmp_version. Running the test by itself shows the following: ... It's complaining about the change in the "attributes/attributes.xs" file, which should be resolved if this patch is committed.
It’s complaining that it changed without the version number in ext/attributes/attributes.pm changing.
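The rule being enforced can be sketched roughly as follows. This is an illustration in Python of the idea only, not the logic of the actual porting test; the function names, the regular expression, and the version numbers in the example are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern: matches simple assignments such as `our $VERSION = 0.15;`
# or `$VERSION = '1.02';`. Real modules can declare versions in other ways.
VERSION_RE = re.compile(r"\$VERSION\s*=\s*[\"']?([0-9._]+)")


def extract_version(pm_source: str) -> Optional[str]:
    """Return the first $VERSION value found in a module's source, if any."""
    match = VERSION_RE.search(pm_source)
    return match.group(1) if match else None


def needs_version_bump(old_pm: str, new_pm: str, files_changed: bool) -> bool:
    """True when a module's files changed but its $VERSION stayed the same."""
    return files_changed and extract_version(old_pm) == extract_version(new_pm)


# Example with made-up version numbers: attributes.xs changed, attributes.pm did not.
old_pm = "our $VERSION = 0.15;"
print(needs_version_bump(old_pm, old_pm, files_changed=True))
print(needs_version_bump(old_pm, "our $VERSION = 0.16;", files_changed=True))
```

Under this reading, the failure is cleared by bumping the version in attributes.pm, not by committing the .xs change alone.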
I’ve applied your patch, but without the cpan/ changes and without the NamesList.txt changes (that file is from the Unicode Consortium and is simply plopped in as it is), as cdad3b53476, followed by a version bump for attributes.pm in commit 83e49ee07c6.
Thank you.
The RT System itself - Status changed from 'new' to 'open'
@cpansprout - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#98666 (status was 'resolved')
Searchable as RT98666$