Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Patch: Update AUTHORS and convert from Latin-1 to UTF-8 #11575

Closed p5pRT closed 13 years ago

p5pRT commented 13 years ago

Migrated from rt.perl.org#96924 (status was 'resolved')

Searchable as RT96924$

p5pRT commented 13 years ago

From keithsthompson@gmail.com

It's not 100% clear that you'll want the AUTHORS file to be in UTF-8 rather than Latin-1. In my opinion it should be, both because UTF-8 is more flexible, and because there's at least one author whose name cannot be represented in Latin-1. I've also updated my own e-mail address (the old and new addresses are both valid, but I prefer to use my gmail address for this kind of thing). If you don't want to deal with the character encoding, please update that line.

Incidentally, there are a number of files in the Perl source tree that are encoded in Latin-1 (ISO-8859-1), and a number that are in UTF-8. Running "file" on everything shows 70 Latin-1 files and 58 UTF-8 files, but that may be incomplete. Copyright symbols probably account for a lot of them.

If there's a consensus that all the Latin-1 files should be converted to UTF-8, I'll volunteer to do the conversion and submit a patch.

-- Keith Thompson <Keith.S.Thompson@gmail.com>
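The survey and conversion described above can be sketched in a few lines. This is only an illustration in Python, not the tooling used for the patch: the helper names `classify` and `to_utf8` are hypothetical, and the "latin-1" label is a guess, because any byte sequence decodes as Latin-1.

```python
def classify(data: bytes) -> str:
    """Roughly classify raw bytes as 'ascii', 'utf-8', or 'latin-1'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Every byte sequence is valid Latin-1, so this is only a fallback guess.
        return "latin-1"


def to_utf8(data: bytes) -> bytes:
    """Return UTF-8 bytes, re-encoding only input classified as Latin-1."""
    if classify(data) == "latin-1":
        return data.decode("latin-1").encode("utf-8")
    return data


latin1_line = "Andreas K\u00f6nig\n".encode("latin-1")
utf8_line = "Jirka Hru\u0161ka\n".encode("utf-8")

print(classify(latin1_line))
print(classify(utf8_line))
print(to_utf8(latin1_line) == "Andreas K\u00f6nig\n".encode("utf-8"))
```

Checking for valid UTF-8 before converting matters for a file like AUTHORS, where lines in both encodings can sit side by side.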

p5pRT commented 13 years ago

From keithsthompson@gmail.com

0001-Update-AUTHORS-file-and-convert-from-Latin-1-to-UTF-.patch ```diff From 545313a22b00c462860c454c6406138877a5be85 Mon Sep 17 00:00:00 2001 From: Keith Thompson Date: Sat, 13 Aug 2011 16:08:22 -0700 Subject: [PATCH] Update AUTHORS file and convert from Latin-1 to UTF-8 Update my own e-mail address Fix name for (not representable in Latin-1) as seen on --- AUTHORS | 70 +++++++++++++++++++++++++++++++------------------------------- 1 files changed, 35 insertions(+), 35 deletions(-) diff --git a/AUTHORS b/AUTHORS index 4508af5..75ebdc9 100644 --- a/AUTHORS +++ b/AUTHORS @@ -69,7 +69,7 @@ Ananth Kesari Anders Johnson Andreas Karrer Andreas Klussmann -Andreas König +Andreas König Andreas Marienborg Andreas Schwab Andrei Yelistratov @@ -106,7 +106,7 @@ Artiom Morozov Artur Bergman Arvan Ash Berlin -Ask Bjöern Hansen +Ask Bjöern Hansen Audrey Tang Axel Boldt Barrie Slaymaker @@ -229,7 +229,7 @@ Craig DeForest Craig Milo Rogers Curtis Poe Curtis Jewell -Dagfinn Ilmari Mannsåker +Dagfinn Ilmari MannsÃ¥ker Dale Amon Damian Conway Damon Atkins @@ -246,7 +246,7 @@ Daniel Chetlin Daniel Frederick Crisman Daniel Grisinger Daniel Lieberman -Daniel Muiño +Daniel Muiño Daniel P. Berrange Daniel S. Lewart Daniel Yacob @@ -360,8 +360,8 @@ Frank Tobin Frank Wiegand Franklin Chen Franz Fasching -François Désarménien -Fréderic Chauveau +François Désarménien +Fréderic Chauveau Fyodor Krasnov G. Del Merritt Gabe Schaffer @@ -440,7 +440,7 @@ Iain Truskett Ian Goodacre Ian Maloney Ian Phillipps -Ignasi Roca Carrió +Ignasi Roca Carrió Igor Sutton Ilmari Karonen Ilya Martynov @@ -450,7 +450,7 @@ Ilya Zakharevich Inaba Hiroto Indy Singh Ingo Weinhold -Ingy döt Net +Ingy döt Net insecure Irving Reid Ivan Kurmanov @@ -508,7 +508,7 @@ Jerry D. 
Hedden Jesse Glick Jesse Luehrs Jesse Vincent -Jesús Quiroga +Jesús Quiroga Jim Anderson Jim Avera Jim Balter @@ -517,7 +517,7 @@ Jim Meyering Jim Miner Jim Richardson Jim Schneider -Jirka HruÅ¡ka +Jirka HruÅ¡ka Joachim Huober Jochen Wiedmann Jody Belka @@ -588,8 +588,8 @@ juna Jungshik Shin Justin Banks John E. Malmberg -Jörg Walter -José Pedro Oliveira +Jörg Walter +José Pedro Oliveira Ka-Ping Yee Kaoru Maeda Karl Glazebrook @@ -598,10 +598,10 @@ Karl Simon Berg Karl Williamson Karsten Sperling Kaveh Ghazi -Kay Röpke +Kay Röpke KAWAI Takanori Keith Neufeld -Keith Thompson +Keith Thompson Ken Estes Ken Fox Ken Hirsch @@ -637,7 +637,7 @@ Larry Shatzer Larry W. Virden Larry Wall Lars Hecking -Lars D¿¿¿¿¿¿ ¿¿¿ +Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯 Laszlo Molnar Larwan Berke Leif Huhn @@ -659,13 +659,13 @@ Lubomir Rintel Lupe Christoph Luther Huffman Maik Hentsche -Major Sébastien +Major Sébastien Makoto MATSUSHITA Malcolm Beattie Manuel Valente Marc Lehmann Marc Paquette -Marcel Grünauer +Marcel Grünauer Marcus Holland-Moritz Marek Rouchal Mark A Biggar @@ -781,7 +781,7 @@ Neale Ferguson Neil Bowers Neil Watkiss Nicholas Clark -Nicholas Oxhøj +Nicholas Oxhøj Nicholas Perez Nick Cleaton Nick Duffek @@ -837,7 +837,7 @@ Paul Rogers Paul Saab Paul Schinder Paul Szabo -Pavel Ka¿kovský +Pavel Ka¿kovský Pavel Zakouril Pedro Felipe Horrillo Guerra Per Einar Ellefsen @@ -857,7 +857,7 @@ Peter O'Gorman Peter Prymmer Peter Rabbitson Peter Scott -Peter Valdemar Mørch +Peter Valdemar Mørch Peter van Heusden Peter Wolfe Peter E. Yee @@ -907,7 +907,7 @@ Richard Hitt Richard Kandarian Richard L. England Richard L. Maus, Jr. -Richard Möhn +Richard Möhn Richard Ohnemus Richard Soderberg Richard Yeh @@ -942,13 +942,13 @@ Russ Allbery Russell Fulton Russell Mosemann Ryan Herbert -Salvador Fandiño +Salvador Fandiño Salvador Ortiz Garcia Sam Kimbrel Sam Tregar Sam Vilain Samuel Thibault -Samuli Kärkkäinen +Samuli Kärkkäinen Schuyler Erle Scott A Crosby Scott Bronson @@ -964,11 +964,11 @@ Sean M. 
Burke Sean Robinson Sean Sheedy Sebastian Wittmeier -Sébastien Aperghis-Tramoni +Sébastien Aperghis-Tramoni Sebastien Barre Sebastian Schmidt Sebastian Steinlechner -Sérgio Durigan Júnior +Sérgio Durigan Júnior Shawn Shawn M Moore Sherm Pendley @@ -991,9 +991,9 @@ Spider Boardman Spiros Denaxas Sreeji K Das Stas Bekman -Steffen Müller +Steffen Müller Steffen Ullrich -Stéphane Payrard +Stéphane Payrard Stepan Kasal Stephane Payrard Stephanie Beals @@ -1032,10 +1032,10 @@ Tels Teun Burgers Thad Floryan Thomas Bowditch -Thomas Conté +Thomas Conté Thomas Dorner Thomas Kofler -Thomas König +Thomas König Thomas Pfau Thomas Wegner Thorsten Glaser @@ -1072,7 +1072,7 @@ Tony Cook Tony Sanders Tor Lillqvist Torsten Foertsch -Torsten Schönfeld +Torsten Schönfeld Trevor Blackwell Tuomas J. Lukka Tsutomu IKEGAMI @@ -1083,7 +1083,7 @@ Ulrich Pfeifer Vadim Konovalov Valeriy E. Ushakov Vernon Lyon -Ville Skyttä +Ville Skyttä Vincent Pit Vishal Bhatia Vlad Harchev @@ -1098,7 +1098,7 @@ Warren Jones Wayne Berke Wayne Scott Wayne Thompson -Wilfredo Sánchez +Wilfredo Sánchez William J. Middleton William Mann William Middleton @@ -1106,7 +1106,7 @@ William R Ward William Setzer William Williams William Yardley -Winfried König +Winfried König Wolfgang Laun Wolfram Humann Xavier Noria @@ -1121,6 +1121,6 @@ Yuval Kogman Yves Orton Zachary Miller Zefram -Zsbán Ambrus +Zsbán Ambrus Zbynek Vyskovsky -Ævar Arnfjörð Bjarmason +Ævar Arnfjörð Bjarmason -- 1.7.4.5 ```
p5pRT commented 13 years ago

From @cpansprout

On Sat Aug 13 16:22:08 2011, keithsthompson@gmail.com wrote:

> It's not 100% clear that you'll want the AUTHORS file to be in UTF-8 rather than Latin-1. In my opinion it should be, both because UTF-8 is more flexible, and because there's at least one author whose name cannot be represented in Latin-1. I've also updated my own e-mail address (the old and new addresses are both valid, but I prefer to use my gmail address for this kind of thing.) If you don't want to deal with the character encoding, please update that line.

I have applied your patch as 055f85571. I omitted the change to Jirka Hruška, though, as it was already in UTF-8 (which is why I didn’t hesitate to apply the rest of your patch).

Do you have any idea what the missing letter is in Pavel Ka¿kovský’s name?

> Incidentally, there are a number of files in the Perl source tree that are encoded in Latin-1 (ISO-8859-1), and a number that are in UTF-8. Running "file" on everything shows 70 Latin-1 files and 58 UTF-8 files, but that may be incomplete. Copyright symbols probably account for a lot of them.
>
> If there's a consensus that all the Latin-1 files should be converted to UTF-8, I'll volunteer to do the conversion and submit a patch.

I suggested exactly the same change a week ago. I haven’t heard any objections.
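The reason for skipping a line that is already UTF-8 can be shown with a small sketch. It is written in Python purely for illustration and is not part of the patch: converting UTF-8 bytes "from Latin-1" a second time double-encodes them, and the damage can be undone by reversing that step.

```python
name = "Jirka Hru\u0161ka"  # the "s with caron" is U+0161
utf8_bytes = name.encode("utf-8")

# Mistake: treat bytes that are already UTF-8 as Latin-1 and convert again.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")
mojibake = double_encoded.decode("utf-8")

# The single character U+0161 has become two characters, U+00C5 and U+00A1.
assert mojibake == "Jirka Hru\u00c5\u00a1ka"

# Repair: reverse the mistaken step to recover the original text.
repaired = mojibake.encode("latin-1").decode("utf-8")
assert repaired == name

print(utf8_bytes)
print(double_encoded)
```

The repair only works when every character of the damaged text fits in Latin-1, which holds for text produced by this particular mistake.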

p5pRT commented 13 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 13 years ago

@cpansprout - Status changed from 'open' to 'resolved'

p5pRT commented 13 years ago

From keithsthompson@gmail.com

It looks like Pavel's last name is Kankovský. I don't know why an ordinary ASCII 'n' would have been messed up.

<http://www.squid-cache.org/mail-archive/squid-dev/200106/0056.html>

-- Keith

On Sun, Aug 14, 2011 at 2:12 PM, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

> On Sat Aug 13 16:22:08 2011, keithsthompson@gmail.com wrote:
>
>> It's not 100% clear that you'll want the AUTHORS file to be in UTF-8 rather than Latin-1. In my opinion it should be, both because UTF-8 is more flexible, and because there's at least one author whose name cannot be represented in Latin-1. I've also updated my own e-mail address (the old and new addresses are both valid, but I prefer to use my gmail address for this kind of thing.) If you don't want to deal with the character encoding, please update that line.
>
> I have applied your patch as 055f85571. I omitted the change to Jirka Hruška, though, as it was already in UTF-8 (which is why I didn’t hesitate to apply the rest of your patch).
>
> Do you have any idea what the missing letter is in Pavel Ka¿kovský’s name?
>
>> Incidentally, there are a number of files in the Perl source tree that are encoded in Latin-1 (ISO-8859-1), and a number that are in UTF-8. Running "file" on everything shows 70 Latin-1 files and 58 UTF-8 files, but that may be incomplete. Copyright symbols probably account for a lot of them.
>>
>> If there's a consensus that all the Latin-1 files should be converted to UTF-8, I'll volunteer to do the conversion and submit a patch.
>
> I suggested exactly the same change a week ago. I haven’t heard any objections.

-- Keith Thompson <Keith.S.Thompson@gmail.com>

p5pRT commented 13 years ago

From @arc

Keith Thompson <keithsthompson@gmail.com> wrote:

> It looks like Pavel's last name is Kankovský. I don't know why an ordinary ASCII 'n' would have been messed up.
>
> <http://www.squid-cache.org/mail-archive/squid-dev/200106/0056.html>

Pavel's email address as it appears there suggests that he's Czech. The Czech alphabet uses both "ý" and "ň", and the latter looks like it could well appear in his name in place of a plain "n". Which leaves the question of why Pavel wrote his name with a plain "n" in that email. One possibility is that, since "ý" is in ISO 8859-1 but "ň" isn't, he was using a system that supported ISO 8859-1 but not 8859-2 (or one of the other character sets that can represent Czech text accurately).

However, I should clarify that I don't know any Czech; I'm just guessing here. It would probably be better to contact Pavel and ask him how he'd like his name to appear in AUTHORS.

-- Aaron Crane ** http://aaroncrane.co.uk/
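The claim that "ý" is in ISO 8859-1 while "ň" is not can be checked with a short sketch. It uses Python's built-in codecs as a stand-in for whatever system was involved, and the helper name `can_encode` is hypothetical.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Report whether every character of text exists in the given charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+00FD is "y with acute", U+0148 is "n with caron".
results = {
    (f"U+{ord(ch):04X}", enc): can_encode(ch, enc)
    for ch in ("\u00fd", "\u0148")
    for enc in ("iso-8859-1", "iso-8859-2")
}

for key, ok in results.items():
    print(key, ok)
```

Only the pair of U+0148 and ISO 8859-1 comes out as not encodable, which fits the guess that a Latin-1-only system dropped the caron and kept the acute.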

p5pRT commented 13 years ago

From @cpansprout

On Tue Aug 16 07:31:40 2011, arc wrote:

> Keith Thompson <keithsthompson@gmail.com> wrote:
>
>> It looks like Pavel's last name is Kankovský. I don't know why an ordinary ASCII 'n' would have been messed up.
>>
>> <http://www.squid-cache.org/mail-archive/squid-dev/200106/0056.html>
>
> Pavel's email address as it appears there suggests that he's Czech. The Czech alphabet uses both "ý" and "ň", and the latter looks like it could well appear in his name in place of a plain "n". Which leaves the question of why Pavel wrote his name with a plain "n" in that email. One possibility is that, since "ý" is in ISO 8859-1 but "ň" isn't, he was using a system that supported ISO 8859-1 but not 8859-2 (or one of the other character sets that can represent Czech text accurately).
>
> However, I should clarify that I don't know any Czech; I'm just guessing here. It would probably be better to contact Pavel and ask him how he'd like his name to appear in AUTHORS.

Thank you for stating the obvious, which, for some reason, I was too dense to see. I’m forwarding this to him from within RT.

p5pRT commented 13 years ago

From @ppisar

On 2011-08-14, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:

> Do you have any idea what the missing letter is in Pavel Ka¿kovský’s name?

His name is Pavel Kaňkovský. I'm not sure he still owns the address <kan@dcit.cz>; AFAIK he uses a faculty mailbox (http://www.mff.cuni.cz/toISO-8859-2.en/fakulta/struktura/lide/937.htm) now.

-- Petr

p5pRT commented 13 years ago

From kan@dcit.cz

Oh, I am afraid you guys have already invested far more effort into the quest for the correct spelling of my name than I deserve for my minuscule contribution to the development of Perl.

Aaron is right, the third letter of my surname (in all its diacritical glory) is "n with caron" and the caron must have been lost in translation. [I have learned not to let Lotus Notes unleash the horrors of diacritical marks on an unsuspecting audience since then.]

I guess the most efficient remedy (other than dropping me from the list altogether) would be to remove the acute from the final "y" and make the name fully ASCII-ized; I am quite content with that transcription ("What's in a name?"). But if you prefer to keep names as close as possible to their native form, here is the 100% correct spelling of my name in UTF-8:

perl -CO -e 'print "Pavel Ka\x{0148}kovsk\x{00fd}\n"'
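For readers without Perl at hand, here is a rough Python counterpart of the one-liner above, offered as an illustration only. It names the two escaped code points, shows their UTF-8 bytes, and derives the fully ASCII transcription mentioned above by stripping combining marks after NFD normalization. That stripping is a generic technique assumed here, not something the thread prescribes.

```python
import unicodedata

# Same code points as the Perl escapes \x{0148} and \x{00fd}.
name = "Pavel Ka\u0148kovsk\u00fd"

for ch in ("\u0148", "\u00fd"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

# UTF-8 bytes of the whole name, as they would be stored in AUTHORS.
utf8_bytes = name.encode("utf-8")
print(utf8_bytes)

# ASCII transcription: decompose, then drop the combining caron and acute.
decomposed = unicodedata.normalize("NFD", name)
ascii_form = "".join(c for c in decomposed if not unicodedata.combining(c))
print(ascii_form)
```

Both accented letters take two bytes in UTF-8, so the correct spelling is two bytes longer than the ASCII one.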