Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Re: Towards line disciplines, patch two #1720

Closed p5pRT closed 20 years ago

p5pRT commented 24 years ago

Migrated from rt.perl.org#2970 (status was 'resolved')

Searchable as RT2970$

p5pRT commented 24 years ago

From @simoncozens

This patch makes binmode FH, ":utf8" work with the readline operator. It sits on top of the previous one.

Inline Patch

```diff
--- /dev/null Thu Jan 1 09:00:00 1970
+++ perl-vexp/t/io/discipline.t Sun Apr 2 23:58:00 2000
@@ -0,0 +1,46 @@
+#!./perl
+
+BEGIN {
+    chdir 't' if -d 't';
+    unshift @INC, '../lib';
+}
+
+# $RCSfile$
+$| = 1;
+use warnings;
+
+print "1..4\n";
+
+my $test = 1;
+
+sub ok { print "ok $test\n"; $test++ }
+
+# 1..4
+{
+    my $unistr = 100.200.300.400.500.600.10;
+    open (OUT, ">testyfish") or die $!;
+    print OUT $unistr;
+    my $out = length $unistr;
+    close OUT;
+    print "not " unless $out == 7;
+    ok;
+
+    open (IN, "testyfish") or die $!;
+    my $bytestr = <IN>;
+    close IN;
+    my $in = length $bytestr;
+    print "not " if $out == $in;
+    ok;
+
+    open (IN, "testyfish") or die $!;
+    binmode(IN, ":utf8") or print "not ";
+    ok;
+    $bytestr = <IN>;
+    close IN;
+    $in = length $bytestr;
+    print "not " unless $out == $in;
+    ok;
+
+}
+
+unlink("testyfish");
--- perl-exp/pod/perlfunc.pod Sun Apr 2 16:52:13 2000
+++ perl-vexp/pod/perlfunc.pod Mon Apr 3 00:00:32 2000
@@ -448,7 +448,9 @@
 text files. If FILEHANDLE is an expression, the value is taken as the
 name of the filehandle. DISCIPLINE can be either of C<":raw"> for
 binary mode or C<":crlf"> for "text" mode. If the DISCIPLINE is
-omitted, it defaults to C<":raw">.
+omitted, it defaults to C<":raw">. Additionally, C<":utf8"> stipulates
+that data coming in from that filehandle via the readline operator is
+UTF-8 encoded.
 
 binmode() should be called after open() but before any I/O is done on
 the filehandle.
--- perl-exp/op.h Sun Apr 2 16:52:04 2000
+++ perl-vexp/op.h Sun Apr 2 23:50:55 2000
@@ -198,8 +198,10 @@
 #define OPpDONE_SVREF 64 /* Been through newSVREF once */
 
 /* Private for OP_OPEN and OP_BACKTICK */
-#define OPpOPEN_IN_RAW 16 /* binmode(F,":raw") on input fh */
-#define OPpOPEN_IN_CRLF 32 /* binmode(F,":crlf") on input fh */
+#define OPpOPEN_IN_UTF8 4 /* binmode(F,":utf8") on input fh */
+#define OPpOPEN_IN_RAW 8 /* binmode(F,":raw") on input fh */
+#define OPpOPEN_IN_CRLF 16 /* binmode(F,":crlf") on input fh */
+#define OPpOPEN_OUT_UTF8 32 /* binmode(F,":utf8") on output fh */
 #define OPpOPEN_OUT_RAW 64 /* binmode(F,":raw") on output fh */
 #define OPpOPEN_OUT_CRLF 128 /* binmode(F,":crlf") on output fh */
 
--- perl-exp/sv.h Sun Apr 2 17:25:52 2000
+++ perl-vexp/sv.h Sun Apr 2 23:06:02 2000
@@ -376,6 +376,7 @@
 #define IOf_UNTAINT 16 /* consider this fp (and its data) "safe" */
 #define IOf_NOLINE 32 /* slurped a pseudo-line from empty file */
 #define IOf_FAKE_DIRP 64 /* xio_dirp is fake (source filters kludge) */
+#define IOf_UTF8 128 /* anything you read from this is Unicode */
 
 /* Line disciplines - more general than just O_* modes */
 #define DISC_TEXT 1 /* Default is binary */
--- perl-exp/doio.c Sun Apr 2 17:11:40 2000
+++ perl-vexp/doio.c Sun Apr 2 23:07:57 2000
@@ -1031,8 +1031,9 @@
 }
 
 int
-Perl_do_binmode(pTHX_ PerlIO *fp, int iotype, int mode)
+Perl_do_binmode(pTHX_ PerlIO *fp, int iotype, int disc)
 {
+    int mode = mode_from_discipline(disc);
 #ifdef DOSISH
 # if defined(atarist) || defined(__MINT__)
     if (!PerlIO_flush(fp)) {
--- perl-exp/op.c Sun Apr 2 17:15:25 2000
+++ perl-vexp/op.c Sun Apr 2 23:32:10 2000
@@ -5869,23 +5869,27 @@
     HV *table = GvHV(PL_hintgv);
     if (table) {
         SV **svp;
-        I32 mode;
+        I32 disc;
         svp = hv_fetch(table, "open_IN", 7, FALSE);
         if (svp && *svp) {
-            mode = mode_from_discipline(parse_discipline(*svp));
-            if (mode & O_BINARY)
-                o->op_private |= OPpOPEN_IN_RAW;
-            else if (mode & O_TEXT)
+            disc = parse_discipline(*svp);
+            if (disc & DISC_UTF8)
+                o->op_private |= OPpOPEN_IN_UTF8;
+            if (disc & DISC_TEXT)
                 o->op_private |= OPpOPEN_IN_CRLF;
+            else
+                o->op_private |= OPpOPEN_IN_RAW;
         }
 
         svp = hv_fetch(table, "open_OUT", 8, FALSE);
         if (svp && *svp) {
-            mode = mode_from_discipline(parse_discipline(*svp));
-            if (mode & O_BINARY)
-                o->op_private |= OPpOPEN_OUT_RAW;
-            else if (mode & O_TEXT)
+            disc = parse_discipline(*svp);
+            if (disc & DISC_UTF8)
+                o->op_private |= OPpOPEN_OUT_UTF8;
+            if (disc & DISC_TEXT)
                 o->op_private |= OPpOPEN_OUT_CRLF;
+            else
+                o->op_private |= OPpOPEN_OUT_RAW;
         }
     }
     if (o->op_type == OP_BACKTICK)
--- perl-exp/pp_hot.c Sun Apr 2 16:52:17 2000
+++ perl-vexp/pp_hot.c Sun Apr 2 23:53:28 2000
@@ -1407,6 +1407,8 @@
             TAINT;
             SvTAINTED_on(sv);
         }
+        if (IoFLAGS(io) & IOf_UTF8)
+            SvUTF8_on(sv);
         IoLINES(io)++;
         SvSETMAGIC(sv);
         XPUSHs(sv);
--- perl-exp/pp_sys.c Sun Apr 2 17:14:59 2000
+++ perl-vexp/pp_sys.c Sun Apr 2 23:52:25 2000
@@ -693,6 +693,7 @@
     PerlIO *fp;
     MAGIC *mg;
     SV *discp = Nullsv;
+    I32 disc;
 
     if (MAXARG < 1)
         RETPUSHUNDEF;
@@ -718,7 +719,10 @@
     if (!(io = GvIO(gv)) || !(fp = IoIFP(io)))
         RETPUSHUNDEF;
 
-    if (do_binmode(fp,IoTYPE(io),mode_from_discipline(parse_discipline(discp))))
+    disc = parse_discipline(discp);
+    if (disc & DISC_UTF8)
+        IoFLAGS(io) |= IOf_UTF8;
+    if (do_binmode(fp,IoTYPE(io),disc))
         RETPUSHYES;
     else
         RETPUSHUNDEF;
```
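
For readers skimming the patch, the flow it sets up is short: the discipline string is reduced to a bit mask, binmode sets a flag on the handle when the UTF-8 bit is present, and readline consults that flag when it returns a line. The following is a minimal standalone C sketch of that flow, not the perl source. `sketch_parse_discipline`, `struct sketch_io`, and the value of `SKETCH_DISC_UTF8` are assumptions made for illustration; the patch shows only `DISC_TEXT 1` and `IOf_UTF8 128`, and the real `parse_discipline` lives in the previous patch, which is not reproduced in this ticket.

```c
#include <string.h>

/* Discipline bits. DISC_TEXT = 1 appears in the patch; the UTF-8 value is assumed. */
#define SKETCH_DISC_TEXT 1
#define SKETCH_DISC_UTF8 2   /* assumption: the real value is not shown in the patch */

/* Handle flag, mirroring IOf_UTF8 (128) from the sv.h hunk. */
#define SKETCH_IOf_UTF8 128

/* Hypothetical stand-in for a perl IO handle: only the flags matter here. */
struct sketch_io {
    int flags;
};

/* Hypothetical parser: maps a space-separated discipline string to a bit mask.
 * Recognises ":crlf" and ":utf8"; ":raw" sets no bit because binary is the default. */
static int sketch_parse_discipline(const char *s)
{
    int disc = 0;
    while (*s) {
        size_t len = strcspn(s, " ");
        if (len == 5 && strncmp(s, ":crlf", 5) == 0)
            disc |= SKETCH_DISC_TEXT;
        else if (len == 5 && strncmp(s, ":utf8", 5) == 0)
            disc |= SKETCH_DISC_UTF8;
        s += len;
        while (*s == ' ')
            s++;
    }
    return disc;
}

/* Mirrors the pp_sys.c hunk: flag the handle when :utf8 was requested. */
static int sketch_binmode(struct sketch_io *io, const char *discipline)
{
    int disc = sketch_parse_discipline(discipline);
    if (disc & SKETCH_DISC_UTF8)
        io->flags |= SKETCH_IOf_UTF8;
    return disc;
}

/* Mirrors the pp_hot.c hunk: readline checks the flag to decide whether
 * the scalar it returns should be marked as UTF-8. */
static int sketch_line_is_utf8(const struct sketch_io *io)
{
    return (io->flags & SKETCH_IOf_UTF8) != 0;
}
```

Note that, as in the patch, the flag is only ever set: calling `sketch_binmode(&io, ":raw")` after `":utf8"` leaves the handle marked as UTF-8.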
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

Simon Cozens writes:

This patch makes binmode FH, ":utf8" work with the readline operator.

I do not think we want to do this. I do not use TCL for many years, but my impression is that TCL users *pray* for the fconfigure command. Using this one command, one can change *all* the characteristics of the stream with a consistent and easy-to-remember syntax.

I think we want:

  a) add

  $fh->configure(buffering => ???, crlf => ???, encoding => ???, block => ?, buffersize => ???);

  method (possibly with fine tuning between I/ and /O parts);

  b) Have a cheaper (in the sense of not loading IO.pm) way of access to ->configure();

While 'a' does not contradict overloading binmode(), it would be nicer to have a kind of a 'builtin' ->configure.

Ilya

p5pRT commented 24 years ago

From @TimToady

Ilya Zakharevich writes:
: Simon Cozens writes:
: >
: > This patch makes binmode FH, ":utf8" work with the readline operator.
:
: I do not think we want to do this. I do not use TCL for many years,
: but my impression is that TCL users *pray* for the fconfigure command.
: Using this one command, one can change *all* the characteristics of
: the stream with a consistent and easy-to-remember syntax.

It is impossible to know what all the characteristics of the stream are. The idea of a discipline is that you can swap it in and it will do magic that the designers of the system never imagined, much like source filters.

: I think we want:
:
: a) add
:
: $fh->configure(buffering => ???, crlf => ???, encoding => ???,
: block => ?, buffersize => ???);
:
: method (possibly with fine tuning between I/ and /O parts);
:
: b) Have a cheaper (in the sense of not loading IO.pm) way of access
: to ->configure();
:
: While 'a' does not contradict overloading binmode(), it would be nicer
: to have a kind of a 'builtin' ->configure.

I think this is somewhat orthogonal to disciplines. You're imagining one complex discipline that is highly configurable. I'm thinking of smaller, stackable, hot-swappable disciplines that will be simple and fast. But I suppose there's no reason not to have a complex discipline that is more configurable, if you want to invent one.

Larry
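
To make the idea of small, stackable, hot-swappable disciplines concrete, here is a rough C sketch in which each discipline is a single function that rewrites a buffer, and a stream is an ordered stack of them that can be pushed and popped at run time. Nothing here comes from the perl source or from the patch: `struct sketch_stream`, `sketch_push`, `sketch_pop`, `sketch_filter_input`, and the two example disciplines are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <ctype.h>

#define SKETCH_MAX_LAYERS 8

/* A discipline is a function that rewrites a buffer in place
 * and returns the new length. */
typedef size_t (*sketch_discipline)(char *buf, size_t len);

/* A stream is an ordered stack of disciplines. */
struct sketch_stream {
    sketch_discipline layers[SKETCH_MAX_LAYERS];
    size_t depth;
};

/* Push a discipline on top; returns 0 on success, -1 if the stack is full. */
static int sketch_push(struct sketch_stream *st, sketch_discipline d)
{
    if (st->depth == SKETCH_MAX_LAYERS)
        return -1;
    st->layers[st->depth++] = d;
    return 0;
}

/* Pop the top discipline; returns 0 on success, -1 if the stack is empty. */
static int sketch_pop(struct sketch_stream *st)
{
    if (st->depth == 0)
        return -1;
    st->depth--;
    return 0;
}

/* Pass freshly read data up through the stack, bottom layer first. */
static size_t sketch_filter_input(const struct sketch_stream *st,
                                  char *buf, size_t len)
{
    for (size_t i = 0; i < st->depth; i++)
        len = st->layers[i](buf, len);
    return len;
}

/* Example discipline: collapse "\r\n" to "\n". */
static size_t sketch_crlf(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;   /* drop the CR; the LF is copied on the next iteration */
        buf[out++] = buf[i];
    }
    return out;
}

/* Example discipline: upper-case ASCII letters. */
static size_t sketch_upcase(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return len;
}
```

Pushing `sketch_crlf` and then `sketch_upcase` gives a stream that normalises line endings before upper-casing; popping the top layer at any point removes the upper-casing without touching the layer beneath it.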

p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

On Mon, Apr 03, 2000 at 07:50:01AM -0700, Larry Wall wrote:

It is impossible to know what all the characteristics of the stream are. The idea of a discipline is that you can swap it in and it will do magic that the designers of the system never imagined, much like source filters.

Why does it preclude having a ->configure method in a discipline?

: a) add
:
: $fh->configure(buffering => ???, crlf => ???, encoding => ???,
: block => ?, buffersize => ???);

I think this is somewhat orthogonal to disciplines. You're imagining one complex discipline that is highly configurable.

I'm not imagining anything. The configuration above is what we do *now* in zillions of absolutely different and mostly obscure ways. Well, we do not have 'encoding', but we have irs/ors "instead". ;-)

Eventually a filehandle would become a stack of disciplines. Each of the disciplines may have configuration methods. But this does not preclude having a ->configure method which acts on *the whole stack*, and not on separate disciplines.

The configuration above is what people do with "plain" systemish filehandles. Many (most?) filehandles are going to be either naked systemish handles, or stacks which terminate in such a systemish handle. When you ->configure() a stack, the configuration-hash may be filtered by each discipline of the stack. Most disciplines would be able to pass unrecognised options down the chain. Ergo: many (most?) filehandles will support the above configuration options.
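
The "filtered by each discipline, unrecognised options passed down the chain" scheme can be sketched roughly as follows. This is only one possible reading of the proposal, written in C for concreteness: `struct sketch_layer`, `struct sketch_option`, and `sketch_configure` are hypothetical names, and the option keys are the ones from the `->configure(...)` call quoted above.

```c
#include <string.h>
#include <stddef.h>

/* One key/value option; 'handled' records whether some layer claimed it. */
struct sketch_option {
    const char *key;
    const char *value;
    int handled;
};

/* A layer in the stack. 'accepts' is a NULL-terminated list of the option
 * names this layer understands; 'below' is the next layer down. */
struct sketch_layer {
    const char *name;
    const char *const *accepts;
    struct sketch_layer *below;
    size_t configured;   /* number of options this layer has consumed */
};

/* Returns 1 if the layer recognises the given option name, else 0. */
static int sketch_layer_accepts(const struct sketch_layer *l, const char *key)
{
    for (const char *const *k = l->accepts; *k != NULL; k++)
        if (strcmp(*k, key) == 0)
            return 1;
    return 0;
}

/* Configure the whole stack: each layer claims the options it recognises
 * and leaves the rest for the layers below. Returns the number of options
 * that no layer recognised. */
static size_t sketch_configure(struct sketch_layer *top,
                               struct sketch_option *opts, size_t nopts)
{
    size_t unclaimed = nopts;
    for (struct sketch_layer *l = top; l != NULL; l = l->below) {
        for (size_t i = 0; i < nopts; i++) {
            if (!opts[i].handled && sketch_layer_accepts(l, opts[i].key)) {
                opts[i].handled = 1;
                l->configured++;
                unclaimed--;
            }
        }
    }
    return unclaimed;
}
```

With an encoding layer stacked on a "systemish" bottom layer that accepts `buffering`, `block`, and `buffersize`, a single call on the top of the stack reaches both: `encoding` is claimed by the upper layer and `buffersize` falls through to the bottom one. What to do with options nobody claims (ignore, warn, or fail) is left open here, as it is in the thread.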

I'm thinking of smaller, stackable, hot-swappable disciplines that will be simple and fast.

You cannot go simpler and smaller than system filedescriptors.

Ilya