Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

%42s uses sv_pos_b2u/sv_pos_u2b #10168

Open p5pRT opened 14 years ago

p5pRT commented 14 years ago

Migrated from rt.perl.org#72776 (status was 'new')

Searchable as RT72776$

p5pRT commented 14 years ago

From @nwc10

sv.c 9837: sv_pos_u2b(argsv, &p, 0); /* sticks at end */

This is deep in the bowels of printf:

    case 's':
        if (vectorize)
            goto unknown;
        if (args) {
            eptr = va_arg(*args, char*);
            if (eptr)
                elen = strlen(eptr);
            else {
                eptr = (char *)nullstr;
                elen = sizeof nullstr - 1;
            }
        }
        else {
            eptr = SvPV_const(argsv, elen);
            if (DO_UTF8(argsv)) {
                STRLEN old_precis = precis;
                if (has_precis && precis < elen) {
                    STRLEN ulen = sv_len_utf8(argsv);
                    I32 p = precis > ulen ? ulen : precis;
                    sv_pos_u2b(argsv, &p, 0); /* sticks at end */
                    precis = p;
                }
                if (width) { /* fudge width (can't fudge elen) */
                    if (has_precis && precis < elen)
                        width += precis - old_precis;
                    else
                        width += elen - sv_len_utf8(argsv);
                }
                is_utf8 = TRUE;
            }
        }

I *think* all the surroundings translate as %s, but with a length constraint, e.g. %42s, where the constraint is less than the (octet) length of a UTF-8 string. In that case, work out where the 42nd character goes up to.
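As a rough illustration of "work out where the 42nd character goes up to", here is a minimal standalone sketch of a character-count to byte-offset conversion. The helper name `utf8_chars_to_bytes` is hypothetical and is not the Perl API; it assumes well-formed UTF-8 and only mirrors the "sticks at end" behaviour noted in the comment on the sv_pos_u2b call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of perl): return the byte offset at which
 * the first `nchars` characters of the UTF-8 buffer `s` (`len` bytes) end.
 * If the buffer holds fewer characters, the result sticks at `len`.
 * Assumes the input is well-formed UTF-8. */
size_t utf8_chars_to_bytes(const unsigned char *s, size_t len, size_t nchars)
{
    size_t pos = 0;

    assert(s != NULL);
    while (nchars > 0 && pos < len) {
        /* Step over the lead byte ... */
        pos++;
        /* ... then over any continuation bytes (bit pattern 10xxxxxx). */
        while (pos < len && (s[pos] & 0xC0) == 0x80)
            pos++;
        nchars--;
    }
    return pos;
}
```

For the 6-byte, 5-character string "h\xC3\xA9llo", asking for 2 characters gives a byte offset of 3, and asking for more characters than the string holds gives 6.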

So, %2147483648s and similar aren't going to work (well) on UTF-8 strings over 2GB, because the character count is squeezed through the I32 variable p.

Nicholas Clark

p5pRT commented 14 years ago

From @cpansprout

Exact copy of 72776.

p5pRT commented 14 years ago

@cpansprout - Status changed from 'new' to 'resolved'