The documentation in perlfunc for C\
If there is a consensus that this is the correct and intended behaviour, then it should be documented (I'll be happy to write something). Myself, I would greatly prefer hex() to return undef if the input isn't a valid hex number, but it may be too late to change.
(If the current behaviour is kept, is there any way to test whether the hex() call succeeded without problems? Other than by setting a __WARN__ handler. If there is such a trick, it would be worth documenting too.)
Ed Avis wrote:
(If the current behaviour is kept, is there any way to test whether the hex() call succeeded without problems?
You can always check /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/. Generally hex() is best used when by the nature of the way you acquired the input you already know it consists only of hex digits.
-zefram
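A minimal sketch of the check-first approach suggested above. The helper name parse_hex_strict is hypothetical (it is not a core function); the pattern is the one from the message, and returning undef for invalid input is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of core Perl): returns the numeric value of
# $str if it matches the hex-string pattern, or undef otherwise.
sub parse_hex_strict {
    my ($str) = @_;
    return undef unless defined $str;

    # Optional "x" or "0x" prefix (either case), then hex digits, each of
    # which may be preceded by a single underscore. \z rejects a trailing
    # newline as well as any other trailing junk.
    return undef unless $str =~ /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/;

    return hex $str;
}

for my $input ('deadbeef', '0xAf', 'x20_00', 'bar', ' 1a ') {
    my $value = parse_hex_strict($input);
    printf "%-12s => %s\n", "'$input'", defined $value ? $value : 'undef';
}
```

Note that the pattern also accepts the empty string, for which hex() returns 0; a caller that wants to reject empty input would need an extra length check.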
The RT System itself - Status changed from 'new' to 'open'
Zefram \<zefram \
(If the current behaviour is kept, is there any way to test whether the hex() call succeeded without problems?
You can always check /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/. Generally hex() is best used when by the nature of the way you acquired the input you already know it consists only of hex digits.
That's right, and of course I am doing that.
Ah - and I see from your regexp that hex() does understand the 0x prefix, which from a first reading of the documentation I assumed it wouldn't, since only 'oct' coped with prefixes like 0, 0x, 0b... it is a bit ambiguous.
Generally even if I have extracted data with a regexp I still want to check that parsing has succeeded, although as a 'cannot happen' condition it would just get a simple die; if parsing failed. This is just good practice in my view, and stops a bug in one regexp cascading through the program. So C<hex($x) // die> would be most convenient.
-- Ed Avis <eda@waniasset.com>
On Fri, Oct 23, 2015 at 6:01 PM, Ed Avis <perlbug-followup@perl.org> wrote:
(If the current behaviour is kept, is there any way to test whether the hex() call succeeded without problems? Other than by setting a __WARN__ handler. If there is such a trick, it would be worth documenting too.)
I'm not sure about documenting it, but …
eval { use warnings "FATAL"; hex $foo }
Though personally, I guess I'd trim (or at least chomp()) $foo first:
$ perl -E 'say eval { use warnings "FATAL"; hex(s/^\s+|\s+$//rg) } // "undef: $@" for @ARGV' deadbeef "" bar " 1a "
3735928559
0
undef: Illegal hexadecimal digit 'r' ignored at -e line 1.

26
$
Eirik
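The warning-trapping idea above can be sketched as a small wrapper. The name hex_or_undef is hypothetical, and restricting the fatal warnings to the 'digit' category (rather than all warnings, as in the one-liner) is a choice made for this sketch; it means overflow warnings are not turned into failures here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns hex($str), or undef if hex() met an illegal
# digit. On failure the warning text is left in $@.
sub hex_or_undef {
    my ($str) = @_;
    my $value = eval {
        # Promote only the "Illegal hexadecimal digit" warning category
        # to an exception, and only within this block.
        use warnings FATAL => 'digit';
        hex $str;
    };
    return $value;
}

for my $input ('1a', 'bar') {
    my $value = hex_or_undef($input);
    print defined $value ? "$input => $value\n" : "$input => undef: $@";
}
```

Because the fatal setting is lexically scoped to the eval block, warnings elsewhere in the program keep their normal behaviour.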
On Fri, Oct 23, 2015 at 6:24 PM, Ed Avis <eda@waniasset.com> wrote:
So C<hex($x) // die> would be most convenient.
do { use warnings "FATAL"; hex($x) } # ;-)
Eirik
Proposed patch codifying the current behaviour:
index f0a2abb..1811d8b 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -3031,16 +3031,31 @@ X\
Interprets EXPR as a hex string and returns the corresponding value.
-(To convert strings that might start with either C<0>, C<0x>, or C<0b>, see
-L\.) If EXPR is omitted, uses C<$_>.
+An initial C<0x> prefix is stripped. (See L\ for a way to automatically
+convert strings starting with any of C<0>, C<0x>, or C<0b>.)
+If EXPR is omitted, uses C<$_>.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Hex strings may only represent integers. Strings that would cause
-integer overflow trigger a warning. Leading whitespace is not stripped,
-unlike oct(). To present something as hex, look into L\,
-L\, and L\.
+integer overflow trigger a warning.
+
+If the input is not a valid hex string then C\
=item import LIST
X\
...another question is to what extent 'x' instead of '0x' is supported? Should the docs for both hex and oct change to list both variants? Ditto 'b' vs '0b'.
On Mon, Oct 26, 2015 at 08:10:34AM -0700, Ed Avis via RT wrote:
Proposed patch codifying the current behaviour:
index f0a2abb..1811d8b 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -3031,16 +3031,31 @@ X\
X\
=for Pod::Functions convert a string to a hexadecimal number
Interprets EXPR as a hex string and returns the corresponding value.
-(To convert strings that might start with either C<0>, C<0x>, or C<0b>, see
-L\.) If EXPR is omitted, uses C<$_>.
+An initial C<0x> prefix is stripped. (See L\ for a way to automatically
+convert strings starting with any of C<0>, C<0x>, or C<0b>.)
+If EXPR is omitted, uses C<$_>.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Hex strings may only represent integers. Strings that would cause
-integer overflow trigger a warning. Leading whitespace is not stripped,
-unlike oct(). To present something as hex, look into L\,
-L\, and L\.
+integer overflow trigger a warning.
+
+If the input is not a valid hex string then C\ takes the longest
+prefix of the input that is a valid hex string and converts that. A warning
+is given for the first invalid character seen. So if you want to make
+sure that C\ succeeds, you can either trap the warning:
+
+    $v = do { use warnings 'FATAL'; hex $x };
+
+or alternatively check the input by hand first:
+
+    $x =~ /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/
+        or die "bad hex string $x";
+    $v = hex $x;
+
+Leading whitespace is not stripped, unlike oct(). To present
+something as hex, look into L\, L\, and L\.
I find the phrase "If you want to make sure that hex succeeds..." confusing. Both code snippets you present do exactly the opposite: they die.
I'd also use "ignored" where you say "stripped". Because in
my $hex = "0xF00"; say hex $hex; say $hex;
the second line still prints the leading '0x', and nothing gets stripped.
Abigail
Thanks for your comments; here's a revised patch. Note that perlfunc often uses 'stripped' to mean that certain data is ignored and not included in the output of a function, without implying that the function destructively modifies its input. Within the documentation for 'hex', I have standardized on 'ignored' as you suggest.
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -3031,16 +3031,40 @@ X\
Interprets EXPR as a hex string and returns the corresponding value.
-(To convert strings that might start with either C<0>, C<0x>, or C<0b>, see
-L\.) If EXPR is omitted, uses C<$_>.
+A hex string has an optional C\
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Hex strings may only represent integers. Strings that would cause
-integer overflow trigger a warning. Leading whitespace is not stripped,
-unlike oct(). To present something as hex, look into L\,
-L\, and L\.
+integer overflow trigger a warning.
+
+If the input is not a valid hex string then C\
=item import LIST
X\
On Fri Oct 23 09:01:10 2015, eda@waniasset.com wrote:
The documentation in perlfunc for C\
says nothing about the behaviour when the string passed isn't a valid hex number. What seems to happen is that it takes the longest initial substring of the input that is a valid hex number (counting the empty string as valid). If warnings are enabled, you see "Illegal hexadecimal digit 'z' ignored", but only for the first bad character.
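A short sketch of the longest-valid-prefix behaviour described above. The input string '12z4q' and the use of a local __WARN__ handler to collect the warnings are illustrative choices, not taken from the report.

```perl
use strict;
use warnings;

my @warnings;
my $value = do {
    # Collect warnings instead of printing them, only for this block.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    hex '12z4q';
};

# Parsing stops at the first bad character, so only "12" is converted
# and only 'z' is reported; the later 'q' is never examined.
print "value: $value\n";                          # value: 18
print "warnings seen: ", scalar(@warnings), "\n"; # warnings seen: 1
print $warnings[0];
```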
The behaviour is inconsistent - it croaks if the string contains a wide character:
tony@mars:.../git/perl$ ./perl -le 'print hex("x20_00\x{101}")'
Wide character in hex at -e line 1.
which is probably a bug.
Tony
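For contrast with the ASCII case, the wide-character failure reported above can be observed with eval. This is a sketch that assumes the behaviour shown in the report (an exception rather than a warning); the variable names are illustrative.

```perl
use strict;
use warnings;

# The same input as in the report: valid hex digits followed by U+0101.
my $value = eval { hex "x20_00\x{101}" };
my $error = $@;

if (defined $value) {
    print "value: $value\n";
}
else {
    # Reached when hex() throws instead of warning.
    print "died: $error";
}
```

An ASCII non-digit in the same position would only produce a warning and return the value of the valid prefix, which is the inconsistency being pointed out.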
The behaviour is inconsistent - it croaks if the string contains a wide character:
Yet this appears deliberate: there is a test in t/oct.t which makes sure that wide characters cause an exception. On the other hand there are no test cases for ASCII, but non-hex-digit characters in the string.
Should the behaviour be made consistent so that all bad characters, wide or not, throw an exception? Or should it be that all bad characters just raise a warning and the longest valid initial substring is taken as the hex input? Or do we just codify and test the current somewhat wonky semantics?
If the perl5-porters can indicate the preferred way forward then I will include some test cases as well as the documentation change.
* Ed Avis via RT <perlbug-followup@perl.org> [2015-10-26 16:45]:
Thanks for your comments; here's a revised patch.
This triples the length of the entry. Ignoring the wide character bug, would the docs have satisfied you if they had been as in this patch?
I like your more concise wording. But I think it is useful to have Zefram's regexp of valid hex numbers and an idiot-proof example of how to use it; or alternatively the example of how to check that hex() succeeded by trapping warnings. Perhaps it is overkill to mention both, but one or the other should be there. Just as much of the documentation on open() talks about how to make sure it succeeded, so hex() needs to suggest and encourage good error checking practices.
On Tue, Nov 3, 2015 at 7:36 AM Ed Avis via RT <perlbug-followup@perl.org> wrote:
I like your more concise wording. But I think it is useful to have Zefram's regexp of valid hex numbers and an idiot-proof example of how to use it; or alternatively the example of how to check that hex() succeeded by trapping warnings. Perhaps it is overkill to mention both, but one or the other should be there. Just as much of the documentation on open() talks about how to make sure it succeeded, so hex() needs to suggest and encourage good error checking practices.
--- via perlbug: queue: perl5 status: open https://rt-archive.perl.org/perl5/Ticket/Display.html?id=126437
Here is my take. It includes all of the variant headers and examples of each (using a loop to save space).
I also stripped the statement that only integers can be represented (the lack of . character means you can't represent real numbers and the integer overflow warning should be enough to let the user know the backend type used to represent the number).
* Ed Avis via RT <perlbug-followup@perl.org> [2015-11-03 13:40]:
I like your more concise wording. But I think it is useful to have Zefram's regexp of valid hex numbers and an idiot-proof example of how to use it; or alternatively the example of how to check that hex() succeeded by trapping warnings. Perhaps it is overkill to mention both, but one or the other should be there. Just as much of the documentation on open() talks about how to make sure it succeeded, so hex() needs to suggest and encourage good error checking practices.
Actually, you’re right.
I don’t think an explanation of the fact that you can fatalise warnings belongs here. Do we add that to every function that can throw warnings? If not, what makes hex() different? What level of familiarity with which parts of Perl can/should/do we expect from a reader, and why?
But even for the regexp I was reluctant, since it seemed redundant with the prose specification (and I think it’s important that there be a spec in precise prose). However, it does make sense not to make every reader grok the spec from scratch and figure out how to write that (slightly tricky) pattern. That information is also not redundant with other parts of the documentation nor is it just a basic language feature.
So I went to propose adding just this line to the code examples (to keep things brief):
$valid_input =~ /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/
But when I tried that on for size, I realised that the prose and pattern are actually complementary: each explains the other. So I now think that that *has* to be there.
Does that work for you?
Regards,
-- Aristotle Pagaltzis // <http://plasmasturm.org/>
It depends on whether you see 'perlfunc' as reference or tutorial. As a reference or spec for Perl's builtin functions, it certainly should not repeat the same information about fatal warnings in random places. If a tutorial, then the emphasis should be on examples and reminding the reader of useful related information, even if that causes some redundancy. 'perlopentut' has lots of repeated boilerplate showing how to use die and $! even though in principle it would be enough to mention just once that open() returns false on failure and sets $!.
If the regexp is included then that takes care of showing how to check that hex() will succeed, so an example of trapping warnings from it is less necessary. (It might be useful for that regexp itself to be provided in core, or for core to have a looks_like_hex() builtin... but that is for another day.)
* Chas. Owens <chas.owens@gmail.com> [2015-11-03 15:25]:
Here is my take. It includes all of the variant headers and examples of each (using a loop to save space).
Following the principle that they call “lots helps lots” in Germany? :-)
I also stripped the statement that only integers can be represented (the lack of . character means you can't represent real numbers and the integer overflow warning should be enough to let the user know the backend type used to represent the number).
I considered that but decided to keep an explicit mention because Perl does now have hexadecimal floating point literals, which hex() doesn’t support. Just 5 words extra anyway.
Other quibbles:
• You dropped the pointer that oct() is capable of parsing non-oct numbers based on prefix, which I think makes a lot of sense to include here. (I chose to move it to a crossrefs para at the bottom.)
• Minor: you have L\
Regards,
-- Aristotle Pagaltzis // <http://plasmasturm.org/>
* Ed Avis via RT <perlbug-followup@perl.org> [2015-11-03 19:15]:
'perlopentut' has lots of repeated boilerplate showing how to use die and $! even though in principle it would be enough to mention just once that open() returns false on failure and sets $!.
But perldoc -f open is much less of a tutorial and we’re not talking about perlhextut in this ticket. (Not to mention I would consider the function a disaster if we had to have such a document.)
If the regexp is included then that takes care of showing how to check that hex() will succeed, so an example of trapping warnings from it is less necessary.
I’ve just pushed that patch as fc61cbf574ad7402fb8b5462426e43d38d0c9a1d.
Patch was applied by Aristotle; closing. -- James E Keenan (jkeenan@cpan.org)
@jkeenan - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#126437 (status was 'resolved')
Searchable as RT126437$