Problem Description
Extend the pandas.to_numeric function to support the conversion of strings representing hexadecimal, octal, and binary numbers when they start with the corresponding prefixes (0x, 0o, 0b). Currently such strings raise a ValueError:
s = pd.Series(["1.0", "2", -3, "0x32"])
pd.to_numeric(s) # , errors="coerce")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File lib.pyx:2391, in pandas._libs.lib.maybe_convert_numeric()
ValueError: Unable to parse string "0x32"
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[39], line 2
      1 s = pd.Series(["1.0", "2", -3, "0x32"])
----> 2 pd.to_numeric(s) # , errors="coerce")
File oSDH5rfs-py3.11\Lib\site-packages\pandas\core\tools\numeric.py:232, in to_numeric(arg, errors, downcast, dtype_backend)
    230 coerce_numeric = errors not in ("ignore", "raise")
    231 try:
--> 232     values, new_mask = lib.maybe_convert_numeric(  # type: ignore[call-overload]
    233         values,
    234         set(),
    235         coerce_numeric=coerce_numeric,
    236         convert_to_masked_nullable=dtype_backend is not lib.no_default
    237         or isinstance(values_dtype, StringDtype)
    238         and not values_dtype.storage == "pyarrow_numpy",
    239     )
    240 except (ValueError, TypeError):
    241     if errors == "raise":
File lib.pyx:2433, in pandas._libs.lib.maybe_convert_numeric()
ValueError: Unable to parse string "0x32" at position 3
Feature Description
pandas.to_numeric converts many kinds of input to numeric values, but it does not currently parse strings that represent numbers in other bases (hexadecimal, octal, and binary) written with the standard prefixes. Supporting them would bring it in line with Python's own integer-literal syntax (PEP 3127), which the built-in int() already accepts when called with base 0.
Modify the pandas.to_numeric function to detect strings starting with 0x (hexadecimal), 0o (octal), and 0b (binary) and convert them to their corresponding integer values.
Hexadecimal Strings:
Prefix: 0x or 0X
Example: 0x1A should convert to 26.
Octal Strings:
Prefix: 0o or 0O
Example: 0o32 should convert to 26.
Binary Strings:
Prefix: 0b or 0B
Example: 0b11010 should convert to 26.
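For reference, here is a minimal sketch of how plain Python already treats the prefixes listed above. It only uses the built-in int() with base 0, which infers the base from the prefix; it says nothing about how pandas would implement this internally.

```python
# Built-in int() with base 0 infers the base from the 0x / 0o / 0b prefix.
# Both lowercase and uppercase prefixes are accepted.
examples = ["0x1A", "0o32", "0b11010", "0X1a", "0O32", "0B11010"]
values = [int(text, 0) for text in examples]
print(values)
```

Every string in the list represents the same number, so all six entries come out as 26.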
Alternative Solutions
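import pandas as pd


def extended_to_numeric(series):
    """Convert a Series to numeric values, handling 0x/0o/0b-prefixed strings."""

    def convert_value(value):
        if isinstance(value, str):
            # Prefixed strings are parsed with the matching base;
            # int() accepts the prefix when it agrees with the base.
            if value.startswith(('0x', '0X')):
                return int(value, 16)
            elif value.startswith(('0o', '0O')):
                return int(value, 8)
            elif value.startswith(('0b', '0B')):
                return int(value, 2)
        # Everything else falls back to pandas; unparseable values become NaN.
        return pd.to_numeric(value, errors='coerce')

    return series.apply(convert_value)


# Example usage
data = pd.Series(['0x1A', '0o32', '0b11010', '42', 'invalid'])
numeric_data = extended_to_numeric(data)
print(numeric_data)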
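One gap in the workaround above: a malformed prefixed string such as "0xZZ" makes int() raise, even though everything else is coerced to NaN, and signed values like "-0x1A" are not recognized as prefixed. A rough pure-Python sketch of a per-value parser that covers both cases might look like this. The name parse_prefixed is hypothetical and not part of pandas, and returning NaN on failure is an assumption modeled on errors="coerce".

```python
import math


def parse_prefixed(value):
    """Hypothetical helper: parse one value, returning NaN if it cannot be parsed."""
    if not isinstance(value, str):
        return value  # non-strings pass through unchanged
    text = value.strip()
    # Look past an optional sign to find the base prefix.
    unsigned = text.lstrip("+-").lower()
    if unsigned.startswith(("0x", "0o", "0b")):
        try:
            return int(text, 0)  # base 0 infers the base from the prefix
        except ValueError:
            return math.nan  # e.g. "0xZZ" or a doubled sign
    try:
        return float(text)
    except ValueError:
        return math.nan


print([parse_prefixed(v) for v in ["0x1A", "-0b11010", "0xZZ", "42", "invalid", -3]])
```

Under these assumptions the helper could be applied element-wise, for example with Series.map, though a per-element Python loop will be much slower than a native implementation inside to_numeric.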
Feature Type
[ ] Adding new functionality to pandas
[X] Changing existing functionality in pandas
[ ] Removing existing functionality in pandas
Additional Context
No response