amphi-ai / amphi-etl

Visual Data Transformation with Python Code Generation. Low-Code Python-based ETL.
https://amphi.ai

Parse & Extract : unable to use a custom regex #193

Open simonaubertbd opened 2 hours ago

simonaubertbd commented 2 hours ago

Hello,

I tried to use a custom regex by adding it: [image]

Result: [image]

Error

pattern contains no capture groups


ValueError                                Traceback (most recent call last)
Cell In[9], line 20
     16 inlineInput1 = pd.read_csv(StringIO(inlineInput1_data)).convert_dtypes()
     19 # Extract data using regex
---> 20 extract2_extracted = inlineInput1['LastName'].str.extract(r"^[a-zA-Z]+$")
     21 extract2_extracted.columns = []
     22 extract2 = inlineInput1.join(extract2_extracted, rsuffix="_extracted")

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\strings\accessor.py:137, in forbid_nonstring_types.<locals>._forbid_nonstring_types.<locals>.wrapper(self, *args, **kwargs)
    132     msg = (
    133         f"Cannot use .str.{func_name} with values of "
    134         f"inferred dtype '{self._inferred_dtype}'."
    135     )
    136     raise TypeError(msg)
--> 137 return func(self, *args, **kwargs)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\strings\accessor.py:2740, in StringMethods.extract(self, pat, flags, expand)
   2738 regex = re.compile(pat, flags=flags)
   2739 if regex.groups == 0:
-> 2740     raise ValueError("pattern contains no capture groups")
   2742 if not expand and regex.groups > 1 and isinstance(self._data, ABCIndex):
   2743     raise ValueError("only one regex group is supported with Index")

ValueError: pattern contains no capture groups

Not sure I'm using it right (maybe a documentation issue?)

Best regards,

Simon

tgourdel commented 2 hours ago

Hi Simon, yes indeed: in Python, the extract step expects a "capture group", i.e. parentheses around the part of the pattern whose value (or values) should be captured. In your case: ^([a-zA-Z]+)$

Definitely a documentation issue!
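
For anyone landing here later, a minimal sketch of the difference, assuming pandas is installed. The sample names and the `LastName_extracted` column name are made up for illustration and are not what Amphi generates:

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the LastName column.
names = pd.Series(["Smith", "O'Neil", "Dupont"], name="LastName")

no_group = r"^[a-zA-Z]+$"      # pattern from the issue: no parentheses
with_group = r"^([a-zA-Z]+)$"  # same pattern with one capture group

# pandas checks the number of capture groups before extracting.
group_counts = (re.compile(no_group).groups, re.compile(with_group).groups)

# Without a capture group, str.extract raises ValueError.
try:
    names.str.extract(no_group)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With a capture group, each group becomes one output column;
# rows that do not match the pattern get a missing value.
extracted = names.str.extract(with_group)
extracted.columns = ["LastName_extracted"]
print(extracted)
```

Rows that do not match the whole pattern (here the name containing an apostrophe) come back as missing values rather than raising an error.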

simonaubertbd commented 2 hours ago

@tgourdel And I'm definitely not a Python dev! ;)