amphi-ai / amphi-etl

Python-based Low-code ETL for data manipulation and transformation. Generates Python code you can deploy anywhere.
https://amphi.ai
Other
750 stars 33 forks

Filter failed after renaming field with UTF8 #63

Open simonaubertbd opened 1 month ago

simonaubertbd commented 1 month ago

Hello,

Strange issue here: I renamed a field using UTF-8 characters (emoji), but I'm then unable to use it in a filter.

Detailed error:

Error: Could not convert 'BACKTICK_QUOTED_STRING_🐈🐈🐈🐈🐈🐈' to a valid Python identifier.

Traceback (most recent call last):

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\IPython\core\interactiveshell.py:3577 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
Cell In[16], line 14
    filter1 = rename1.query("`🐈🐈🐈🐈🐈🐈` == ff")
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\frame.py:4823 in query
    res = self.eval(expr, **kwargs)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\frame.py:4949 in eval
    return _eval(expr, inplace=inplace, **kwargs)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\eval.py:336 in eval
    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:809 in __init__
    self.terms = self.parse()
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:828 in parse
    return self._visitor.visit(self.expr)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:402 in visit
    clean = self.preparser(node)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:166 in _preparse
    return tokenize.untokenize(f(x) for x in tokenize_string(source))
File ~\AppData\Local\Programs\Python\Python312\Lib\tokenize.py:296 in untokenize
    out = ut.untokenize(iterable)
File ~\AppData\Local\Programs\Python\Python312\Lib\tokenize.py:192 in untokenize
    for t in it:
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:166 in <genexpr>
    return tokenize.untokenize(f(x) for x in tokenize_string(source))
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\expr.py:124 in <lambda>
    return lambda *args, **kwargs: f(g(*args, **kwargs))
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\parsing.py:95 in clean_backtick_quoted_toks
    return tokenize.NAME, create_valid_python_identifier(tokval)
File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\computation\parsing.py:68 in create_valid_python_identifier
    raise SyntaxError(f"Could not convert '{name}' to a valid Python identifier.")
File <string>
SyntaxError: Could not convert 'BACKTICK_QUOTED_STRING_🐈🐈🐈🐈🐈🐈' to a valid Python identifier.
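A possible workaround, sketched under assumptions: the traceback shows the failure happening inside the expression parser used by `DataFrame.query`, so a filter written with plain boolean indexing should avoid that code path, because the column is looked up by its label rather than converted to an identifier. The sample data below is made up for illustration, and treating `ff` as a string literal is a guess about the intended comparison.

```python
import pandas as pd

# Made-up sample data; only the emoji column name comes from the report.
col = "🐈🐈🐈🐈🐈🐈"
rename1 = pd.DataFrame({col: ["ff", "gg", "ff"], "n": [1, 2, 3]})

# Boolean indexing accesses the column by label, so the name never has to be
# turned into a Python identifier the way query() requires.
filter1 = rename1[rename1[col] == "ff"]
print(filter1)
```

This is not necessarily how Amphi generates or should generate its filter code; it only illustrates that the column itself is usable once the query string is out of the picture.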

simonaubertbd commented 1 month ago

OK, I understand why: only some Unicode characters seem to be supported in Python identifiers: https://www.asmeurer.com/python-unicode-variable-names/#unicode-character-list

Maybe the rename tool could show a warning?
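A rough sketch of what such a warning check might look like, using only the standard library. The helper name `unsupported_chars` is hypothetical and not part of Amphi or pandas. It is a heuristic: it flags non-ASCII characters that Python does not accept inside an identifier, and it deliberately ignores ASCII characters such as spaces, which pandas appears to handle through backtick quoting. It may still over-warn for a few symbols that pandas special-cases.

```python
def unsupported_chars(column_name: str) -> list[str]:
    """Return the non-ASCII characters that Python rejects inside identifiers.

    Hypothetical helper for illustration only. Prefixing each character with
    "_" tests whether it is valid in a non-leading identifier position.
    """
    return [
        ch
        for ch in column_name
        if ord(ch) > 127 and not ("_" + ch).isidentifier()
    ]


# Example: warn about new column names that may break a later filter step.
for name in ["prix_total", "café", "my col", "🐈🐈🐈🐈🐈🐈"]:
    bad = unsupported_chars(name)
    if bad:
        print(f"Warning: {name!r} contains characters that may break filters: {sorted(set(bad))}")
```

Accented letters such as `é` pass the check because Python accepts them in identifiers, while emoji do not.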

tgourdel commented 1 month ago

I'm surprised by this error too. I'll take a look.