Open lmocsi opened 5 months ago
How do you propose nested capture groups are handled? Do you return a full list of all subgroups?
I'd suggest just using str.replace for this; it seems more appropriate.
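For reference, one existing convention for the nested-group question is the one used by Python's standard re module: with several groups, re.findall returns one tuple of subgroups per match. The sketch below only illustrates that convention; the text and pattern are made up for the example, and nothing here is polars behaviour.

```python
import re

# Hypothetical input and pattern, chosen only to show nested groups.
text = "id=ab12 id=cd34"
pattern = r"id=(([a-z]+)(\d+))"  # group 1 wraps groups 2 and 3

# Whole matches, i.e. what an "extract all matches" call returns.
full_matches = [m.group(0) for m in re.finditer(pattern, text)]

# With more than one group, findall yields a tuple of all groups per match,
# outer group first, then the nested ones in order of their opening parenthesis.
per_match_groups = re.findall(pattern, text)

# Restricting to a single group index avoids the nesting question entirely.
first_group_only = [m.group(1) for m in re.finditer(pattern, text)]

print(full_matches)
print(per_match_groups)
print(first_group_only)
```

Returning every subgroup gives a nested result, while taking a single group index keeps the result a flat list of strings.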
@avimallu Can you give a modified code snippet of the above code for your suggestion?
Perhaps something like:
df.with_columns(
    pl.col('a').str.extract_all(pattern).list.eval(pl.element().str.replace(pattern, '$1'))
)
# shape: (1, 1)
# ┌─────────────────────┐
# │ a                   │
# │ ---                 │
# │ list[str]           │
# ╞═════════════════════╡
# │ [" name,", " car,"] │
# └─────────────────────┘
This actually does that, but I would not consider it simple:
df.with_columns(
    pl.col('a').str.extract_all(pattern).list.eval(pl.element().str.replace(pattern, '$1').get(0))
)
#┌────────────┐
#│ a          │
#│ ---        │
#│ list[str]  │
#╞════════════╡
#│ [" name,"] │
#└────────────┘
Yeah, it's not ideal - perhaps there is a simpler workaround.
Update: It seems this behaviour has also been flagged as a bug:
extract_all_groups is another alternative feature request:
Hi, are there any updates? Thanks!
Description
As of polars==0.20.30
It would be nice if I could tell str.extract_all to behave like str.extract and return only capture groups. (below: data1 and data2 should return the same)