confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Add REGEXP_EXTRACT built-in function #878

Closed hjafarpour closed 4 years ago

hjafarpour commented 6 years ago

This built-in function would extract the part of a string that matches a given regex pattern.

big-andy-coates commented 6 years ago

Hi @hjafarpour - given that these are internal functions, not user-defined, it might be clearer to give them another name... KDFs! (Well, that's one option anyway).

apurvam commented 6 years ago

How is this different from #881? cc @rmoff

rmoff commented 6 years ago

I'd +1 not calling it a UDF

From my point of view both INSTR and REGEXP_EXTRACT are useful. The former is simpler and more accessible (and more familiar to more SQL users). The latter is more powerful but requires knowledge of regex, and for simple stuff is thus overkill.
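To make the trade-off concrete, here is a rough Python sketch of the two styles. It is an illustration only, not KSQL code: `instr_like` and `regexp_extract_like` are hypothetical helper names, and the semantics (1-based position with 0 for "not found", first match or nothing) are assumptions borrowed from common SQL dialects rather than a decided design for this issue.

```python
import re
from typing import Optional


def instr_like(text: str, substring: str) -> int:
    """Hypothetical INSTR-style helper: 1-based position, 0 if absent (assumed convention)."""
    return text.find(substring) + 1


def regexp_extract_like(text: str, pattern: str) -> Optional[str]:
    """Hypothetical REGEXP_EXTRACT-style helper: first regex match, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


line = "user=alice status=200"

# Simple case: a fixed substring lookup needs no regex knowledge.
position = instr_like(line, "status=")

# More powerful case: pull out any three-digit run, wherever it sits.
status_code = regexp_extract_like(line, r"\d{3}")
```

The first call only answers "where is this literal text?", while the second can describe a shape of text, which is the extra power (and extra complexity) mentioned above.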

hjafarpour commented 6 years ago

@big-andy-coates I changed it to built-in functions :)

bluemonk3y commented 6 years ago

Built-in would be the presumed standard, e.g. Cassandra etc.

On 13 Mar 2018 18:49, "Hojjat Jafarpour" notifications@github.com wrote:

We can call them built-in functions :)


blueedgenick commented 6 years ago

+1 for a regexp_extract function

stevenpyzhang commented 4 years ago

What exactly is the behavior we want for this function? I've been looking through other examples of REGEXP_EXTRACT and their behaviors all vary slightly.

https://docs.data.world/documentation/sql/reference/functions/regexp_extract.html

https://www.ibm.com/support/knowledgecenter/SS5FPD_1.0.0/com.ibm.ips.doc/postgresql/sqltk/r_sqlext_regexp_extract.html

cc @confluentinc/ksql

blueedgenick commented 4 years ago

As usual, the Presto team has a well thought out approach to this (family of) function: https://prestodb.io/docs/current/functions/regexp.html. I'd look there for inspiration.
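For reference, the Presto behavior could be sketched roughly like this in Python. This is an approximation under stated assumptions, not Presto's or ksqlDB's implementation: the function below is a hypothetical stand-in, Python's `re` syntax differs in places from the Java regex syntax Presto uses, and mapping "no match" to `None` mirrors Presto returning NULL.

```python
import re
from typing import Optional


def regexp_extract(text: str, pattern: str, group: int = 0) -> Optional[str]:
    """Hypothetical sketch of Presto-style semantics.

    Returns the first substring of `text` matched by `pattern`. If `group`
    is given, returns that capturing group of the first match instead.
    Returns None when nothing matches.
    """
    match = re.search(pattern, text)
    if match is None:
        return None
    # group 0 is the whole match; an out-of-range group raises IndexError here.
    return match.group(group)


# Inputs taken from the examples in the Presto regexp documentation.
whole_match = regexp_extract("1a 2b 14m", r"\d+")
second_group = regexp_extract("1a 2b 14m", r"(\d+)([a-z]+)", 2)
no_match = regexp_extract("abc", r"\d+")
```

The two-argument form returns the whole first match and the optional third argument selects a capturing group, which covers most of the variance between the other implementations linked above.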


vcrfxia commented 4 years ago

This was implemented in https://github.com/confluentinc/ksql/pull/4728 and will be released with ksqlDB 0.8.0.