duckdb / duckdb-web

DuckDB website and documentation
https://duckdb.org
MIT License

Docs for `regexp_replace` do not mention that it applies to first occurrence only #2854

Open david-cortes opened 5 months ago

david-cortes commented 5 months ago

The docs for the `regexp_replace` function state:

> If string contains the regexp pattern, replaces the matching part with replacement (see Pattern Matching).

By default, however, the function replaces only the first occurrence of the pattern. This differs from what other regex engines do, which is to replace all occurrences.

See, for example, R's `gsub` in Perl mode:

gsub("a", "", "baab", perl=TRUE)
"bb"

DuckDB, in contrast, would output "bab" for the equivalent call, which is not what one would expect from what the docs describe.

szarnyasg commented 5 months ago

Hello, thanks for reporting this.

The reason is that, by default, DuckDB/RE2 assumes the `g` (global replace) option to be false.

The option is listed here: https://duckdb.org/docs/sql/functions/regular_expressions#options-for-regular-expression-functions. However, the documentation should be clearer about the default value of this option, and the `regexp_replace` function itself should be better documented.
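
For illustration, here is a minimal sketch of the two behaviors, using the `baab` example from the report. It assumes the options string is passed as the optional fourth argument to `regexp_replace`, as described on the linked options page. The column aliases are only illustrative.

```sql
-- Default behavior: the g option is off, so only the first match is replaced.
SELECT regexp_replace('baab', 'a', '') AS first_only;        -- bab

-- With the 'g' (global replace) option: every match is replaced.
SELECT regexp_replace('baab', 'a', '', 'g') AS all_matches;  -- bb
```

The second query gives the same result as the R `gsub` call quoted above.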