malloydata / malloy

Malloy is an experimental language for describing data relationships and transformations.
http://www.malloydata.dev
MIT License

Case insensitivity. #727

Open lloydtabb opened 2 years ago

lloydtabb commented 2 years ago

We should support case insensitivity.

There is no standard way to request case insensitivity in REGEXP_MATCHES, which is a problem.

We can make ILIKE work for now.

We need to figure out how to control this in the language. Maybe a property on a source.

lloydtabb commented 2 years ago

In DuckDB

https://github.com/duckdb/duckdb/blob/master/test/sql/function/string/regex_search.test#L106

lloydtabb commented 2 years ago

In BigQuery, prepend regexp with (?i)

sql: one is ||
  SELECT 'apple' as a, 'banna' as b
  UNION ALL SELECT 'ARTICHOKE', 'BRUSSELSPROUT'
;; -- on "duckdb"

query: from_sql(one) -> {
  project:
    a
    matches is a ? ~ r'A'
    matches_insensitive is a ? ~ r'(?i)A'
    matches_upper is UPPER(a) ~ r'A'
}

mtoy-googly-moogly commented 2 years ago

Hmm ...

1) At some point we will put regex in the grammar, with / delimiters and trailing modifiers, and generate the correct output for each dialect.
2) We could do ~~ for case-insensitive LIKE (and also for regex if we do 1).
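
For illustration only, here is a rough Python sketch of what "generate the correct output for each dialect" might look like for a regex literal with a trailing i modifier. Nothing here is taken from the Malloy compiler: the function names, the /pattern/flags literal form, and the choice of REGEXP_MATCHES for DuckDB and REGEXP_CONTAINS for BigQuery as the target functions are all assumptions made for the example. It relies on DuckDB accepting an 'i' option as the third argument of regexp_matches and on BigQuery accepting an inline (?i) prefix.

```python
# Hypothetical sketch; none of these names come from the Malloy codebase.

def split_regex_literal(literal: str) -> tuple[str, str]:
    """Split a '/pattern/flags' literal into (pattern, flags)."""
    end = literal.rfind("/")
    if not literal.startswith("/") or end == 0:
        raise ValueError(f"not a /pattern/flags literal: {literal!r}")
    return literal[1:end], literal[end + 1:]


def regex_match_sql(column: str, literal: str, dialect: str) -> str:
    """Render a regex match against `column` for one SQL dialect."""
    pattern, flags = split_regex_literal(literal)
    unknown = set(flags) - {"i"}
    if unknown:
        raise ValueError(f"unsupported modifiers: {sorted(unknown)}")
    if "'" in pattern:
        # Proper string escaping differs per dialect and is out of scope here.
        raise ValueError("patterns containing quotes are not handled in this sketch")
    insensitive = "i" in flags

    if dialect == "duckdb":
        # Assumption: case insensitivity is passed as the 'i' option argument.
        if insensitive:
            return f"REGEXP_MATCHES({column}, '{pattern}', 'i')"
        return f"REGEXP_MATCHES({column}, '{pattern}')"
    if dialect == "bigquery":
        # Assumption: case insensitivity is an inline (?i) prefix on the pattern.
        prefix = "(?i)" if insensitive else ""
        return f"REGEXP_CONTAINS({column}, r'{prefix}{pattern}')"
    raise ValueError(f"unknown dialect: {dialect!r}")


if __name__ == "__main__":
    for dialect in ("duckdb", "bigquery"):
        print(dialect, "->", regex_match_sql("a", "/A/i", dialect))
```

The point of the sketch is that the modifier lives in the language, and each dialect decides how to spell it, so users would not need to know about (?i) or option arguments.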