cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

sql: optimize trigram generator #86610

Open mgartner opened 1 year ago

mgartner commented 1 year ago

The latency difference between our show_trgm builtin and Postgres's suggests that there may be opportunities to optimize our trigram generation code. On my machine, the query select show_trgm('hello world') from generate_series(1, 10000); takes ~73ms in CRDB and ~14ms in PG.

Jira issue: CRDB-18836

jordanlewis commented 1 year ago

Hm I wonder what this is about. Here's what the code does:

  1. find word boundaries
  2. if running with padding (which show_trgm does), copy each word into a newly allocated buffer (maybe this is too wasteful?)
  3. make slices for all the trigrams
  4. sort the trigrams
  5. distinct the trigrams

github-actions[bot] commented 5 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!