tcdi / plrust

A Rust procedural language handler for PostgreSQL
PostgreSQL License

should not use `str::len` to get the "character length" of a string in README #285

Closed BugenZhao closed 1 year ago

BugenZhao commented 1 year ago

Hi, PL/Rust team.

I think the example shown in the README might be misleading. According to the documentation, str::len returns the length of the underlying byte buffer, so it gives the wrong answer when used to count the characters of a string that contains non-ASCII characters.

From https://doc.rust-lang.org/stable/std/primitive.str.html#method.len:

"Returns the length of self. This length is in bytes, not chars or graphemes. In other words, it might not be what a human considers the length of the string."

Instead, one may use s.chars().count() here.
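
A minimal sketch of the difference, using only the standard library. The helper names byte_len and scalar_count are hypothetical and chosen for illustration; they do not come from the PL/Rust README.

```rust
/// Length of the UTF-8 encoding in bytes; this is what `str::len` reports.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode scalar values (`char`s) in the string.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // Pure ASCII: one byte per char, so both measures agree.
    assert_eq!(byte_len("hello"), 5);
    assert_eq!(scalar_count("hello"), 5);

    // "héllo" with the precomposed U+00E9: 'é' takes two bytes in UTF-8.
    assert_eq!(byte_len("h\u{e9}llo"), 6);
    assert_eq!(scalar_count("h\u{e9}llo"), 5);

    // Two CJK characters, three bytes each in UTF-8.
    assert_eq!(byte_len("你好"), 6);
    assert_eq!(scalar_count("你好"), 2);
}
```

For ASCII-only input the two measures agree, which is likely why the discrepancy is easy to miss in a small demo.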

thomcc commented 1 year ago

I mean, it's just a small demo and not really intended to be an example of how best to count the length of a string. Note also that s.chars().count() is still wrong for the number of characters by most metrics: Unicode scalar values != characters for really any reasonable definition of "character" (see https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/).

But I suppose we should consider changing the example to something less likely to spark pedantry.
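
To illustrate the point about scalar values, here is a small standard-library-only sketch. The helper name scalar_count is hypothetical. Counting user-perceived characters (grapheme clusters) is not provided by the standard library and would need an external crate such as unicode-segmentation, which this sketch deliberately avoids.

```rust
/// Number of Unicode scalar values (`char`s) in the string.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // The same visible "é", encoded two different ways.
    let precomposed = "\u{e9}"; // single scalar U+00E9
    let decomposed = "e\u{301}"; // 'e' followed by combining acute accent U+0301
    assert_eq!(scalar_count(precomposed), 1);
    assert_eq!(scalar_count(decomposed), 2);

    // A flag emoji is a pair of regional indicator symbols:
    // one visible glyph, two scalar values, eight UTF-8 bytes.
    let flag = "\u{1F1E8}\u{1F1F3}";
    assert_eq!(scalar_count(flag), 2);
    assert_eq!(flag.len(), 8);
}
```

Both spellings of "é" render identically, yet they produce different scalar counts, so chars().count() does not measure what a reader would call a character either.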

gurjeet commented 1 year ago

Postgres and Rust both have a reputation for correctness. So it'd be best if the example in the README were either fixed so that it does not elicit correctness debates, or replaced with a different example.

In my opinion, this example, if retained, should be corrected to produce a result identical to Postgres' length(text) or char_length(text) functions (at least when Postgres server uses UTF-8 encoding).

https://www.postgresql.org/docs/15/functions-string.html
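
A hedged sketch of what such a corrected example might compute, written as plain Rust rather than as a PL/Rust function. It assumes a UTF-8 server encoding, where Postgres' length(text) counts code points, and it assumes a function shape (optional text in, Result of optional integer out) that mirrors the style of the README examples. The name char_length_utf8 is hypothetical.

```rust
use std::error::Error;

/// Counts Unicode scalar values, which is assumed here to match
/// Postgres' length(text) under a UTF-8 server encoding.
/// A NULL input (None) yields a NULL result (None).
fn char_length_utf8(text: Option<&str>) -> Result<Option<i32>, Box<dyn Error>> {
    match text {
        None => Ok(None),
        Some(s) => {
            // Postgres' integer type is 32-bit; fail rather than truncate.
            let n = i32::try_from(s.chars().count())?;
            Ok(Some(n))
        }
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    assert_eq!(char_length_utf8(Some("hello"))?, Some(5));
    assert_eq!(char_length_utf8(Some("h\u{e9}llo"))?, Some(5));
    assert_eq!(char_length_utf8(None)?, None);
    Ok(())
}
```

Using i32::try_from instead of an `as` cast avoids silent wraparound for very long strings, at the cost of returning an error in that case.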

eeeebbbbrrrr commented 1 year ago

I've updated the examples and merged to main:

CREATE FUNCTION add_two_numbers(a NUMERIC, b NUMERIC) RETURNS NUMERIC STRICT LANGUAGE plrust AS $$ 
    Ok(Some(a + b))
$$;
CREATE FUNCTION

SELECT add_two_numbers(2, 2);
 add_two_numbers 
-----------------
               4
(1 row)

I hope everyone likes this.