fboulnois / pg_uuidv7

A tiny Postgres extension to create version 7 UUIDs
Mozilla Public License 2.0
298 stars 25 forks source link

Timestamp functions volatility #26

Closed gulitsky closed 6 months ago

gulitsky commented 7 months ago

uuid_v7_to_timestamptz and uuid_timestamptz_to_v7 should be declared as IMMUTABLE. This is because the functions' results depend solely on their input arguments, and they will always return the same result for the same input. This change also allows you to use these functions in generated columns.

created_at TIMESTAMP GENERATED ALWAYS AS (uuid_v7_to_timestamptz(id)) STORED
fboulnois commented 7 months ago

I don't think the functions can be set to IMMUTABLE, although it may be possible to switch to STABLE instead. The functions manipulate timestamptzs which are timestamp with time zones. Per the Postgres documentation:

A common error is to label a function IMMUTABLE when its results depend on a configuration parameter. For example, a function that manipulates timestamps might well have results that depend on the TimeZone setting. For safety, such functions should be labeled STABLE instead.

Additionally, this change modifies the function signature which I would consider a breaking change (you're breaking the API contract with the user, even if there is no change in the output). Therefore, I wouldn't backport it to previous versions. Instead I'd cut a new version 1.5.0 with the changes.

gulitsky commented 7 months ago

Yeah, I think you're right with the STABLE volatility. Anyway, if someone has the same requirement as I do, they can modify the volatility of the functions using the following code right now

ALTER FUNCTION uuid_v7_to_timestamptz STABLE;
ALTER FUNCTION uuid_timestamptz_to_v7 STABLE;
fboulnois commented 6 months ago

Fixed in 1.5.0.

mkindahl commented 4 months ago

I don't think the functions can be set to IMMUTABLE, although it may be possible to switch to STABLE instead. The functions manipulate timestamptzs which are timestamp with time zones. Per the Postgres documentation:

A common error is to label a function IMMUTABLE when its results depend on a configuration parameter. For example, a function that manipulates timestamps might well have results that depend on the TimeZone setting. For safety, such functions should be labeled STABLE instead.

Additionally, this change modifies the function signature which I would consider a breaking change (you're breaking the API contract with the user, even if there is no change in the output). Therefore, I wouldn't backport it to previous versions. Instead I'd cut a new version 1.5.0 with the changes.

I see that this issue is closed, but I think that these functions can be IMMUTABLE since they return the same result for the same input for eternity, which is the requirement for immutable functions.

I think the comment quoted above is more focused on functions that are dependent on the timezone as part of the implementation and would return different results dependent on the timezone (an example: a function that takes a timestamp without a timezone and uses the current timezone to compute a timestamp with a timezone). These functions do not.

jdknezek commented 4 months ago

I see that this issue is closed, but I think that these functions can be IMMUTABLE since they return the same result for the same input for eternity, which is the requirement for immutable functions.

I think the comment quoted above is more focused on functions that are dependent on the timezone as part of the implementation and would return different results dependent on the timezone (an example: a function that takes a timestamp without a timezone and uses the current timezone to compute a timestamp with a timezone). These functions do not.

This reasoning makes sense to me - especially given the fact that Postgres considers two timestamptzs that only differ by time zone to be equal, and the time zone is the only thing that could differ between two invocations of uuid_v7_to_timestamptz.

I am less concerned about uuid_timestamptz_to_v7 (which isn't really even STABLE unless zero => true), but if uuid_v7_to_timestamptz were IMMUTABLE it would have positive implications for time series data.

mkindahl commented 4 months ago

I see that this issue is closed, but I think that these functions can be IMMUTABLE since they return the same result for the same input for eternity, which is the requirement for immutable functions. I think the comment quoted above is more focused on functions that are dependent on the timezone as part of the implementation and would return different results dependent on the timezone (an example: a function that takes a timestamp without a timezone and uses the current timezone to compute a timestamp with a timezone). These functions do not.

This reasoning makes sense to me - especially given the fact that Postgres considers two timestamptzs that only differ by time zone to be equal, and the time zone is the only thing that could differ between two invocations of uuid_v7_to_timestamptz.

I am less concerned about uuid_timestamptz_to_v7 (which isn't really even STABLE unless zero => true), but if uuid_v7_to_timestamptz were IMMUTABLE it would have positive implications for time series data.

It is a good point that uuid_timestamptz_to_v7 does not return the same result when called twice if zero == true but given that it is a random value, and there can be no reliance on any specific random value being returned, I think this can be treated as IMMUTABLE.

I think it is mostly a matter of taste if you want different invocations of uuid_timestamptz_to_v7 to return different values or if you're fine an invocation of the function with the same timestamptz return the same value.

If you want to provide both alternatives, you can have two different functions for this, like in pull request #33.

fboulnois commented 3 months ago

Before further investing time in changing the current code / adding additional variations of the uuidv7 functions, I am looking to answer this question:

A common error is to label a function IMMUTABLE when its results depend on a configuration parameter. For example, a function that manipulates timestamps might well have results that depend on the TimeZone setting. For safety, such functions should be labeled STABLE instead.

Going back to this line in the Postgres docs, what happens if the timezone is changed while the database is running? In that case, do functions that have timezone-related functionality that are marked as IMMUTABLE break?

mkindahl commented 3 months ago

Before further investing time in changing the current code / adding additional variations of the uuidv7 functions, I am looking to answer this question:

A common error is to label a function IMMUTABLE when its results depend on a configuration parameter. For example, a function that manipulates timestamps might well have results that depend on the TimeZone setting. For safety, such functions should be labeled STABLE instead.

Going back to this line in the Postgres docs, what happens if the timezone is changed while the database is running? In that case, do functions that have timezone-related functionality that are marked as IMMUTABLE break?

I understand your concern to ensure that the change does not cause strange behaviors, but the question is a little unclear, perhaps because it is a little too generic.

In a sense, if a function has "timezone-related functionality" it cannot be marked IMMUTABLE, so it all boils down to what meaning you put in the phrase "timezone-related functionality".

If a function is marked as IMMUTABLE it will (and should) return the "same" result after a timestamp change (I use "same" within quotes because of the discussion above about the random values-part of the UUID) and if the computation depends on the current timezone, it cannot be marked IMMUTABLE because it will return a different result before and after changing the timezone. For this reason, I would argue that these two meanings are the same.

For the two functions under discussion here (uuid_v7_to_timestamptz and timestamptz_to_uuid_v7) they would return the "same" value before and after a timezone change regardless of when that happens. When translated to text, the texts will not compare equal but they represent the same point in time and compare equal when compared as timestamps with timezone. In that sense they do not have "timezone-related functionality".

For the hypothetical function discussed above that use the timezone as part of computing the result, it would not be correct to make it IMMUTABLE since it is STABLE and will return different results before and after changing the timezone. In that sense, it has "timezone-related functionality" since the timezone is part of the computation.