Clarification of #9 - get_system_time() ... use statement start time rather than transaction start time?

arkhipov / temporal_tables

Temporal Tables PostgreSQL Extension

BSD 2-Clause "Simplified" License

927 stars 46 forks source link

Clarification of #9 - get_system_time() ... use statement start time rather than transaction start time? #13

Closed rbygrave closed 8 years ago

rbygrave commented 8 years ago

This is related to #9 that you have already answered.

For me I wonder if we can (or even should) use the 'statement start time' rather than the 'transaction start time' for get_system_time().

The thinking behind this is that at READ_COMMITTED isolation level the update statement at T4:

    UPDATE employees SET salary = 6800 WHERE name = 'Lenina Crowne';

... does effectively execute as at 'statement start time' (it sees the committed insert from Transaction B). To me I think there is an arguement that the time range for 'Lenina Crowne', 7000 can then be [T2,T4).

If the isolation level was SERIALISABLE then we should use the transaction start time but also Transaction A will not see the insert from Transaction B in that case. To me I think at READ_COMMITTED it should be [T2,T4) by using the 'statement start time' and then we don't need to use the time delta adjustment.

Any thoughts about that? Is the isolation level taken into account (I have not seen that in the code)?

Cheers, Rob.

arkhipov commented 8 years ago

That sounds reasonable to me. It is pretty easy to add one more SystemTimeMode (there are two of them now: CurrentTransactionStartTimestamp and UserDefined).

However, you cannot assume that T4 is always greater than T2. The clocks can go back during the time synchronization or it can occur due to clock skew between CPUs. So we still need to use the time adjustment.

I am not sure if the default SystemTimeMode should be chosen according to the current transaction isolation level (at least, it would break backward compatibility). It will probably be OK to let a user select the mode by changing a run-time parameter. Any other suggestions?

rbygrave commented 8 years ago

So we still need to use the time adjustment.

Yes ok.

add one more SystemTimeMode

Yes, agreed.

Perhaps the SystemTimeMode could be set via an extra optional argument to versioning(), a bit like period_attname. So that might mean ... this 'mode/flag' then gets passed as an argument to versioning_insert() etc and get_system_time().

Hmmm.

arkhipov commented 8 years ago

That makes impossible for a user to change the behavior for their sessions. Since you mentioned that the mode should depend on the current session's transaction isolation level, I thought that you wanted to set different modes in different sessions.

It is easy to implement any option, but I really do not know which one is better. To add insult to injury, even if the isolation level is SERIALIZABLE (I suppose you meant true serializability) and you use the CurrentTransactionStartTime mode, the order of transactions will not necessary be the same as their start times. So in terms of consistency, the time that is stored for historical data still remains incorrect.

rbygrave commented 8 years ago

That makes impossible for a user to change the behavior for their sessions.

Yes. For myself I'm 99% of the time using read_committed and yes I'd just have it fixed to 'current statement start time'.

I thought that you wanted to set different modes in different sessions.

For me no I don't think so. Right now, I'd be happy to have the mode fixed to use 'statement start time' but I honestly have not tested scenarios other than at read_committed.

the order of transactions will not necessary be the same as their start times

Yes. I'm trying to think of the downsides to just use the 'statement start time' all the time (regardless of isolation level etc). That is where my thinking has got to anyway - I've got distracted by other work etc.

Cheers, Rob.

arkhipov commented 8 years ago

OK. I will add the mode at the weekend. I need some time to think it all over.

The main reason why the extension uses the transaction start time is because the SQL:2011 standard defines it this way.

rbygrave commented 8 years ago

Coming back to this, I now think using 'statement start time' is not that good due to the case where a transaction contains multiple statements the history 'now appears' at different timestamps based on each statement execution time - this is at best confusing but generally could be described as un-workable. So I'd like to scratch this 'statement start time' idea completely and certainly I won't use it.

As I see it I think you should not implement this 'statement start time' option and close this issue.

Cheers, Rob.

arkhipov commented 8 years ago

It is good to know that you came to the same conclusion. I have tried different approaches, but I am still unsure which of them should be implemented. Fundamentally, the idea is probably not wrong, it just needs some elaboration.

I am closing the issue for now, but feel free to reopen it if needed.