cmu-db / peloton

The Self-Driving Database Management System
http://pelotondb.io
Apache License 2.0
2.03k stars 624 forks source link

Read-Only TxnContext Interface; Read-Only (single-statement select, txn) optimizations #1402

Open lmwnshn opened 6 years ago

lmwnshn commented 6 years ago

1396

  1. Removed default read_only=false from TransactionManager::BeginTransaction
  2. Removed TransactionContext::SetReadOnly
  3. Added read-only param to TransactionContext constructors
  4. Transactions with no modifying queries treated as read-only 4.1 assumption: if transaction is marked read-only, then READ transactions are not added to ReadWriteSet

~~#1395 Single-statement selects are marked as read-only~~

Unable to exercise this code path, possibly due to #1441

lmwnshn commented 6 years ago

Currently, I am only failing copy_test (1/181).

If I add traffic_cop.SetStatement(statement) to copy_test here, I will pass. The question is, should I do that?

I'm looking into what else is using SetStatement right now.

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.04%) to 77.125% when pulling ec4c796f61c0cb2cee13764845c8f8d6fdfe186f on lmwnshn:txnctx/readonly into a045cfc95bf349742a8101aee65e22efd9ec8096 on cmu-db:master.

mbutrovich commented 6 years ago

The more I look at this, the more it seems like we're building a hack on top of an unstable foundation -- at least part 4 of @lmwnshn's description. We're trying to find a shortcut if the RWSet only contains Reads/is empty. However, we shouldn't even be putting Reads in the RWSet in the first place. @yingjunwu says as much. The only reason we even add READ_OWNs is for releasing the write lock at Commit/Abort.

Commit and Abort do nothing if an element is READ when they iterate through the RWSet, so they shouldn't really be there in the first place.

This PR looks fine on its own, but it seems like a workaround for what a larger design issue that should be addressed.

yingjunwu commented 6 years ago

I am not sure what you guys are currently working on, but I wanna mention that the original rwset was design to be general enough -- We didn't assume the underlying CC protocol, and the rwset was supposed to be compatible with any protocol, like OCC and 2PL. While our experiments show that timestamp ordering performs best, it seems that some people (e.g., Phil Bernstein) have a strong faith in 2PL :-)

mbutrovich commented 6 years ago

So that’s sort of emblematic of the issues I feel like I’m finding in the current system. Things like maitaining a doubly-linked version chain, reads getting added to the RWSet, etc. seem to be preventing us from making the best MVTO system possible.

I definitely see why it was necessary in the context of comparing CC systems, but if MVTO is what we’re sticking with I think we can improve it.

apavlo commented 6 years ago

@mbutrovich Do we want to merge this PR?

mbutrovich commented 6 years ago

@apavlo Yes. @lmwnshn might need to do a rebase first, but I still think it should be merged.

lmwnshn commented 6 years ago

@mbutrovich Undid the formatting changes

mbutrovich commented 6 years ago

I can't get your change for #1395 to work. I put a breakpoint here and then ran the following in psql:

default_database=# CREATE TABLE foo(a int, b int);
default_database=# SELECT * FROM foo;
 a | b 
---+---
(0 rows)

default_database=# insert into foo values(1,2),(3,4);
default_database=# SELECT * FROM foo;
 a | b 
---+---
 1 | 2
 3 | 4
(2 rows)

The breakpoint doesn't trigger with the second SELECT. Am I misunderstanding the query I should be using?