apavlo / h-store

H-Store Distributed Main Memory OLTP Database System
https://hstore.cs.brown.edu
GNU General Public License v3.0
568 stars 174 forks source link

tableName in AntiCacheDB #174

Closed mjgiardino closed 10 years ago

mjgiardino commented 10 years ago

As of now, NVMAntiCacheDB doesn't store the tableName of the block, but BerkeleyAntiCacheDB does. This causes testing failures if you attempt to compare getTableName() of two blocks. I think we should decide on one way or the other for consistency of implementation of the parent class AntiCacheDB. I'm leaning towards removing it because it doesn't seem to be used outside of testing.

Atreyee commented 10 years ago

The table name is used when merging unevicted tuples i.e when tuples in an anti-cache block on disk are merged back into memory as seen here. You are right about the fact that the behavior should be consistent. But the table name is not used just for testing

mjgiardino commented 10 years ago

That seems to be getting an actual pointer to a table from the executor context (m_executor). I'm talking about m_tableName and getTableName() function from AntiCacheDB:AntiCacheBlock seen here. Grepping the code seems to show no other instances of getTableName from an AntiCacheBlock.

Atreyee commented 10 years ago

The table name is used when merging unevicted tuples i.e when tuples in an anti-cache block on disk are merged back into memory as seen here https://github.com/apavlo/h-store/blob/master/src/ee/anticache/AntiCacheEvictionManager.cpp#L1241. You are right about the fact that the behavior should be consistent. But the table name is not used just for testing.

On Fri, Sep 5, 2014 at 2:13 PM, Michael Giardino notifications@github.com wrote:

As of now, NVMAntiCacheDB https://github.com/apavlo/h-store/blob/master/src/ee/anticache/NVMAntiCacheDB.cpp#L152 doesn't store the tableName of the block, but BerkeleyAntiCacheDB https://github.com/apavlo/h-store/blob/master/src/ee/anticache/BerkeleyAntiCacheDB.cpp#L126 does. This causes testing failures if you attempt to compare getTableName() of two blocks. I think we should decide on one way or the other for consistency of implementation of the parent class AntiCacheDB. I'm leaning towards removing it because it doesn't seem to be used outside of testing.

— Reply to this email directly or view it on GitHub https://github.com/apavlo/h-store/issues/174.

mjgiardino commented 10 years ago

I can add the information to the NVM blocks.

That being said, I'm still a little unclear why we need the table name. As far as I can tell from the code you're talking about the process is as follows:

1.) pass a PersistentTable to mergeUnevictedTuples 2.) for each block in the table (table->getUnevictedBlocks(i)), serialize the data. 3.) From each serialized block, it pulls the number of tuples and table name from the block for use in looking up the table later (where you cited). 4.) It then merges each tuple into the table.

However, since all these blocks are coming from table->getUnevictedBlocks, and table has it's own name stored (which is in fact used for all the debugging information 1, 2), I'm not clear why we have to look up a table each time.

Now if a single PersistentTable instance can have blocks for a different table, I can see why we would look up the table for each new block. However, I am not sure why a table would have a different table's block.

In any case, it's no trouble for me to add the tableName to NVMAntiCacheBlock, I was just trying to prune out what seemed to be duplicate functionality.

mjgiardino commented 10 years ago

Added tableName to NVMAntiCacheBlock in my fork.