Y-Wakuta / nosql_time-series_schema_designer

GNU General Public License v3.0

Estimate CF creation cost during the migration process based on the number of entries in the existing CF #50

Open Y-Wakuta opened 4 years ago

Y-Wakuta commented 4 years ago

Problem

The migration is executed in three steps: collecting data from the existing CFs, projection, and loading into the new CF. The projection step can produce duplicate records. Cassandra resolves this duplication when the records are written, because writes with the same primary key overwrite one another, so there is no operational problem. However, the cost estimation should take these duplicates into account.
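
As an illustration of why projection can yield duplicates, here is a minimal sketch in Python. The rows, the column layout, and the `project` helper are all hypothetical and are not taken from this repository; the dictionary only imitates Cassandra's overwrite-on-same-key behaviour.

```python
# Hypothetical rows read from an existing CF: (user_id, item_id, city).
existing_rows = [
    (1, 10, "Osaka"),
    (1, 11, "Osaka"),
    (2, 12, "Tokyo"),
    (2, 13, "Tokyo"),
    (2, 14, "Tokyo"),
]


def project(rows, column_indexes):
    """Keep only the given columns of each row (hypothetical helper)."""
    return [tuple(row[i] for i in column_indexes) for row in rows]


# Project onto (user_id, city) for a new CF keyed by user_id.
projected = project(existing_rows, (0, 2))

# Imitate keyed storage: a later write with the same key replaces the earlier one.
stored = {row[0]: row for row in projected}

records_written = len(projected)  # one write per existing entry
records_stored = len(stored)      # one record per distinct key
```

In this sketch the migration issues one write per existing entry, while far fewer records remain after the writes, which is the gap the cost estimation currently ignores.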

Solution

The CF creation cost in the migration process would be more accurate if it were based on size = (existing CF entries) * (new CF record width). Once the records are written to Cassandra, the duplicated records are removed, so any other process should use the CF size calculated by size = (CF entries) * (CF record width).
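
The two size formulas could be sketched as follows. This is only an illustration under assumed inputs: the function names, the entry counts, and the record width are hypothetical and do not correspond to the project's actual code or statistics.

```python
def migration_write_size(existing_cf_entries: int, new_cf_record_width: int) -> int:
    """Size written during migration: every projected record is written,
    duplicates included, so the entry count comes from the existing CF."""
    return existing_cf_entries * new_cf_record_width


def stored_cf_size(cf_entries: int, cf_record_width: int) -> int:
    """Size of a CF once it is stored: duplicates are already removed,
    so the entry count is that of the CF itself."""
    return cf_entries * cf_record_width


# Hypothetical numbers, chosen only for illustration.
existing_entries = 1_000_000   # entries in the existing (source) CF
new_cf_entries = 250_000       # distinct entries in the new CF after writing
new_cf_record_width = 64       # width of one record in the new CF

creation_size = migration_write_size(existing_entries, new_cf_record_width)
steady_state_size = stored_cf_size(new_cf_entries, new_cf_record_width)
```

With these assumed numbers the size used for the creation cost is four times the stored size, which shows how much the estimate can differ when duplicates are ignored.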