felix-reichel / price-search-engine-seals-analysis

Produces a price search engine firm quality seal changes data set of (potentially) skewed index-spaced data cubes within a big data cube.

Impl dynamic inflow sample data/offer memory db loader #12

Open felix-reichel opened 2 weeks ago

felix-reichel commented 2 weeks ago

New:

The largest weekly offers data set is ~760 MB.

(52 + 26) × 760 MB ≈ 60 GB ≤ 128 GB (S)

Other sizes: L (512 GB), XL (1 TB), 2XL (2 TB)
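The sizing estimate above can be checked with a few lines of Python. This is only a rough sketch: the helper names `estimate_gb` and `smallest_tier` are made up for illustration, 1 GB is taken as 1000 MB, 1 TB as 1024 GB, and every week is assumed to be as large as the largest one, so the result is an upper bound on raw data size rather than a measured in-memory footprint.

```python
# Worst-case weekly offers size in MB, taken from the comment above.
WEEK_MB = 760

# Size tiers in GB as listed above (assumption: 1 TB = 1024 GB).
TIERS_GB = {"S": 128, "L": 512, "XL": 1024, "2XL": 2048}


def estimate_gb(weeks, week_mb=WEEK_MB):
    """Upper-bound raw size in GB if every week were as large as the largest."""
    return weeks * week_mb / 1000  # assumption: 1 GB = 1000 MB


def smallest_tier(required_gb, tiers=TIERS_GB):
    """Return the name of the smallest tier that fits, or None if none does."""
    for name, capacity in sorted(tiers.items(), key=lambda kv: kv[1]):
        if required_gb <= capacity:
            return name
    return None


window_gb = estimate_gb(52 + 26)   # the (52 + 26)-week window from above
print(window_gb, smallest_tier(window_gb))
```

Under the same assumptions, `estimate_gb(5 * 52)` bounds a full 5-year window of offers at roughly 198 GB of raw data, which would still need to be compared against the actual in-memory size once loaded.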

Goal: batch-process 5 years at a time. Each batch needs a 1-year (52 pre-weeks) overlap, so 4 iterations cover 20 − 4 = 16 years, which matches 2023 − 2007 = 16. That makes 4 batches. XL should be feasible for offers and clicks only.
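One way to lay out those batches is sketched below. It assumes the overlap year sits before each batch's net window and that the 16 net years run from 2007 up to (not including) 2023. Neither detail is spelled out above, and `Batch` and `plan_batches` are hypothetical names, not part of the repository.

```python
from typing import NamedTuple


class Batch(NamedTuple):
    load_start: int  # first year loaded, including the overlap year(s)
    net_start: int   # first year whose results are kept
    end: int         # exclusive end year


def plan_batches(first_year=2007, last_year=2023, batch_years=5, overlap_years=1):
    """Split [first_year, last_year) into batches with a leading overlap."""
    net_years = batch_years - overlap_years
    if net_years <= 0:
        raise ValueError("overlap must be shorter than the batch length")
    batches = []
    start = first_year
    while start < last_year:
        end = min(start + net_years, last_year)
        # Assumption: the overlap precedes the net window of each batch.
        batches.append(Batch(start - overlap_years, start, end))
        start = end
    return batches


batches = plan_batches()
for b in batches:
    print(f"load {b.load_start}..{b.end - 1}, keep {b.net_start}..{b.end - 1}")
```

With the defaults this yields 4 batches that load 20 years in total and keep 16. Note that under this layout the first batch would want a pre-year before 2007; if that data does not exist, the first batch would simply start without overlap.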

felix-reichel commented 2 weeks ago

Related: important current issues:

https://github.com/duckdb/duckdb/issues/14087

https://github.com/duckdb/duckdb/pull/12318#issuecomment-2374523222

https://github.com/duckdb/duckdb/issues/12286#issue-2320913128
