felix-reichel / price-search-engine-seals-analysis

Produces a data set of quality-seal changes for firms listed on a price search engine, organized as (potentially) skewed, index-spaced data cubes within one large data cube.

Impl Business Layer Abstraction 2.0 #29

Open felix-reichel opened 1 month ago

felix-reichel commented 1 month ago

Variable Domain Processing

Workflow:

  1. Variable Domain?

    • (allDBDataForVariableDomainIsLoaded?)
      • processInWeekBatches(variableDomain)
      • processInMonthlyBatches(variableDomain)
  2. Steps to Execute:

    • (a) Determine Domain
    • (b) Determine Render Strategy
    • (c) Call Loaders
    • (d) Compute From Below
    • (e) Store To Grid (or on error: rollback)
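
Steps (a) to (e) can be sketched roughly as follows. This is only an illustration under assumptions: `Grid`, `run_variable`, and the callables passed in (`determine_domain`, `choose_strategy`, `loaders`, `compute`) are hypothetical names that do not come from the repository, and the grid is modeled as a plain dictionary keyed by (i, j, t).

```python
from dataclasses import dataclass, field


@dataclass
class Grid:
    """Hypothetical in-memory stand-in for the results grid, keyed by (i, j, t)."""
    cells: dict = field(default_factory=dict)

    def snapshot(self):
        # Shallow copy is enough here because cell values are plain numbers.
        return dict(self.cells)

    def restore(self, snap):
        self.cells = snap


def run_variable(name, grid, determine_domain, choose_strategy, loaders, compute):
    """Run steps (a) to (e) for one variable; roll the grid back on any error."""
    domain = determine_domain(name)        # (a) Determine Domain
    strategy = choose_strategy(domain)     # (b) Determine Render Strategy
    snap = grid.snapshot()
    try:
        # (c) Call Loaders: every loader contributes raw rows for the domain.
        rows = [row for load in loaders for row in load(domain, strategy)]
        # (d) Compute From Below: derive cell values from the raw rows.
        results = compute(rows)
        # (e) Store To Grid, one cell at a time.
        for key, value in results.items():
            if value is None:
                raise ValueError(f"no value computed for cell {key}")
            grid.cells[key] = value
    except Exception:
        # On error: rollback, so that no partially written cells remain.
        grid.restore(snap)
        raise
    return strategy
```

Taking the snapshot before step (c) means that a failure in any later step leaves the grid exactly as it was, including cells that were already written in step (e).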

CASE UNREALISTIC:

GIVEN

  1. The fully constructed (max. covering) (i, j, t)-results_space exists,
  2. AND all data is currently loaded into the DB.

For Every Possible args/(i, j, t) Combination Perform the Following:
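
A minimal sketch of what "every possible combination" means in practice. The function name `iter_cells` and the example ID lists are hypothetical; the default bounds for t are taken from the sanity check in the routine (-26 to 832).

```python
from itertools import product


def iter_cells(product_ids, retailer_ids, t_min=-26, t_max=832):
    """Return an iterator over every (i, j, t) combination in the results space."""
    return product(product_ids, retailer_ids, range(t_min, t_max + 1))


# Two products and one retailer already give 2 * 1 * 859 cells.
n_cells = sum(1 for _ in iter_cells([1, 2], ["shop_a"]))
```

The cell count grows multiplicatively with the number of products and retailers, which is presumably why this case is labeled unrealistic.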


Pseudo-Code Routine: Part 1

_Processing for a single args/(i, j, t)-variant variable in a fully constructed (i, j, t)-results_space:_


Variable_name_arg = "clicks_"

# Check whether the variable exists
# variable_does_even_exist_lookup()

variant_result = 'ijt'  # dimensions over which the variable varies; needed for the WHERE clause

# Check that every dimension in variant_result is set
check(*args, variant_result):
    i, j, t = args
    i is not null and j is not null and t is not null?

# Calculate Unix time bounds for period t
u_lower_bound_from_t = u_lower(t)
u_upper_bound_from_t = u_upper(t)

# Collect exclusion criteria
# buildQueryWith (Criterion(), get ijt_WHERE_CLAUSE, ….)

# Example query (q1)
q1 :=
SELECT SUM(c.clicks),
       c.produkt_id,
       c.haendler_bez,
       DB_FUNC_GET_T_FROM_U(c.timestamp, UNIX_ORIGIN) AS db_t
FROM clicks c
WHERE
    c.produkt_id = i         -- get ijt_WHERE_CLAUSE
    AND c.haendler_bez = j
    AND c.timestamp BETWEEN u_lower_bound_from_t AND u_upper_bound_from_t
    AND NOT EXISTS (
        SELECT 1
        FROM scrapper_ips si
        WHERE c.user_ip = si.user_ip
    )  -- append EXCLUSION CRITERION
GROUP BY c.produkt_id, c.haendler_bez, db_t;

# Retrieve the result
getAsPl(q1)

# Variable value realization found
var_result_value = SUM(c.clicks)  # can be any SQL aggregate function (MIN, MAX, AVG, SUM)

# Perform a sanity check
sanity_check():
    db_t == t ?
    -26 <= t <= 832 ?
    -26 <= db_t <= 832 ?
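
The routine above can be tried end to end against an in-memory SQLite database. This is a sketch under stated assumptions, not the project's implementation: the value of `UNIX_ORIGIN`, the width of one t-step (seven days), the table schemas, and the helper names `t_from_u`, `clicks_for`, and `demo` are all hypothetical. Only the query shape, the table and column names, and the bounds -26 and 832 come from the pseudo-code.

```python
import sqlite3

# ASSUMPTIONS (not from the issue): origin value and a 7-day period width.
UNIX_ORIGIN = 1_000_000_000      # hypothetical origin, in Unix seconds
SECONDS_PER_T = 7 * 24 * 3600    # hypothetical width of one t-step
T_MIN, T_MAX = -26, 832          # bounds taken from the sanity check


def u_lower(t):
    """First Unix second that belongs to period t."""
    return UNIX_ORIGIN + t * SECONDS_PER_T


def u_upper(t):
    """Last Unix second that belongs to period t (inclusive, for BETWEEN)."""
    return u_lower(t + 1) - 1


def t_from_u(u, origin):
    """Inverse mapping; stands in for DB_FUNC_GET_T_FROM_U."""
    return (u - origin) // SECONDS_PER_T


Q1 = """
SELECT SUM(c.clicks), c.produkt_id, c.haendler_bez,
       DB_FUNC_GET_T_FROM_U(c.timestamp, :origin) AS db_t
FROM clicks c
WHERE c.produkt_id = :i
  AND c.haendler_bez = :j
  AND c.timestamp BETWEEN :lo AND :hi
  AND NOT EXISTS (SELECT 1 FROM scrapper_ips si WHERE c.user_ip = si.user_ip)
GROUP BY c.produkt_id, c.haendler_bez, db_t
"""


def clicks_for(conn, i, j, t):
    """Return the summed clicks for one (i, j, t) cell, or None if it is empty."""
    if i is None or j is None or t is None:
        raise ValueError("i, j and t must all be non-null")
    if not T_MIN <= t <= T_MAX:
        raise ValueError(f"t={t} outside [{T_MIN}, {T_MAX}]")
    params = {"i": i, "j": j, "lo": u_lower(t), "hi": u_upper(t),
              "origin": UNIX_ORIGIN}
    row = conn.execute(Q1, params).fetchone()
    if row is None:
        return None
    value, _, _, db_t = row
    # Sanity check: the period computed in the DB must match the requested one.
    if db_t != t:
        raise RuntimeError(f"db_t={db_t} does not match t={t}")
    return value


def demo():
    conn = sqlite3.connect(":memory:")
    conn.create_function("DB_FUNC_GET_T_FROM_U", 2, t_from_u)
    conn.executescript("""
        CREATE TABLE clicks (produkt_id INTEGER, haendler_bez TEXT,
                             timestamp INTEGER, user_ip TEXT, clicks INTEGER);
        CREATE TABLE scrapper_ips (user_ip TEXT);
    """)
    ts = u_lower(3) + 60  # a timestamp inside period t = 3
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?, ?, ?)", [
        (1, "shop_a", ts, "10.0.0.1", 2),
        (1, "shop_a", ts, "10.0.0.2", 5),
        (1, "shop_a", ts, "10.0.0.9", 100),  # scraper IP, must be excluded
    ])
    conn.execute("INSERT INTO scrapper_ips VALUES ('10.0.0.9')")
    return clicks_for(conn, 1, "shop_a", 3)
```

Calling `demo()` sums only the two rows from non-scraper IPs. Passing i, j, and the time bounds as named parameters keeps the WHERE clause free of string concatenation, and registering `t_from_u` with `create_function` lets the same period mapping be used in Python and inside the query.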