opensource-observer / oso

Measuring the impact of open source software
https://opensource.observer
Apache License 2.0
71 stars 16 forks source link

dbt: final clean up of int models feeding code & onchain metrics marts #1440

Closed ccerv1 closed 4 months ago

ccerv1 commented 4 months ago

What is it?

Each metric feeding the consolidated project metrics marts should be available as a standalone query, ie,

Then int_code_metrics_by_project should import all relevant queries in one big union statement and join on project data

The final code_metrics_by_project_v1 should simply import int_code_metrics_by_project

ccerv1 commented 4 months ago

The models should be good to go, however, I haven't been able to fully simulate everything in oso_playground. Copying a thread from Discord:

Something weird has been happening over the weekend. My attempts at dbt run have been hanging and I need to cancel them. The weird thing is that if I try to run the last model that was attempted before the hang, it goes through quickly without any issue. And then if I try dbt run again it hangs somewhere else in the sequence. I'm curious if you've encountered this or if there are any changes that could explain why this might be happening.

For example, right now it is hanging here:

40 of 122 OK created sql incremental model oso_playground.int_optimism_traces .. [SCRIPT (552.6 MiB processed) in 342.38s]

If I check the log, it says:

09:16:18.702344 [info ] [Thread-32 ]: 40 of 122 OK created sql incremental model oso_playground.int_optimism_traces .. [SCRIPT (552.6 MiB processed) in 342.38s]
09:16:18.703801 [debug] [Thread-32 ]: Finished running node model.opensource_observer.int_optimism_traces

And if I check the job in the BQ console, I see:

9:16 AM [11:46]    
SELECT STRUCT<STRUCT<ARRAY<TIMESTAMP>>>((
          select as struct
              -- IGNORE NULLS: this needs to be aligned to _dbt_max_partition, which ignores null
              array_agg(distinct timestamp_trunc(block_timestamp, day) IGNORE NULLS)
          from `opensource-observer`.`oso_playground`.`int_optimism_traces__dbt_tmp`
      )).*;
3    2.49 MB        
9:16 AM [21:5]    
merge into `opensource-observer`.`oso_playground`.`int_optimism_traces` as DBT_INTERNAL_DEST
        using (
        select
        * from `opensource-observer`.`oso_playground`.`int_optimism_traces__dbt_tmp`
      ) as DBT_INTERNAL_SOURCE
        on FALSE

    when not matched by source
         and timestamp_trunc(DBT_INTERNAL_DEST.block_timestamp, day) in unnest(dbt_partitions_for_replacement) 
        then delete

    when not matched then insert
        (`block_timestamp`, `id`, `block_number`, `block_hash`, `transaction_hash`, `transaction_index`, `from_address`, `to_address`, `value`, `input`, `output`, `trace_type`, `call_type`, `reward_type`, `gas`, `gas_used`, `subtraces`, `trace_address`, `error`, `status`, `trace_id`, `ingestion_time`, `address`)
    values
        (`block_timestamp`, `id`, `block_number`, `block_hash`, `transaction_hash`, `transaction_index`, `from_address`, `to_address`, `value`, `input`, `output`, `trace_type`, `call_type`, `reward_type`, `gas`, `gas_used`, `subtraces`, `trace_address`, `error`, `status`, `trace_id`, `ingestion_time`, `address`)
4    550.1 MB        
9:16 AM [40:7]    
drop table if exists `opensource-observer`.`oso_playground`.`int_optimism_traces__dbt_tmp`

(and all three queries succeeding)

ccerv1 commented 4 months ago

Also need to resolve this first: https://github.com/opensource-observer/oso/issues/1442