I was working on a personal project 5 months ago which had dbt, and this package, everything was working well.
Now as I was trying to run the project again, it is not going as well.
test_dbt__bi__yellow_m2m and dbt__bi__yellow_m2m_expected being seeds that have same configuration and are in the same directory and are also loaded to the same bigquery dataset.
the 2 seeds originate from 2 csv files with the same names.
Expected results
A big query job combining the 2 seeds and testing the results of apply a model to one of the input and comparing it to the output.
Actual results
the tool uses only 1 of the two table, the expected_output table, and attempts to create the other table.
/* {"app": "dbt", "dbt_version": "1.5.2", "profile_name": "default", "target_name": "dev", "node_id": "test.nytaxi.dbt_datamocktool_unit_test_dbt__bi__yellow_m2m_ref_seed_zones___ref_dbt__bi__yellow_m2m_expected___ref_test_dbt__bi__yellow_m2m_.ac26a1c36a"} */
with a as (
select
`vendors`
,
`year`
,
`month`
,
`monthly_fare_amount`
,
`average_passenger_count`
,
`average_trip_distance`
,
`monthly_extra_amount`
,
`monthly_mta_tax`
,
`monthly_tip_amount`
,
`monthly_tolls_amount`
,
`monthly_improvement_surcharge`
,
`monthly_total_amount`
,
`monthly_congestion_surcharge`
,
`pickup_borough`
,
`pickup_zone`
,
`dropoff_borough`
,
`dropoff_zone`
,
`avg_duration_minutes`
,
`trip_count`
,
`total_duration_hours`
,
`busiest_hour_of_day`
,
`airport_trip`
,
`payment_type`
from `stellarismusv5`.`models_ut`.`dbt__bi__yellow_m2m_expected`
),
b as (
select
`vendors`
,
`year`
,
`month`
,
`monthly_fare_amount`
,
`average_passenger_count`
,
`average_trip_distance`
,
`monthly_extra_amount`
,
`monthly_mta_tax`
,
`monthly_tip_amount`
,
`monthly_tolls_amount`
,
`monthly_improvement_surcharge`
,
`monthly_total_amount`
,
`monthly_congestion_surcharge`
,
`pickup_borough`
,
`pickup_zone`
,
`dropoff_borough`
,
`dropoff_zone`
,
`avg_duration_minutes`
,
`trip_count`
,
`total_duration_hours`
,
`busiest_hour_of_day`
,
`airport_trip`
,
`payment_type`
from -- depends_on: `stellarismusv5`.`models_seeds`.`seed_zones`
`stellarismusv5`.`models_historical`.`dbt__bi__yellow_m2m_dmt_49567973`
),
a_intersect_b as (
select * from a
intersect distinct
select * from b
),
a_except_b as (
select * from a
except distinct
select * from b
),
b_except_a as (
select * from b
except distinct
select * from a
),
all_records as (
select
*,
true as in_a,
true as in_b
from a_intersect_b
union all
select
*,
true as in_a,
false as in_b
from a_except_b
union all
select
*,
false as in_a,
true as in_b
from b_except_a
),
final as (
select * from all_records
where not (in_a and in_b)
order by in_a desc, in_b desc
)
select * from final
dbt__bi__yellow_m2m_dmt_49567973 is not a table that I have created. It is basically an identical copy of the table that is being tested, and it is even created in the same dataset as the original table.
while stellarismusv5.models_ut.dbt__bi__yellow_m2m_expected is the correct seed for expected output.
I checked issues first, and it does not look like this is a common issue, so I'd wager that this is something incorrect on my end.
I have spent the better part of the day fiddling around, trying things out, changing directories, names, nothing works. I cannot seem to
pinpoint the cause, so any hints would be appreciated.
Describe the bug
I was working on a personal project 5 months ago which had dbt, and this package, everything was working well. Now as I was trying to run the project again, it is not going as well.
Steps to reproduce
test_dbt__bi__yellow_m2m
anddbt__bi__yellow_m2m_expected
being seeds that have same configuration and are in the same directory and are also loaded to the same bigquery dataset. the 2 seeds originate from 2 csv files with the same names.Expected results
A big query job combining the 2 seeds and testing the results of apply a model to one of the input and comparing it to the output.
Actual results
the tool uses only 1 of the two table, the
expected_output
table, and attempts to create the other table.dbt__bi__yellow_m2m_dmt_49567973
is not a table that I have created. It is basically an identical copy of the table that is being tested, and it is even created in the same dataset as the original table.while
stellarismusv5
.models_ut
.dbt__bi__yellow_m2m_expected
is the correct seed for expected output.Screenshots and log output
System information
Which database are you using dbt with?
The output of
dbt --version
:Additional context
I checked issues first, and it does not look like this is a common issue, so I'd wager that this is something incorrect on my end. I have spent the better part of the day fiddling around, trying things out, changing directories, names, nothing works. I cannot seem to pinpoint the cause, so any hints would be appreciated.