Open benc-db opened 10 months ago
Thanks for raising this issue that you ran into during the 1.7.0 release @benc-db!
I will end up overriding all of this method (which is fairly long), just to accept 0-based indexing.
Yep, this makes sense, especially since we'd be unlikely to make a change prior to 1.7.0 being released.
Technically, there's nothing erroneous about the implementation of the test cases here. Rather, it's an ergonomic issue, so I'm going to re-label this as a feature request to rather than a bug.
Going forward, here's some ideas of how we could support adapters that have 0-based column indexes without needing so many changes.
Probably the easiest approach within dbt-core would be to add a parameter to expected_references_catalog
named something like initial_index
and have it default to initial_index=1
.
Then within dbt-databricks, you'd be able override expected_catalog
like this:
@pytest.fixture(scope="class")
def expected_catalog(self, project):
return expected_references_catalog(
project,
role=AnyString(),
id_type="bigint",
text_type="string",
time_type="timestamp",
bigint_type="bigint",
view_type="view",
table_type="table",
model_stats=_StatsLikeDict(),
initial_index=0,
)
Alternatively, we could instead define it as a fixture within dbt-core, like:
@pytest.fixture(scope="class")
def initial_index(self):
return 1
dbt-postgres has a macro named postgres__get_catalog_relations
. Let's pretend for a moment that it used 0-based indexing rather than 1-based indexing. Then I could convert it to be 1-based by updating this line like this instead:
col.attnum + 1 as column_index,
Lastly, we could define an adapter-wide interface in which the initial index could be defined and accessed.
In all of the scenarios above (except for the "controversial" one), the code here would be re-written to be something like:
"first_name": {
"name": "first_name",
"index": initial_index+1,
"type": text_type,
"comment": None,
},
summary_columns = {
"first_name": {
"name": "first_name",
"index": initial_index + 0,
"type": text_type,
"comment": None,
},
"ct": {
"name": "ct",
"index": initial_index + 1,
"type": bigint_type,
"comment": None,
},
}
In the short-term, I'd advocate for the "easiest approach" above (or the fixture-based alternative).
Agreed with either non-controversial approach. I started by doing the controversial approach honestly, then was uncertain about if there were other places that were going to break as a result :P.
Is this a new bug in dbt-core?
Current Behavior
I'm working on the 1.7.0 release for the Databricks adapter and I'm failing tests because expected_references_catalog assumes column indexing starts at 1, but in Databricks it starts at 0. In base_expected_catalog, in contrast, index is set to AnyInteger(). I will end up overriding all of this method (which is fairly long), just to accept 0-based indexing.
Expected Behavior
I can inherit the testing behavior without having the override this method to account for indexing differences.
Steps To Reproduce
Try to run an inheritor of BaseDocsGenReferences against a warehouse that uses 0-based indexing and watch it fail (or just look at the code and see that the indices are hard-coded).
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
other (mention it in "Additional Context")
Additional Context
No response