dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[ADAP-639] [CT-2708] [Feature] Add `rows affected` to AdapterResponse → `run_results.json` #812

Closed roberto-rosero closed 9 months ago

roberto-rosero commented 1 year ago

Is this your first time submitting a feature request?

Describe the feature

I noticed that in earlier versions of the JSON schema for run_results, the "rows_affected" field was present. Could you tell me whether that field is coming back, or what the best way to get it is? It is a very important field for the controls in our models.

Thanks.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

roberto-rosero commented 1 year ago

I use dbt-spark

dbeatty10 commented 1 year ago

Thank you for raising this @roberto-rosero !

Could you share your versions of dbt-core and dbt-spark? You can get these by running:

dbt --version

Could you also share the contents of adapter_response within your run_results.json?

I just tried with dbt-core==1.5.1 and dbt-postgres and my target/run_results.json did contain rows_affected (see below):

`target/run_results.json`

```json
{
  "metadata": {
    "dbt_schema_version": "https://schemas.getdbt.com/dbt/run-results/v4.json",
    "dbt_version": "1.5.1",
    "generated_at": "2023-06-21T02:23:15.785454Z",
    "invocation_id": "aa9b4abc-383f-4fca-bce4-e87b1fd392b1",
    "env": {}
  },
  "results": [
    {
      "status": "success",
      "timing": [
        {
          "name": "compile",
          "started_at": "2023-06-21T02:23:15.625258Z",
          "completed_at": "2023-06-21T02:23:15.628575Z"
        },
        {
          "name": "execute",
          "started_at": "2023-06-21T02:23:15.629350Z",
          "completed_at": "2023-06-21T02:23:15.747414Z"
        }
      ],
      "thread_id": "Thread-1",
      "execution_time": 0.12428998947143555,
      "adapter_response": {
        "_message": "SELECT 1",
        "code": "SELECT",
        "rows_affected": 1
      },
      "message": "SELECT 1",
      "failures": null,
      "unique_id": "model.my_project.my_model"
    }
  ],
  "elapsed_time": 0.48162102699279785,
  "args": {}
}
```
roberto-rosero commented 1 year ago

Hi @dbeatty10

Thanks for your response.

My current versions of dbt-core and dbt-spark are 1.5.1 and 1.5.0, respectively.

`"adapter_response": {"_message": "OK"}` is the only thing that appears in this segment.

I ran `dbt run --select mymodel`.
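For anyone wanting to double-check what their own run reports, a minimal sketch for pulling the `adapter_response` of each node out of the artifact (this assumes dbt's default `target/run_results.json` location; the helper name is mine, not a dbt API):

```python
import json
from pathlib import Path

def adapter_responses(run_results: dict) -> dict:
    """Map each node's unique_id to its adapter_response dict."""
    return {
        r["unique_id"]: r.get("adapter_response", {})
        for r in run_results.get("results", [])
    }

# Default artifact location after `dbt run`
path = Path("target/run_results.json")
if path.exists():
    for node, resp in adapter_responses(json.loads(path.read_text())).items():
        print(node, "->", resp.get("rows_affected"))
```

With dbt-spark, the printed value would be `None` for every node if the adapter does not populate `rows_affected`.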

dbeatty10 commented 1 year ago

@roberto-rosero After doing a little more research: the `get_response` method can vary across dbt adapters, and some may add `rows_affected` (like dbt-postgres) while others may not (like dbt-spark). Whether it can be included depends on whether the cursor object from the database driver reports the number of rows affected. DB API 2.0 defines the `.rowcount` cursor attribute, so it is often possible (but not always).
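As a rough illustration of the mechanism described above (not dbt's actual implementation; `FakeCursor` and this simplified `AdapterResponse` are stand-ins), an adapter's `get_response` can only report `rows_affected` if the driver's cursor exposes a usable `.rowcount`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AdapterResponse:
    _message: str
    rows_affected: Optional[int] = None

class FakeCursor:
    """Stand-in for a DB API 2.0 cursor; real drivers set .rowcount after execute()."""
    rowcount = 42

def get_response(cursor) -> AdapterResponse:
    """Build the response that ends up serialized into run_results.json."""
    return AdapterResponse(_message="OK", rows_affected=cursor.rowcount)

print(get_response(FakeCursor()).rows_affected)  # 42
```

When the driver never sets a meaningful `.rowcount`, there is simply nothing for the adapter to surface.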

Which version of dbt-spark were you using that had rows_affected within run_results.json? Is it possible that you saw that field within another adapter?

I'm going to transfer this issue to dbt-spark since it is specific to that adapter whether the AdapterResponse includes rows_affected or not.

roberto-rosero commented 1 year ago

thanks @dbeatty10

In fact, I thought this field was populated by every adapter, but according to what you say it is not.

I'll wait for the response from dbt-labs.

Thank you so much.

jtcohen6 commented 1 year ago

We have another (older) issue for this!

roberto-rosero commented 1 year ago

Thanks @jtcohen6

BTW, has there been any progress on this issue? Do you know if it is possible to get this field for Spark?

jtcohen6 commented 1 year ago

It seems unlikely.

It might be supported via ODBC: https://github.com/dbt-labs/dbt-spark/issues/142 / https://github.com/dbt-labs/dbt-spark/issues/497#issuecomment-1266302906.

But it looks like it's explicitly not supported by PyHive (it always returns -1): https://github.com/dbt-labs/dbt-spark/issues/497#issuecomment-1574885586
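Per the linked comments, PyHive reportedly hard-codes the cursor's `rowcount` to -1, which is the DB API 2.0 sentinel for "count not available". Any code building a response from such a driver would have to treat that sentinel as unknown rather than as a real count; a minimal sketch of that guard (the cursor class here is a stand-in mimicking the reported behavior, not PyHive itself):

```python
def normalize_rowcount(rowcount):
    """Map the DB API 2.0 -1 sentinel (count unavailable) to None."""
    return rowcount if rowcount is not None and rowcount >= 0 else None

class PyHiveLikeCursor:
    """Stand-in mimicking a driver that never reports row counts."""
    rowcount = -1

print(normalize_rowcount(PyHiveLikeCursor.rowcount))  # None
print(normalize_rowcount(3))  # 3
```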

github-actions[bot] commented 9 months ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 9 months ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.