microsoft / dbt-fabric

Job failures due to RemoteDisconnect #109

Open McKnight-42 opened 6 months ago

McKnight-42 commented 6 months ago

We have a user who reports having to repeatedly cancel their jobs because the connection is closed with a RemoteDisconnect error. This only happens in their dev environment; their prod environment runs as expected.

All connection settings between the environments are reported as being the same other than the database pointer.

Credential details

retries = 1
port = 1433
dbt = 1.7.3
fabric adapter = 1.7.1

They are using the service principal auth connection method.

16 threads for both dev & prod settings

For now, we have had them raise their retries limit to 5, and that seems to be helping.

prdpsvs commented 6 months ago

@McKnight-42, as of version 1.7.2 you no longer need to specify the port in the credentials. Both the ActiveDirectoryServicePrincipal and Service Principal auth options are accepted.

Can you paste the error? I can check whether anything can be improved on the adapter side when establishing the connection.

The number of retries can help with transient failures. Is this a consistent issue? If retries are working, then the network is most likely the cause.
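To illustrate why a higher retry count can paper over transient failures, here is a minimal Python sketch of a connect-with-retries loop with exponential backoff. It is not the adapter's actual implementation: `connect_with_retries`, the `connect` callable, `base_delay`, and the choice of `ConnectionError` as the retryable exception are all hypothetical and chosen only for illustration.

```python
import time


def connect_with_retries(connect, retries=5, base_delay=1.0,
                         retryable=(ConnectionError,), sleep=time.sleep):
    """Call connect() and retry on transient failures.

    `retries` is the number of extra attempts after the first one.
    All names and defaults here are assumptions for this sketch,
    not settings taken from dbt-fabric.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except retryable:
            if attempt >= retries:
                # Retry budget is used up: surface the last error.
                raise
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With `retries=1`, a connection that drops twice in a row fails the job; with `retries=5`, the same job would get through on the third attempt. If failures persist regardless of the retry count, the cause is probably not transient.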

prdpsvs commented 5 months ago

@McKnight-42, please see my comments here: https://github.com/microsoft/dbt-fabric/issues/112

prdpsvs commented 3 months ago

@McKnight-42, is the issue resolved? If not, can you share the actual error? FYI, here are the ODBC error codes: https://learn.microsoft.com/en-us/sql/odbc/reference/appendixes/appendix-a-odbc-error-codes?view=sql-server-ver15

McKnight-42 commented 3 months ago

Hey @prdpsvs, sorry, I wasn't getting pinged about these comments for some reason. I haven't seen any more communication about this; I will double-check whether the 1.7.4 patch fixed it.

McKnight-42 commented 3 months ago

Per the issue on our end, we haven't heard anything else from the customer about it, so it is possibly fixed. I'd say you're good to close this out for now; if I need to reopen it, I will.

taylorterwin commented 2 months ago

Hello @McKnight-42, Taylor here from dbt Solutions Engineering. We have reports that this is still occurring after the 1.7.4 patch was released. For example:

There is no further trace on our end for the error that occurs.

`2024-05-06 01:05:41.227649 (Thread-1 (worker)): 01:05:41  Runtime Error in model WE_STAFF_ADDLINFO (models/_staging/welland_export/WE_STAFF_ADDLINFO.sql)
  ('IMC06', '[IMC06] [Microsoft][ODBC Driver 18 for SQL Server]The connection is broken and recovery is not possible. The connection is marked by the client driver as unrecoverable. No attempt was made to restore the connection. (0) (SQLExecDirectW)')`

The versions in use are: Registered adapter: fabric=1.7.4, Running with dbt=1.7.13
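For anyone triaging similar logs, the following Python sketch shows one way to pull the SQLSTATE out of an ODBC error message like the one above and decide whether it looks like a connection-level failure. The helper names and the `CONNECTION_SQLSTATES` set are assumptions made for illustration; this is not something dbt or the fabric adapter is known to do.

```python
import re

# Assumed, non-exhaustive set of SQLSTATEs treated as connection-level
# failures for this sketch. IMC06 is the code in the log above; 08S01 and
# 08001 are standard ODBC connection-related states.
CONNECTION_SQLSTATES = {"08S01", "08001", "IMC06"}

# ODBC messages carry the five-character SQLSTATE in square brackets.
_SQLSTATE_RE = re.compile(r"\[([0-9A-Z]{5})\]")


def extract_sqlstate(message):
    """Return the first bracketed SQLSTATE in the message, or None."""
    match = _SQLSTATE_RE.search(message)
    return match.group(1) if match else None


def is_connection_error(message):
    """True if the message carries one of the assumed connection SQLSTATEs."""
    return extract_sqlstate(message) in CONNECTION_SQLSTATES
```

Applied to the message in the log, `extract_sqlstate` picks up `IMC06`, which this sketch classifies as a connection-level failure rather than a problem with the model's SQL.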