BuzzCutNorman / tap-mssql

Singer Tap for MS SQL built with Meltano Singer SDK.
MIT License

start_date not working #52

Closed tharwan closed 1 year ago

tharwan commented 1 year ago

Hi,

I am trying to use this tap with meltano to sync a big table. I would like the sync to start from a specific point in time. My config looks like this:

plugins:
  extractors:
  - name: tap-mssql
    variant: buzzcutnorman
    pip_url: git+https://github.com/BuzzCutNorman/tap-mssql.git
    config:
      user: user
      driver_type: pyodbc
      host: host
      database: db 
      dialect: mssql
      sqlalchemy_url_query:
        driver: ODBC Driver 17 for SQL Server
      start_date: '2023-06-07T00:00:00'
    select:
    - dbo-time_series_meta.*
    - dbo-time_series.*
    metadata:
      dbo-time_series_meta.*:
        replication-method: FULL_TABLE
      dbo-time_series.*:
        replication-method: INCREMENTAL
        replication-key: updated

When I test what data gets extracted with meltano invoke tap-mssql, I see the oldest data being emitted rather than data starting at start_date.

Is there any way to debug what is going on?

BuzzCutNorman commented 1 year ago

@tharwan Thanks for raising this issue. You did the correct debugging steps to see if the start_date was working. I tried a quick test scenario and found start_date to be working as expected. I was able to recreate the issue you are having by adding .* to the end of the tap_stream_id in the metadata: section. Please give the metadata: config below a try and let me know if it resolves the issue for you.

    metadata:
      dbo-time_series_meta:
        replication-method: FULL_TABLE
      dbo-time_series:
        replication-method: INCREMENTAL
        replication-key: updated

tharwan commented 1 year ago

yes that works thanks!

vmesel commented 1 year ago

And what if the config is a plain config.json file? How should I set up the INCREMENTAL syncs?

BuzzCutNorman commented 1 year ago

I believe you need to run the tap with --discover and redirect the output to a file. In that catalog you can say which tables and columns should be selected and what type of sync should be used for each stream. You can then use the --catalog flag to pass the edited catalog back to tap-mssql when you run it. Here is a link I found that might be helpful.

https://github.com/singer-io/getting-started/blob/master/docs/DISCOVERY_MODE.md
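
For anyone who would rather script the catalog edit than do it by hand, here is a minimal sketch in Python. It assumes the discovered catalog follows the usual Singer layout (a streams list, with stream-level metadata stored in the entry whose breadcrumb is empty). The helper name set_incremental and the sample catalog are hypothetical and only for illustration. Exactly which metadata keys tap-mssql honors is also an assumption here, so check the result against your own discovery output.

```python
import copy


def set_incremental(catalog, stream_id, replication_key):
    """Return a copy of `catalog` with one stream selected for INCREMENTAL sync.

    Hypothetical helper: assumes the Singer catalog layout, where stream-level
    metadata is the entry whose "breadcrumb" is an empty list.
    """
    updated = copy.deepcopy(catalog)
    for stream in updated.get("streams", []):
        # Match the exact tap_stream_id, with no ".*" suffix.
        if stream.get("tap_stream_id") != stream_id:
            continue
        entries = stream.setdefault("metadata", [])
        root = next((e for e in entries if not e.get("breadcrumb")), None)
        if root is None:
            # No stream-level entry yet, so create one.
            root = {"breadcrumb": [], "metadata": {}}
            entries.append(root)
        root.setdefault("metadata", {}).update(
            {
                "selected": True,
                "replication-method": "INCREMENTAL",
                "replication-key": replication_key,
            }
        )
        return updated
    raise KeyError(f"stream {stream_id!r} not found in catalog")


# Made-up stand-in for a discovered catalog, trimmed to the relevant fields.
sample_catalog = {
    "streams": [
        {
            "tap_stream_id": "dbo-time_series",
            "metadata": [{"breadcrumb": [], "metadata": {"selected": False}}],
        },
        {"tap_stream_id": "dbo-time_series_meta", "metadata": []},
    ]
}

edited_catalog = set_incremental(sample_catalog, "dbo-time_series", "updated")
```

In practice you would load the file produced by tap-mssql --config config.json --discover > catalog.json with json.load, apply the helper, write the result back with json.dump, and then run tap-mssql --config config.json --catalog catalog.json. The start_date setting itself stays in config.json.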