dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0

rest_api: Allow specifying a transformation function for the cursor field in incremental load configuration #507

Open burnash opened 1 week ago

burnash commented 1 week ago

Background

Currently, when defining an incremental load configuration, the cursor field value is expected to have the same type and format as the value of the parameter passed to the API.

For example, if the API expects a numeric timestamp value (e.g. 1718991859) for the date_from parameter, the cursor field should also be a numeric timestamp value.

Problem

Some APIs expect a date string (e.g. "2022-01-01") for the query string parameter but return a numeric timestamp value in the response, so the cursor value cannot be passed back to the API as-is.

Proposal

Allow users to specify a transformation function that will be applied to the cursor field value before it is used to query the API.

"from_date": {
    "type": "incremental",
    "cursor_path": "time",
    "start_value": date_from,
    "transform": "epoch_to_date"
},

Or using a Python function:

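from datetime import datetime

def epoch_to_date(epoch):
    # Convert a Unix timestamp (seconds) to a YYYY-MM-DD string in the local time zone.
    return datetime.fromtimestamp(epoch).strftime('%Y-%m-%d')

"from_date": {
    "type": "incremental",
    "cursor_path": "time",
    "start_value": date_from,
    "transform": epoch_to_date
},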
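
To make the proposal concrete, here is a minimal sketch of how such a transform could be resolved and applied before the request is built. It is not the rest_api source's actual code: NAMED_TRANSFORMS, resolve_transform and build_query_params are hypothetical names used only for illustration. It also assumes that start_value is given in the same format as the cursor field (an epoch timestamp here), and it converts in UTC so the result does not depend on the local time zone.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional, Union

Transform = Callable[[Any], Any]


def epoch_to_date(epoch: float) -> str:
    # Convert a Unix timestamp (seconds) to a YYYY-MM-DD string in UTC.
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%d")


# Hypothetical registry that maps the string form ("transform": "epoch_to_date")
# to a callable. Not part of any existing dlt API.
NAMED_TRANSFORMS: Dict[str, Transform] = {"epoch_to_date": epoch_to_date}


def resolve_transform(transform: Union[str, Transform, None]) -> Transform:
    # Accept a callable, a registered name, or nothing at all.
    if transform is None:
        return lambda value: value  # current behavior: pass the value through
    if callable(transform):
        return transform
    if transform not in NAMED_TRANSFORMS:
        raise ValueError(f"Unknown transform: {transform!r}")
    return NAMED_TRANSFORMS[transform]


def build_query_params(
    param_name: str, config: Dict[str, Any], last_value: Optional[Any]
) -> Dict[str, Any]:
    # Use the last seen cursor value, falling back to start_value on the first run.
    # Assumption: start_value has the same format as the cursor field.
    cursor_value = config["start_value"] if last_value is None else last_value
    transform = resolve_transform(config.get("transform"))
    return {param_name: transform(cursor_value)}


config = {
    "type": "incremental",
    "cursor_path": "time",
    "start_value": 0,
    "transform": "epoch_to_date",
}
print(build_query_params("from_date", config, last_value=1718991859))
```

Without a "transform" key the value is passed through unchanged, so existing configurations would keep working.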

Relevant Slack discussions:

Implementation