meltano / hub

The single source of truth for all Meltano plugins, including all available Singer Taps and Targets: https://hub.meltano.com

Discrepancy between SDK-native `capabilities` and those defined in SDK-based taps/targets `.yml` files #203

Open MeltyBot opened 2 years ago

MeltyBot commented 2 years ago

Migrated from GitLab: https://gitlab.com/meltano/hub/-/issues/207

Originally created by @ReubenFrankel on 2022-03-03 23:42:37


Summary

Some SDK-based taps/targets define capabilities that omit some of those supported, at a minimum, by the SDK version they were created with.

Currently, it is up to the contributor to specify the capabilities their tap or target supports. This is necessary for non-SDK based taps/targets, but not for SDK-based taps/targets since the SDK supports a number of capabilities out of the box.

Fix for current taps/targets

Create a script that finds all SDK-based taps/targets and appends any missing capabilities supported by the SDK version each one was created with.

Basic script draft

(start from _data/taps and _data/targets)

For each SDK-based tap/target (meltano_sdk: true):

  1. Get type (tap or target)
  2. Get pip_url
  3. Get current capabilities
  4. Create virtual environment
  5. Install tap/target from pip_url
  6. Run Python script sdk_capabilities_as_csv.py
  7. Update tap/target existing capabilities with those from script CSV output

# sdk_capabilities_as_csv.py
# Usage: python3 sdk_capabilities_as_csv.py <plugin-type>
# Prints the capabilities the installed SDK version provides for the plugin type, as CSV.

import sys
from typing import Type

import singer_sdk


def get_plugin_type_class(plugin_type: str) -> Type[singer_sdk.PluginBase]:
    """Return the SDK base class for the plugin type, falling back to PluginBase."""
    try:
        if plugin_type == "tap":
            return singer_sdk.Tap

        if plugin_type == "target":
            return singer_sdk.Target

        print(f"'{plugin_type}' is not a supported plugin type", file=sys.stderr)

    except AttributeError:
        # The installed SDK version does not expose the requested class
        print(f"'{plugin_type}' is not supported in this SDK version", file=sys.stderr)

    print("Falling back to base plugin capabilities", file=sys.stderr)
    return singer_sdk.PluginBase


def as_csv(values: list) -> str:
    """Join the values into a single comma-separated line."""
    return ",".join(str(v) for v in values)


if __name__ == "__main__":
    try:
        plugin_type = sys.argv[1]
    except IndexError:
        # sys.exit prints the message to stderr and exits with status 1
        sys.exit("Plugin type argument required")

    plugin_type_class = get_plugin_type_class(plugin_type)
    print(as_csv(plugin_type_class.capabilities))
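
Steps 4 to 7 of the draft above could be sketched roughly as follows. This is only a sketch under stated assumptions, not the hub's actual tooling: the function names `merge_capabilities` and `sdk_capabilities_for` are hypothetical, the virtual environment is assumed to use the POSIX layout (`bin/python`), and reading and writing the `.yml` files is left out.

```python
from __future__ import annotations

import shlex
import subprocess
import tempfile
import venv
from pathlib import Path


def merge_capabilities(existing: list[str], sdk_csv: str) -> list[str]:
    """Append capabilities from the script's CSV output that are not already listed."""
    merged = list(existing)
    for capability in sdk_csv.strip().split(","):
        capability = capability.strip()
        if capability and capability not in merged:
            merged.append(capability)
    return merged


def sdk_capabilities_for(plugin_type: str, pip_url: str, script: Path) -> str:
    """Install the plugin into a throwaway venv and run the capabilities script in it.

    Hypothetical helper: `script` is the path to sdk_capabilities_as_csv.py.
    """
    with tempfile.TemporaryDirectory() as env_dir:
        venv.EnvBuilder(with_pip=True).create(env_dir)
        python = Path(env_dir) / "bin" / "python"  # assumption: POSIX venv layout
        # pip_url may hold several pip arguments, so split it like a shell would
        subprocess.run(
            [str(python), "-m", "pip", "install", *shlex.split(pip_url)],
            check=True,
        )
        result = subprocess.run(
            [str(python), str(script), plugin_type],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout
```

Keeping the merge step append-only means capabilities a contributor declared on top of the SDK defaults are preserved, and their existing order is left unchanged.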

Fix for future taps/targets

Create some kind of pre-commit hook or CI step to automatically infer a tap/target's capabilities. This could be achieved using the --about flag and might also apply to other fields, such as name and settings:

poetry run sdk-tap-countries-sample --about --format json
{
  "name": "sample-tap-countries",
  "version": "[could not be detected]",
  "sdk_version": "0.3.5",
  "capabilities": [
    "sync",
    "catalog",
    "state",
    "discover"
  ],
  "settings": {
    "type": "object",
    "properties": {}
  }
}

This would only be available to taps/targets created using SDK version 0.3.11 or later, however.
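
A check along these lines could compare the `--about` JSON output with the capabilities declared in a plugin's `.yml` file. It is a minimal sketch only: the function name `missing_capabilities` and the declared list are hypothetical, and the sample JSON is the output shown above.

```python
from __future__ import annotations

import json

# Sample taken from the `--about --format json` output shown above
ABOUT_OUTPUT = """
{
  "name": "sample-tap-countries",
  "version": "[could not be detected]",
  "sdk_version": "0.3.5",
  "capabilities": ["sync", "catalog", "state", "discover"],
  "settings": {"type": "object", "properties": {}}
}
"""


def missing_capabilities(about_json: str, declared: list[str]) -> list[str]:
    """Return capabilities reported by --about that the hub definition lacks."""
    about = json.loads(about_json)
    return [c for c in about.get("capabilities", []) if c not in declared]


# Hypothetical declared list, standing in for the plugin's .yml `capabilities`
print(missing_capabilities(ABOUT_OUTPUT, ["catalog", "discover"]))  # ['sync', 'state']
```

A pre-commit hook or CI step could fail whenever the returned list is non-empty, or use it to patch the definition automatically.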

MeltyBot commented 2 years ago

View 2 previous comments from the original issue on GitLab