Closed aaronsteers closed 1 month ago
Some tests are failing because we had to remove `source-faker` as a dev dependency in order to get newer versions of the CDK to work. PR here bumps the CDK version in `source-faker` so we can bring it back as a dev dependency:
@natikgadzhi, @erohmensing, @bindipankhudi, @alafanechere, @bnchrch - This is ready for your review.
Tests are all passing except the Python 3.11 tests, which will be resolved soon via @natikgadzhi's CDK update here (just merged, pending release to PyPI): https://github.com/airbytehq/airbyte/pull/38846
The recent updates to the Airbyte module introduce new entities and functionalities, enhance existing modules, and add support for declarative YAML source testing. Key changes include adding the `records` entity, importing `snowflakecortex`, and integrating a `base` module in the `caches`. The `sources` have been significantly updated with new classes and methods for handling declarative sources. Additionally, new example files and updated tests ensure robust handling of connectors and sources.
| Files | Change Summaries |
|---|---|
| `airbyte/__init__.py` | Added `records` entity; replaced `experimental` with `records`. |
| `airbyte/_processors/sql/__init__.py` | Added import statement for `snowflakecortex`. |
| `airbyte/caches/__init__.py` | Added import statement for `base` module. |
| `airbyte/sources/declarative.py` | Introduced classes `DeclarativeExecutor` and `DeclarativeSource` for YAML sources. |
| `airbyte/sources/registry.py` | Added imports, constants, Enums, attributes, and updated functions for connectors. |
| `airbyte/sources/util.py` | Added imports, parameters, and logic for handling YAML manifest in `_get_source`. |
| `examples/...` | Added `run_declarative_manifest_source.py` and `run_downloadable_yaml_source.py`. |
| `pyproject.toml` | Updated dependency versions for `airbyte-cdk` and `airbyte-source-faker`. |
| `tests/conftest.py` | Added imports, modified fixtures, and mocked registry behavior for testing. |
| `tests/integration_tests/...` | Added and modified test functions for connectors and sources. |
| `tests/unit_tests/...` | Added and modified test functions and fixtures for unit testing connectors. |
sequenceDiagram
participant User
participant Airbyte
participant DeclarativeExecutor
participant Source
User->>Airbyte: Run declarative manifest source
Airbyte->>DeclarativeExecutor: Initialize with manifest
DeclarativeExecutor->>Source: Execute source with manifest
Source-->>DeclarativeExecutor: Return data
DeclarativeExecutor-->>Airbyte: Processed data
Airbyte-->>User: Display data
In the realm of Airbyte's code,
New records and sources showed,
YAML manifests now take flight,
Bringing data to the light.
With tests and imports all aligned,
A seamless flow you'll surely find.
🐇✨
This adds the ability to run (in theory) 130 declarative YAML sources in PyAirbyte, without any need for additional virtual environment isolation. The `manifest.yml` file content can be provided by the user or auto-downloaded from the `master` branch of `airbytehq/airbyte`.

Thanks to @bnchrch and @lmossman for helping figure out the logic.
The `get_source()` implementation in `airbyte.experimental` includes a new `source_manifest` input argument.

The argument can be any of these types:
- `Path` - A path to a local YAML file.
- `dict` - An already-parsed YAML manifest.
- `str` - A URL path to a YAML manifest.
- `True` - Indicates that PyAirbyte should find the YAML manifest at the default location, e.g.:

The YAML-runnable connectors can be found using `ab.get_available_connectors(install_type="yaml")` or `ab.get_available_connectors(InstallType.YAML)`.
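To make the four accepted `source_manifest` forms concrete, here is a minimal, standalone sketch of how such an argument could be dispatched on its type. It does not import PyAirbyte and is not the actual `get_source()` logic; the function name `describe_manifest_arg` and the returned labels are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Union

# The four forms listed above. `bool` stands in for the literal `True`.
ManifestArg = Union[Path, dict, str, bool]


def describe_manifest_arg(source_manifest: ManifestArg) -> str:
    """Return a label for how a `source_manifest` value would be interpreted.

    Hypothetical helper: the labels are illustrative, not PyAirbyte names.
    """
    # Handle the literal `True` first so it is never confused with other types.
    if source_manifest is True:
        return "default-location"
    if isinstance(source_manifest, dict):
        return "parsed-manifest"
    if isinstance(source_manifest, Path):
        return "local-file"
    if isinstance(source_manifest, str):
        return "url"
    # Assumption: this sketch only covers the four listed forms.
    raise TypeError(f"Unsupported source_manifest value: {source_manifest!r}")


if __name__ == "__main__":
    for value in (True, {"version": "0.0.0"}, Path("manifest.yml"), "https://example.com/manifest.yml"):
        print(repr(value), "->", describe_manifest_arg(value))
```

Checking for `True` by identity before the `isinstance` checks keeps the boolean case separate from the string and path cases; how values such as `False` or `None` are treated in PyAirbyte is not covered here.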
This PR also adds hard-coded exclusions for connectors in three categories:
Usage example
See the 2 new scripts in the `examples` directory for more examples, but the simplest usage is just:

In the above example, the source `manifest.yml` is automatically located from the `master` branch of `airbytehq/airbyte`, and the only change from the user perspective is to add the arg `source_manifest=True`.

Note
Included Connectors
This is the result of calling `get_available_connectors("yaml")`:

```
- source-activecampaign
- source-aha
- source-aircall
- source-appfollow
- source-apple-search-ads
- source-ashby
- source-auth0
- source-babelforce
- source-breezometer
- source-callrail
- source-captain-data
- source-chargify
- source-chartmogul
- source-clickup-api
- source-clockify
- source-coda
- source-coin-api
- source-coingecko-coins
- source-coinmarketcap
- source-configcat
- source-confluence
- source-convertkit
- source-copper
- source-datadog
- source-datascope
- source-delighted
- source-dixa
- source-dockerhub
- source-dremio
- source-drift
- source-emailoctopus
- source-exchange-rates
- source-flexport
- source-freshcaller
- source-freshsales
- source-freshservice
- source-fullstory
- source-gainsight-px
- source-getlago
- source-glassfrog
- source-gocardless
- source-gong
- source-google-pagespeed-insights
- source-google-webfonts
- source-gutendex
- source-harvest
- source-hellobaton
- source-hubplanner
- source-insightly
- source-intruder
- source-ip2whois
- source-k6-cloud
- source-klarna
- source-klaus-api
- source-launchdarkly
- source-lemlist
- source-lever-hiring
- source-lokalise
- source-mailerlite
- source-mailersend
- source-mailgun
- source-mailjet-mail
- source-mailjet-sms
- source-marketo
- source-merge
- source-metabase
- source-microsoft-teams
- source-n8n
- source-nasa
- source-news-api
- source-newsdata
- source-nytimes
- source-omnisend
- source-onesignal
- source-open-exchange-rates
- source-openweather
- source-opsgenie
- source-orbit
- source-oura
- source-pendo
- source-persistiq
- source-pexels-api
- source-pivotal-tracker
- source-plaid
- source-plausible
- source-pokeapi
- source-polygon-stock-api
- source-postmarkapp
- source-primetric
- source-punk-api
- source-pypi
- source-recreation
- source-recruitee
- source-reply-io
- source-ringcentral
- source-rocket-chat
- source-sap-fieldglass
- source-secoda
- source-sendgrid
- source-sendinblue
- source-sentry
- source-serpstat
- source-smartengage
- source-sonar-cloud
- source-spacex-api
- source-square
- source-statuspage
- source-strava
- source-survey-sparrow
- source-tempo
- source-timely
- source-tmdb
- source-todoist
- source-toggl
- source-tvmaze-schedule
- source-twilio-taskrouter
- source-twitter
- source-tyntec-sms
- source-visma-economic
- source-vitally
- source-waiteraid
- source-whisky-hunter
- source-wikipedia-pageviews
- source-workable
- source-workramp
- source-wrike
- source-yahoo-finance-price
- source-yotpo
- source-zapier-supported-storage
- source-zenefits
```
Hard-coded exclusions have already been removed from this list; for instance, low-code connectors that require one or more Python code files are not shown.
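As a rough illustration of how a listing filtered by install type with hard-coded exclusions might work, here is a self-contained sketch. It is not the PyAirbyte registry code: `ConnectorEntry`, `EXCLUDED_FROM_YAML`, `available_connectors`, the `PYTHON` enum member, and the sample entries (including the made-up `source-needs-python-code`) are all assumptions for illustration. Only `InstallType.YAML` and the `"yaml"` string form come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List, Union


class InstallType(str, Enum):
    YAML = "yaml"
    PYTHON = "python"  # assumption: a second install type, for contrast only


@dataclass(frozen=True)
class ConnectorEntry:
    """Hypothetical registry entry: a name plus its supported install types."""
    name: str
    install_types: FrozenSet[InstallType]


# Hypothetical exclusion list, e.g. low-code connectors that also need Python files.
EXCLUDED_FROM_YAML = frozenset({"source-needs-python-code"})


def available_connectors(
    entries: Iterable[ConnectorEntry],
    install_type: Union[InstallType, str],
) -> List[str]:
    """Return sorted connector names supporting `install_type`, minus exclusions."""
    wanted = InstallType(install_type)  # accepts "yaml" or InstallType.YAML
    names = []
    for entry in entries:
        if wanted not in entry.install_types:
            continue
        # Exclusions apply only to the YAML listing in this sketch.
        if wanted is InstallType.YAML and entry.name in EXCLUDED_FROM_YAML:
            continue
        names.append(entry.name)
    return sorted(names)


# Illustrative sample data, not the real registry contents.
SAMPLE_ENTRIES = [
    ConnectorEntry("source-pokeapi", frozenset({InstallType.YAML})),
    ConnectorEntry("source-needs-python-code", frozenset({InstallType.YAML})),
    ConnectorEntry("source-faker", frozenset({InstallType.PYTHON})),
]

if __name__ == "__main__":
    print(available_connectors(SAMPLE_ENTRIES, "yaml"))
    print(available_connectors(SAMPLE_ENTRIES, InstallType.YAML))
```

Deriving `InstallType` from `str` lets the same function accept either the string or the enum member, mirroring the two call forms shown for `get_available_connectors`.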
Summary by CodeRabbit
New Features
Enhancements
- `ConnectorMetadata` now includes language and installation types.
- `get_available_connectors` now handles different installation types.
Dependencies
- Updated `airbyte-cdk` to `^1.2.1`.
- Updated `airbyte-source-faker` to `^6.1.2`.
Tests