airbytehq / PyAirbyte

PyAirbyte brings the power of Airbyte to every Python developer.
https://docs.airbyte.com/pyairbyte

Offline mode #428

Closed aurany closed 3 weeks ago

aurany commented 1 month ago

We're evaluating pyairbyte in our offline environment. When trying to run some simple examples we're getting connection errors since outgoing traffic is blocked. Is it possible to run pyairbyte in offline mode?

Simple example

import airbyte as ab

source = ab.get_source(
    "source-faker",
    config={"count": 5},
    install_if_missing=False,
)

source.check()

Error: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='connectors.airbyte.com', port=443): Max retries exceeded with url: /files/registries/v0/oss_registry.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f6271a80610>: Failed to establish a new connection: [Errno 111] Connection refused'))

EDIT: Also mentioned here: https://github.com/airbytehq/airbyte/discussions/45606

aaronsteers commented 1 month ago

Hi, @aurany. Thanks for raising this!

As far as I can tell, there are two places where Airbyte makes remote calls. The first is telemetry, which you can disable with DO_NOT_TRACK=1 as described here: https://docs.airbyte.com/operator-guides/telemetry
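For anyone who prefers to set this from Python rather than the shell, a minimal sketch follows. It assumes the variable only needs to be present in the process environment; setting it before the import is a precaution, not something confirmed in this thread.

```python
import os

# Opt out of telemetry for this process. Setting the variable before
# importing airbyte is a precaution (an assumption, not confirmed here).
os.environ["DO_NOT_TRACK"] = "1"

# import airbyte as ab  # uncomment in an environment where PyAirbyte is installed
```

Exporting DO_NOT_TRACK=1 in the shell before starting Python has the same effect on the process environment.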

The other is the connector registry lookup that you mention above. It can sometimes help us locate metadata about the connector, but when the user provides their own config, it is likewise not strictly required for any core function.

Neither of these remote calls is critical to PyAirbyte's core function. What we probably need to do is add error handling to the connector registry lookup, so that it fails gracefully rather than fatally when a remote connection is not available.
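To make the "fail gracefully" idea concrete, here is a rough sketch using only the standard library. The function names (download_registry, load_registry_or_none) are hypothetical and this is not PyAirbyte's actual implementation; the URL is taken from the error message above.

```python
import json
import urllib.request
from typing import Callable, Optional

# Registry URL as it appears in the ConnectionError reported in this issue.
REGISTRY_URL = "https://connectors.airbyte.com/files/registries/v0/oss_registry.json"


def download_registry(url: str = REGISTRY_URL, timeout: float = 5.0) -> dict:
    """Fetch and parse the registry JSON (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_registry_or_none(fetch: Callable[[], dict] = download_registry) -> Optional[dict]:
    """Return the registry, or None when the network is unreachable.

    Connection failures from urllib (and from requests) are OSError
    subclasses, so a single except clause covers both.
    """
    try:
        return fetch()
    except OSError as exc:
        print(f"Connector registry unavailable, continuing without it: {exc}")
        return None
```

A caller would then treat None as "no registry metadata" and skip any step that depends on it, instead of aborting.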

Alternatively, we could add a new environment variable: either (1) allow setting AIRBYTE_CONNECTOR_REGISTRY to something like False to skip the check, or (2) create something like AIRBYTE_OFFLINE_MODE=True.
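As a rough illustration of the second option, the check could look something like the sketch below. The helper name offline_mode_enabled and the set of accepted truthy values are assumptions for illustration only, and AIRBYTE_OFFLINE_MODE is the name proposed above, not necessarily what a final implementation would use.

```python
import os
from typing import Mapping, Optional

# Values treated as "on" (an assumption; the proposal only mentions True).
_TRUTHY = {"1", "true", "yes", "on"}


def offline_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the proposed AIRBYTE_OFFLINE_MODE flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("AIRBYTE_OFFLINE_MODE", "").strip().lower() in _TRUTHY
```

The registry lookup would consult this flag first and skip the remote call entirely when it returns True.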

There are probably more solutions to this problem, but above are the ideas that immediately come to mind.

Would you be interested in contributing a fix for this issue? If so, I'm happy to work with you on a path like one described above or another idea if you have a different proposal.

aaronsteers commented 1 month ago

@aurany - I see another community member (@ziggekatten) has run into a similar question and found a workaround here:

aurany commented 1 month ago

> @aurany - I see another community member (@ziggekatten) has run into a similar question and found a workaround here:

Thank you very much for this! This seems to work as a temporary solution when evaluating.

> Would you be interested in contributing a fix for this issue? If so, I'm happy to work with you on a path like one described above or another idea if you have a different proposal.

Yes! We'll start with some evaluation first to see if pyairbyte is for us =)

aaronsteers commented 1 month ago

Thanks, @aurany! Glad this was helpful! Feel free to open another issue if you run into more questions, or find us in Slack.

I'll leave this issue open and will add the "accepting pull requests" label in the meantime.

aaronsteers commented 3 weeks ago

This has shipped. Thanks to @niyasrad for contributing this feature! 🙏 🎉