airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[airbyte-cdk] ValueError: mutable default <class 'airbyte_cdk.sources.declarative.decoders.json_decoder.JsonDecoder'> for field _decoder is not allowed: use default_factory #33795

Closed maver1ck closed 1 month ago

maver1ck commented 6 months ago

Connector Name

source-nasa

Connector Version

1.0.0

At what step did the error happen?

Configuring a new connector

Relevant information

I'm trying to integrate YamlDeclarativeSource with Langchain based on this documentation: https://python.langchain.com/docs/integrations/document_loaders/airbyte_cdk

This is the relevant part of my code:

from langchain.document_loaders.airbyte import AirbyteCDKLoader
from source_nasa import SourceNasa 

config = {
    "api_key": "DEMO_KEY",
}

It's failing on import.

What can I do?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 2
      1 from langchain.document_loaders.airbyte import AirbyteCDKLoader
----> 2 from source_nasa import SourceNasa  # plug in your own source here
      4 config = {
      5     "api_key": "DEMO_KEY",
      6 }

File ~/src/airbyte-langchain/source_nasa/__init__.py:6
      1 #
      2 # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
      3 #
----> 6 from .source import SourceNasa
      8 __all__ = ["SourceNasa"]

File ~/src/airbyte-langchain/source_nasa/source.py:5
      1 #
      2 # Copyright (c) 2023 Airbyte, Inc., all rights reserved.
      3 #
----> 5 from airbyte_cdk.sources.declarative.yaml_declarative_source import YamlDeclarativeSource
      7 """
      8 This file provides the necessary constructs to interpret a provided declarative YAML configuration file into
      9 source connector.
     10 
     11 WARNING: Do not modify this file.
     12 """
     15 # Declarative Source

File ~/.virtualenvs/airbyte-langchain-jcqg/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/yaml_declarative_source.py:8
      5 import pkgutil
      7 import yaml
----> 8 from airbyte_cdk.sources.declarative.manifest_declarative_source import ManifestDeclarativeSource
      9 from airbyte_cdk.sources.declarative.types import ConnectionDefinition
     12 class YamlDeclarativeSource(ManifestDeclarativeSource):

File ~/.virtualenvs/airbyte-langchain-jcqg/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py:28
     26 from airbyte_cdk.sources.declarative.parsers.manifest_component_transformer import ManifestComponentTransformer
     27 from airbyte_cdk.sources.declarative.parsers.manifest_reference_resolver import ManifestReferenceResolver
---> 28 from airbyte_cdk.sources.declarative.parsers.model_to_component_factory import ModelToComponentFactory
     29 from airbyte_cdk.sources.declarative.types import ConnectionDefinition
     30 from airbyte_cdk.sources.message import MessageRepository

File ~/.virtualenvs/airbyte-langchain-jcqg/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:17
     15 from airbyte_cdk.sources.declarative.auth.oauth import DeclarativeSingleUseRefreshTokenOauth2Authenticator
     16 from airbyte_cdk.sources.declarative.auth.selective_authenticator import SelectiveAuthenticator
---> 17 from airbyte_cdk.sources.declarative.auth.token import (
     18     ApiKeyAuthenticator,
     19     BasicHttpAuthenticator,
     20     BearerAuthenticator,
     21     LegacySessionTokenAuthenticator,
     22 )
     23 from airbyte_cdk.sources.declarative.auth.token_provider import InterpolatedStringTokenProvider, SessionTokenProvider, TokenProvider
     24 from airbyte_cdk.sources.declarative.checks import CheckStream

File ~/.virtualenvs/airbyte-langchain-jcqg/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/auth/token.py:12
     10 import requests
     11 from airbyte_cdk.sources.declarative.auth.declarative_authenticator import DeclarativeAuthenticator
---> 12 from airbyte_cdk.sources.declarative.auth.token_provider import TokenProvider
     13 from airbyte_cdk.sources.declarative.interpolation.interpolated_string import InterpolatedString
     14 from airbyte_cdk.sources.declarative.requesters.request_option import RequestOption, RequestOptionType

File ~/.virtualenvs/airbyte-langchain-jcqg/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/auth/token_provider.py:31
     26     @abstractmethod
     27     def get_token(self) -> str:
     28         pass
---> 31 @dataclass
     32 class SessionTokenProvider(TokenProvider):
     33     login_requester: Requester
     34     session_token_path: List[str]

File /opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dataclasses.py:1230, in dataclass(cls, init, repr, eq, order, unsafe_hash, frozen, match_args, kw_only, slots, weakref_slot)
   1227     return wrap
   1229 # We're called as @dataclass without parens.
-> 1230 return wrap(cls)

File /opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dataclasses.py:1220, in dataclass.<locals>.wrap(cls)
   1219 def wrap(cls):
-> 1220     return _process_class(cls, init, repr, eq, order, unsafe_hash,
   1221                           frozen, match_args, kw_only, slots,
   1222                           weakref_slot)

File /opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dataclasses.py:958, in _process_class(cls, init, repr, eq, order, unsafe_hash, frozen, match_args, kw_only, slots, weakref_slot)
    955         kw_only = True
    956     else:
    957         # Otherwise it's a field of some type.
--> 958         cls_fields.append(_get_field(cls, name, type, kw_only))
    960 for f in cls_fields:
    961     fields[f.name] = f

File /opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dataclasses.py:815, in _get_field(cls, a_name, a_type, default_kw_only)
    811 # For real fields, disallow mutable defaults.  Use unhashable as a proxy
    812 # indicator for mutability.  Read the __hash__ attribute from the class,
    813 # not the instance.
    814 if f._field_type is _FIELD and f.default.__class__.__hash__ is None:
--> 815     raise ValueError(f'mutable default {type(f.default)} for field '
    816                      f'{f.name} is not allowed: use default_factory')
    818 return f

ValueError: mutable default <class 'airbyte_cdk.sources.declarative.decoders.json_decoder.JsonDecoder'> for field _decoder is not allowed: use default_factory
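For readers hitting the same traceback, here is a minimal sketch of what the message means, independent of the CDK. The `JsonDecoder` class below is a hypothetical stand-in, not the real `airbyte_cdk` class, and `BrokenProvider` / `FixedProvider` are illustrative names. It assumes the real field was declared with a decoder instance as its default; the error message suggests this, but the CDK source is not quoted in this thread.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class JsonDecoder:
    """Hypothetical stand-in for the CDK class named in the error message."""
    parameters: dict = field(default_factory=dict)


def define_broken():
    # A shared instance as the default: rejected at class-definition time
    # on Python 3.11+, because plain dataclass instances are unhashable.
    @dataclass
    class BrokenProvider:
        _decoder: JsonDecoder = JsonDecoder()
    return BrokenProvider


def define_fixed():
    # default_factory builds a fresh decoder for every instance instead.
    @dataclass
    class FixedProvider:
        _decoder: JsonDecoder = field(default_factory=JsonDecoder)
    return FixedProvider


try:
    define_broken()
    broken_error = None
except ValueError as exc:
    broken_error = str(exc)

FixedProvider = define_fixed()
a, b = FixedProvider(), FixedProvider()

print(sys.version_info[:2], "->", broken_error)
print("separate decoder per instance:", a._decoder is not b._decoder)
```

The error is raised while the class body is being processed by `@dataclass`, which is why it surfaces on import rather than when a connector is instantiated.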

Relevant log output

No response


natikgadzhi commented 3 months ago

I just hit the same problem in source-declarative-manifest with Python 3.11. Python 3.10 works fine.
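The version difference is consistent with a change in the standard library: up to Python 3.10, `dataclasses` only refused `list`, `dict`, and `set` defaults, while 3.11 refuses any default whose class is unhashable (the check visible at `dataclasses.py:814` in the traceback above). A rough sketch of the two rules follows; the function and class names are hypothetical and only mirror the checks, they are not part of `dataclasses`.

```python
from dataclasses import dataclass


def rejected_by_310_rule(default) -> bool:
    # Python 3.10 and earlier: only these three built-in types are refused.
    return isinstance(default, (list, dict, set))


def rejected_by_311_rule(default) -> bool:
    # Python 3.11+: any value whose class is unhashable is refused.
    return default.__class__.__hash__ is None


@dataclass
class Decoder:  # hypothetical; eq=True without frozen=True sets __hash__ to None
    pass


@dataclass(frozen=True)
class FrozenDecoder:  # hypothetical; frozen + eq generates a __hash__
    pass


for value in ([], Decoder(), FrozenDecoder()):
    print(type(value).__name__, rejected_by_310_rule(value), rejected_by_311_rule(value))
```

A regular (non-frozen) dataclass instance passes the older rule but fails the newer one, which matches a default that imports cleanly on 3.10 and raises on 3.11.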

stationeros commented 1 month ago

Is there a workaround for this? I am facing this issue on Python 3.10 as well.

talweissler commented 1 month ago

The same is happening to me with source-zendesk-support. It started happening with version 2.6.0 of airbyte-source-zendesk-support (when it was moved to cdk:low-code).

MoranM commented 1 month ago

same here (Python 3.11) :(

natikgadzhi commented 1 month ago

I’ll take a look. It’s weird that you see this on Python 3.10; that should not happen.

aaronsteers commented 1 month ago

Just started running into this also, while working on the declarative manifest interop for PyAirbyte:

aaronsteers commented 1 month ago

Looks like this may be fixed by a PR from @natikgadzhi :

natikgadzhi commented 1 month ago

Yes, the fix for this just shipped in https://github.com/airbytehq/airbyte/pull/38846. @MoranM, @talweissler, @stationeros, we have not released a new CDK version yet, so you'd have to use the CDK from master for this to work.

If this is still a problem in a couple of weeks or in the next CDK version, please tell me!

simonguertin commented 1 week ago

I have a similar issue with the PrestaShop connector. Is this specific to each connector, or is it a shared code issue that should be fixed by a new version of the CDK?

natikgadzhi commented 1 week ago

@simonguertin it's possible that a specific connector's code just uses the same pattern (a dataclass field with a mutable default value instead of a default factory) that needs to be cleaned up to support 3.11. It's not a CDK issue per se (we've cleared that up), but one specific to the connector.

We'll work on getting all connectors to update to a new base image running Python 3.11 sometime soon, but in the meantime, you're very welcome to file an issue with repro steps about PrestaShop, and I would be very happy to help with reviewing if you put together a PR.

simonguertin commented 1 week ago

Thank you for the quick response!