Open jmaddern-fw opened 1 week ago
Hi there, could you accomplish writing to multiple schemas that are present in the source by using the setting "namespaceFormat"?
If you select source here, we will write to the namespace we detect from the source.
Or are you trying to write to a different namespace for each stream which is not present in the source?
Hi @nataliekwong - we are already using the equivalent of ${SOURCE_NAMESPACE}, but that isn't the problem.
Using this (also above) as an example:
- postgres_db
  - schema_1
    - table_1
    - table_2
  - schema_2
    - table_1
    - table_2
  - etc.
Source: It is possible to have two separate schemas in the source as shown
Connection: Assuming all of the above schemas/tables in the example are in the same CDC replication slot/publication:
So the issue is that we cannot programmatically define multiple namespaces
@nataliekwong in addition to what is explained above: when using public-api to create connections, we want to be able to send something like this within our payload:
"configurations": {
  "streams": [
    {
      "name": "table_1",
      "syncMode": "full_refresh_overwrite",
      "namespace": "schema_1"
    },
    {
      "name": "table_1",
      "syncMode": "full_refresh_overwrite",
      "namespace": "schema_2"
    }
  ]
},
Ideally this should work as it does with server-api, but with public-api it returns an error:
duplicate stream found in configuration for table_1.
This means the namespace no longer makes a stream unique, as it previously did. PS: I have 1000+ schemas to read from, all with identical tables.
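To make the difference concrete, here is a minimal Python sketch of the two ways a duplicate check could key a stream. This is not Airbyte's actual validation code, which I have not looked at; `duplicate_keys` is a hypothetical helper, and the only thing taken from this thread is the payload shape and the error behaviour.

```python
from collections import Counter


def duplicate_keys(streams, key):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(key(stream) for stream in streams)
    return [k for k, n in counts.items() if n > 1]


# Same two streams as in the payload above.
streams = [
    {"name": "table_1", "syncMode": "full_refresh_overwrite", "namespace": "schema_1"},
    {"name": "table_1", "syncMode": "full_refresh_overwrite", "namespace": "schema_2"},
]

# Keyed on name only: both entries collide, which matches the error we see.
by_name = duplicate_keys(streams, key=lambda s: s["name"])

# Keyed on (namespace, name): the two entries are distinct streams.
by_namespace_and_name = duplicate_keys(
    streams, key=lambda s: (s.get("namespace"), s["name"])
)

print(by_name)                # ['table_1']
print(by_namespace_and_name)  # []
```

If the check were keyed on the (namespace, name) pair, the payload above would pass, which is the behaviour we had with server-api.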
Topic
Add "Namespace" definition to "Stream"
Relevant information
I have a scenario where I have multiple (around 100) Postgres sources of identical schemas, similar to:
I have transitioned from octavia to terraform and can see that, in that shift, support for a namespace within a stream was lost: the new public-api does not have it, although the old server-api supported it and the UI still does. This is a limitation of public-api.
Additional Details:
The deprecated Configuration API (server-api) has the field "namespace" included in the "stream" object:
https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/connections/create
and you're able to send it to the backend. For example, in the browser, I can see that the POST payload to http://AIRBYTE_WEBAPP/api/v1/web_backend/connections/create looks like this:
At the same time, the public API has no equivalent parameter: https://reference.airbyte.com/reference/createconnection
The object "configurations[].streams[]" has only the "name", "syncMode", "cursorField", "primaryKey" and "selectedFields" parameters. I have no idea why the public-api is cut down compared with server-api. The public API should really support the same features as the UI.
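For illustration, this is roughly how the stream list could be generated for many identical schemas if a per-stream namespace were accepted. It is only a sketch: the "namespace" key is the field being requested here, not something the public API is known to accept, and `build_streams` and the schema/table names are made up for the example.

```python
import json


def build_streams(schemas, tables, sync_mode="full_refresh_overwrite"):
    """Build one stream entry per (schema, table) pair.

    The "namespace" key is the requested addition; it is an assumption,
    not a documented public-api field.
    """
    return [
        {"name": table, "syncMode": sync_mode, "namespace": schema}
        for schema in schemas
        for table in tables
    ]


# Hypothetical names; in practice these would come from the source catalog.
schemas = [f"schema_{i}" for i in range(1, 4)]
tables = ["table_1", "table_2"]

payload = {"configurations": {"streams": build_streams(schemas, tables)}}
print(json.dumps(payload, indent=2))
```

With 1000+ schemas the same loop applies; only the `schemas` list changes.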