Open · waleedarshad-vf opened this issue 4 years ago
@dmosorast again, your help is needed.
@all I have also added a detailed reference here: https://stackoverflow.com/questions/62793871/keyerror-id-critical-exception-writing-records-keyerror-id-singer-i
@waleedarshad-vf Did you find the solution?
@bonysimon007 No. We moved to Stitch Data, which is the enterprise version, and that worked for us.
From the looks of this article, my best guess is that it stems from the schema not being specified in the catalog.json:
https://titanwolf.org/Network/Articles/Article?AID=b4eb664e-70f4-430c-b568-bdd0d32bcf35#gsc.tab=0
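If anyone wants to confirm that guess, a quick check along these lines should flag any stream whose schema came back empty. This is only a minimal sketch: it assumes the standard Singer catalog layout (streams with tap_stream_id and schema keys) and a catalog.json produced by discovery mode.

import json

# List streams in the discovered catalog whose schema has no properties,
# which appears to be what trips up the target later on.
with open('catalog.json') as f:
    catalog = json.load(f)

for stream in catalog['streams']:
    properties = stream.get('schema', {}).get('properties', {})
    if not properties:
        print(f"stream {stream['tap_stream_id']} has an empty schema")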
I managed to get around this by using https://github.com/transferwise/pipelinewise-tap-mongodb and https://github.com/transferwise/pipelinewise-target-postgres
I wrote some Python code to edit the catalog that comes out after tap-mongodb --config config/singer/tap-mongodb-config.json --discover > catalog.json
import json
import pathlib

input_file = 'catalog.json'
output_file = 'mongo/catalog.json'

with open(input_file) as f:
    data = json.load(f)

for stream in data['streams']:
    stream_metadata = stream['metadata'][0]
    stream_metadata['metadata']['selected'] = True
    stream_metadata['metadata']['replication-method'] = 'FULL_TABLE'
    # https://github.com/singer-io/getting-started/blob/master/docs/DISCOVERY_MODE.md
    # https://github.com/singer-io/tap-mongodb/issues/48
    stream['schema']['properties'] = {
        "_id": {
            "type": [
                "string",
                "null"
            ]
        },
        "document": {
            "type": [
                "object",
                "array",
                "string",
                "null"
            ]
        }
    }

pathlib.Path(output_file).parent.mkdir(parents=True, exist_ok=True)
with open(output_file, 'w') as json_file:
    json.dump(data, json_file)
This will sync each entire MongoDB document into a single JSONB column in PostgreSQL.
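If you want to sanity-check what actually landed in Postgres after the sync, something like the following works. Just a sketch: the connection string and my_schema.my_collection below are placeholders, since the actual schema and table names depend on your target-postgres config and collection name; the _id and document columns correspond to the two schema properties set above.

import psycopg2

# Connection details and table name are examples only -- adjust to your setup.
conn = psycopg2.connect("postgresql://user:password@localhost:5432/warehouse")
with conn, conn.cursor() as cur:
    cur.execute("SELECT _id, document FROM my_schema.my_collection LIMIT 5")
    for _id, document in cur.fetchall():
        # psycopg2 decodes the JSONB column into Python objects automatically
        print(_id, str(document)[:120])
conn.close()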
If you then want to expand the JSONB column into a full table in PostgreSQL, this can be done with SQL statements. However, if the document keys in MongoDB use camelCase, this causes odd behaviour because PostgreSQL conventionally uses snake_case. I was able to define the json_keys_to_snake_case function given here: https://rudism.com/converting-json-keys-to-snake-case-in-postgresql/ and then run the following:
CREATE TEMPORARY TABLE snake_case (document jsonb);

INSERT INTO snake_case
SELECT json_keys_to_snake_case
FROM <SCHEMA_NAME>.<TABLE_NAME> m, json_keys_to_snake_case(m.document);

INSERT INTO <SCHEMA_NAME>.<TABLE_NAME> (
    _id,
    document_key1,
    document_key2
)
SELECT
    t."_id",
    t."document_key1",
    t."document_key2"
FROM
    snake_case m
CROSS JOIN LATERAL
    jsonb_to_record(m.document) AS t(
        _id TEXT,
        document_key1 <DATA_TYPE>,
        document_key2 <DATA_TYPE>
    );

DROP TABLE IF EXISTS snake_case;
Where <SCHEMA_NAME>, <TABLE_NAME>, and <DATA_TYPE> should be replaced with your own schema name, table name, and column types.
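If it helps, a rough way to generate that column list from one sample document is something like this. Purely illustrative: the regex mirrors the camelCase-to-snake_case idea from the linked article, the sample document is made up, and the type mapping is a deliberate simplification.

import json
import re

def camel_to_snake(name):
    # Insert an underscore before each inner capital letter, then lowercase
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

# A sample document copied from the raw JSONB column
sample = json.loads('{"_id": "abc123", "firstName": "Ada", "signupCount": 3}')

# Very rough Python-type to Postgres-type mapping; anything unmapped stays JSONB
type_map = {str: 'TEXT', int: 'INTEGER', float: 'DOUBLE PRECISION', bool: 'BOOLEAN'}

columns = [
    f"{camel_to_snake(key)} {type_map.get(type(value), 'JSONB')}"
    for key, value in sample.items()
]
print(',\n'.join(columns))  # paste into the jsonb_to_record(...) column list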
Mongo to Redshift: I got this error. What could be the possible reasons?