Circus Train should detect when a table is an Avro table that may have an external schema, and should automatically trigger the ct-avro transform to copy that schema over to the replica data lake.
Context
At the moment, we have a circus-train-avro transform which gets triggered only when the following configuration is provided in the CT config file:
If the configuration is not provided, CT treats the replication as a regular replication. As a result, the replica table's avro.schema.url parameter still points to the source table's schema location, which is incorrect.
Proposed solution:
CT should detect that the table being replicated is an Avro table, trigger the ct-avro transform automatically, and use the table's location as the default location for the schema.
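A minimal sketch of what the detection could look like is given below. It is not the existing ct-avro code. The class `AvroTableDetector` and its method names are hypothetical, and the sketch takes plain strings and maps rather than Hive metastore objects so that it is self-contained. It assumes that a table counts as an Avro table when its SerDe library is Hive's `org.apache.hadoop.hive.serde2.avro.AvroSerDe`, that `avro.schema.url` may be set either in the table parameters or in the SerDe parameters, that "the table's location" means the replica table's location, and that the copied schema keeps its original file name.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Circus Train: illustrates the proposed detection only.
final class AvroTableDetector {
    // Fully qualified name of Hive's Avro SerDe.
    static final String AVRO_SERDE = "org.apache.hadoop.hive.serde2.avro.AvroSerDe";
    // Hive property that points to an external Avro schema file.
    static final String AVRO_SCHEMA_URL = "avro.schema.url";

    private AvroTableDetector() {}

    // Assumption: a table is an Avro table if its SerDe library is the Avro SerDe.
    static boolean isAvroTable(String serdeLib) {
        return AVRO_SERDE.equals(serdeLib);
    }

    // Returns the external schema URL if one is set, checking table parameters
    // first and SerDe parameters second.
    static Optional<String> externalSchemaUrl(Map<String, String> tableParams,
                                              Map<String, String> serdeParams) {
        String url = tableParams.get(AVRO_SCHEMA_URL);
        if (url == null || url.trim().isEmpty()) {
            url = serdeParams.get(AVRO_SCHEMA_URL);
        }
        if (url == null || url.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(url.trim());
    }

    // Assumption: with no explicit configuration, the schema file is placed directly
    // under the replica table's location, keeping its original file name.
    static String replicaSchemaUrl(String sourceSchemaUrl, String replicaTableLocation) {
        String base = replicaTableLocation.endsWith("/")
                ? replicaTableLocation.substring(0, replicaTableLocation.length() - 1)
                : replicaTableLocation;
        String fileName = sourceSchemaUrl.substring(sourceSchemaUrl.lastIndexOf('/') + 1);
        return base + "/" + fileName;
    }
}
```

With a check like this, CT could apply the transform whenever `isAvroTable` is true and `externalSchemaUrl` returns a value, and fall back to regular replication otherwise. An explicitly configured schema location, if present, would presumably still take precedence over the default.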