As of now, when data comes in through the ingest endpoint, we only check that it is formatted properly: if the endpoint expects a JSON object we check that the payload is a JSON object, and if it expects a JSON array we check for a JSON array.
However, ingestion endpoints are actually defined following a data model. Currently, if an event arrives at the ingestion endpoint that is valid JSON but doesn't fit the schema, the HTTP API returns a 200 success response and the event is put on a Kafka topic as is. The syncing process, which syncs data from the topic to the table, then picks it up, and if the event doesn't match the schema the data is dropped at that point.
This is not a great user experience; we would like the user to get feedback as early as possible that their data doesn't fit the schema (structure and data types).
If the event doesn't match the schema (structure and types) of the data model, the API should return a 400 and drop the data.
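A minimal sketch of what the ingest-time check could look like, assuming a hypothetical data model represented as a map of field names to primitive column types (the framework's real schema representation may differ; `handleIngest`, `publish`, and the field names are illustrative only):

```typescript
// Hypothetical data model: field name -> expected primitive type.
type ColumnType = "string" | "number" | "boolean";
type DataModel = Record<string, ColumnType>;

interface IngestResponse {
  status: 200 | 400;
  body: { errors?: string[] };
}

// Validate structure and field types before anything touches Kafka.
// On mismatch, return 400 and never hand the event to the producer.
function handleIngest(
  event: unknown,
  model: DataModel,
  publish: (event: object) => void, // stand-in for the Kafka producer
): IngestResponse {
  // Structure check: the event must be a JSON object, not a scalar or array.
  if (typeof event !== "object" || event === null || Array.isArray(event)) {
    return { status: 400, body: { errors: ["event is not a JSON object"] } };
  }
  const record = event as Record<string, unknown>;

  // Type check: every field declared in the data model must be present and
  // carry the declared primitive type.
  const errors: string[] = [];
  for (const [field, expected] of Object.entries(model)) {
    const value = record[field];
    if (value === undefined) {
      errors.push(`missing field: ${field}`);
    } else if (typeof value !== expected) {
      errors.push(`${field}: expected ${expected}, got ${typeof value}`);
    }
  }

  if (errors.length > 0) {
    return { status: 400, body: { errors } };
  }

  publish(record); // only schema-conforming events reach the topic
  return { status: 200, body: {} };
}

// Example: one conforming event (200) and one with a type mismatch (400).
const model: DataModel = { userId: "string", amount: "number" };
console.log(handleIngest({ userId: "u1", amount: 3 }, model, () => {}));
console.log(handleIngest({ userId: "u1", amount: "3" }, model, () => {}));
```

Returning the collected errors in the 400 body is one possible design; the key point is that the mismatch is reported synchronously instead of being silently dropped later by the sync process.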
For dates, we should validate that they are ISO 8601 dates in string form.
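A sketch of that date check, assuming dates arrive as strings: a regex first rejects values that are not shaped like ISO 8601, then Date.parse confirms the value is a real calendar date. The exact set of accepted ISO 8601 variants (date-only vs. full timestamps, offsets) is an assumption to be settled against the data model spec.

```typescript
// Accepts "YYYY-MM-DD" and "YYYY-MM-DDTHH:MM:SS(.sss)?(Z|±HH:MM)?".
const ISO_8601 =
  /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?)?$/;

function isIso8601DateString(value: unknown): boolean {
  // Must be a string; numeric epoch timestamps would be rejected here.
  if (typeof value !== "string" || !ISO_8601.test(value)) {
    return false;
  }
  // Reject well-shaped but impossible dates (e.g. month 13).
  return !Number.isNaN(Date.parse(value));
}

console.log(isIso8601DateString("2024-05-01T12:30:00Z")); // true
console.log(isIso8601DateString("05/01/2024")); // false: not ISO 8601
console.log(isIso8601DateString(1714566600000)); // false: not a string
```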