MeltanoLabs / tap-gitlab

Singer.io Tap for extracting data from Gitlab's API
GNU Affero General Public License v3.0

Schema throws a validation error with certain targets #88

Closed by ericboucher 1 year ago

ericboucher commented 1 year ago

[on release 2.0.0]

When using target-postgres, we get the following error:

CRITICAL ('`schema` is an invalid JSON Schema instance: {"type": "SCHEMA", "stream": "vulnerabilities", "schema": {"properties": {"project_id": {"type": ["null", "integer"]}, "author_id": {"type": ["null", "integer"]}, "confidence": {"type": ["null", "string"]}, "created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "description": {"type": ["null", "string"]}, "dismissed_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "dismissed_by_id": {"type": ["null", "integer"]}, "due_date": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "finding": {"properties": {"confidence": {"type": ["null", "string"]}, "created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "id": {"type": ["null", "integer"]}, "location_fingerprint": {"type": ["null", "string"]}, "metadata_version": {"type": ["null", "string"]}, "name": {"type": ["null", "string"]}, "primary_identifier_id": {"type": ["null", "integer"]}, "project_fingerprint": {"type": ["null", "string"]}, "project_id": {"type": ["null", "integer"]}, "raw_metadata": {"type": ["null", "string"]}, "report_type": {"type": ["null", "string"]}, "scanner_id": {"type": ["null", "integer"]}, "severity": {"type": ["null", "string"]}, "updated_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "uuid": {"type": ["null", "string"]}, "vulnerability_id": {"type": ["null", "integer"]}}, "type": "object"}, "id": {"type": ["null", "integer"]}, "last_edited_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "last_edited_by_id": {"type": ["null", "integer"]}, "project": {"properties": {"created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "description": {"type": ["null", "string"]}, "id": {"type": ["null", "integer"]}, "name": {"type": ["null", "string"]}, "name_with_namespace": {"type": ["null", "string"]}, "path": {"type": ["null", "string"]}, 
"path_with_namespace": {"type": ["null", "string"]}}, "title": {"type": ["null", "string"]}, "type": "object"}}, "type": "object"}, "key_properties": ["id"]}\n', "{'type': ['null', 'string']} is not of type 'string'\n\nFailed validating 'type' in schema['properties']['properties']['additionalProperties']['properties']['title']:\n    {'type': 'string'}\n\nOn instance['properties']['project']['title']:\n    {'type': ['null', 'string']}")
Traceback (most recent call last):
  File "src/load/.venv/bin/target-postgres", line 8, in <module>
    sys.exit(cli())
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/__init__.py", line 46, in cli
    main(args.config)
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/__init__.py", line 40, in main
    target_tools.main(postgres_target)
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/target_tools.py", line 28, in main
    stream_to_target(input_stream, target, config=config)
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/target_tools.py", line 77, in stream_to_target
    raise e
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/target_tools.py", line 58, in stream_to_target
    _line_handler(state_tracker,
  File "src/load/.venv/lib/python3.10/site-packages/target_postgres/target_tools.py", line 115, in _line_handler
    raise TargetError('`schema` is an invalid JSON Schema instance: {}'.format(line), *schema_validation_errors)
target_postgres.exceptions.TargetError: ('`schema` is an invalid JSON Schema instance: {"type": "SCHEMA", "stream": "vulnerabilities", "schema": {"properties": {"project_id": {"type": ["null", "integer"]}, "author_id": {"type": ["null", "integer"]}, "confidence": {"type": ["null", "string"]}, "created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "description": {"type": ["null", "string"]}, "dismissed_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "dismissed_by_id": {"type": ["null", "integer"]}, "due_date": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "finding": {"properties": {"confidence": {"type": ["null", "string"]}, "created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "id": {"type": ["null", "integer"]}, "location_fingerprint": {"type": ["null", "string"]}, "metadata_version": {"type": ["null", "string"]}, "name": {"type": ["null", "string"]}, "primary_identifier_id": {"type": ["null", "integer"]}, "project_fingerprint": {"type": ["null", "string"]}, "project_id": {"type": ["null", "integer"]}, "raw_metadata": {"type": ["null", "string"]}, "report_type": {"type": ["null", "string"]}, "scanner_id": {"type": ["null", "integer"]}, "severity": {"type": ["null", "string"]}, "updated_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "uuid": {"type": ["null", "string"]}, "vulnerability_id": {"type": ["null", "integer"]}}, "type": "object"}, "id": {"type": ["null", "integer"]}, "last_edited_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "last_edited_by_id": {"type": ["null", "integer"]}, "project": {"properties": {"created_at": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}]}, "description": {"type": ["null", "string"]}, "id": {"type": ["null", "integer"]}, "name": {"type": ["null", "string"]}, "name_with_namespace": {"type": ["null", "string"]}, "path": 
{"type": ["null", "string"]}, "path_with_namespace": {"type": ["null", "string"]}}, "title": {"type": ["null", "string"]}, "type": "object"}}, "type": "object"}, "key_properties": ["id"]}\n', "{'type': ['null', 'string']} is not of type 'string'\n\nFailed validating 'type' in schema['properties']['properties']['additionalProperties']['properties']['title']:\n    {'type': 'string'}\n\nOn instance['properties']['project']['title']:\n    {'type': ['null', 'string']}")

After investigating a bit and running the schema through https://www.jsonschemavalidator.net/, it seems that "title" is treated as a reserved name: simply renaming it to "title_" makes the validation error go away. I am not sure what the underlying issue is or how best to fix it. Maybe we could rename "title" to "project_title"? Has anyone else run into this?
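
One possible reading of the error: the validator path instance['properties']['project']['title'] suggests that "title" sits next to "properties" inside the "project" object rather than inside it, so it is interpreted as the JSON Schema "title" annotation keyword, whose value must be a string. The Python sketch below (standard library only) walks a schema and reports such cases. The helper name find_misplaced_titles and the trimmed-down schemas are hypothetical and made up for illustration. Placing "title" at the top level of the stream in the fixed variant is an assumption, not something confirmed by the tap.

```python
def find_misplaced_titles(schema, path="schema"):
    """Return paths where the `title` keyword holds a non-string value.

    Hypothetical helper: it only follows `properties`, `items`, and the
    `anyOf`/`oneOf`/`allOf` combinators, which is enough for this schema.
    """
    problems = []
    if not isinstance(schema, dict):
        return problems

    # At schema level, `title` is an annotation keyword and must be a string.
    if "title" in schema and not isinstance(schema["title"], str):
        problems.append(f"{path}['title']")

    # Inside `properties`, keys are field names, so a field called "title"
    # is fine there; only the sub-schemas themselves are checked.
    properties = schema.get("properties")
    if isinstance(properties, dict):
        for name, sub in properties.items():
            problems += find_misplaced_titles(sub, f"{path}['properties'][{name!r}]")

    for keyword in ("anyOf", "oneOf", "allOf"):
        for i, sub in enumerate(schema.get(keyword, [])):
            problems += find_misplaced_titles(sub, f"{path}[{keyword!r}][{i}]")

    if isinstance(schema.get("items"), dict):
        problems += find_misplaced_titles(schema["items"], f"{path}['items']")

    return problems


# Trimmed-down version of the `vulnerabilities` schema from the error above.
broken = {
    "type": "object",
    "properties": {
        "id": {"type": ["null", "integer"]},
        "project": {
            "type": "object",
            "properties": {
                "name": {"type": ["null", "string"]},
            },
            # Sits beside `properties`, so it is read as the `title` keyword.
            "title": {"type": ["null", "string"]},
        },
    },
}

# Assumed fix: `title` becomes a regular field of the stream.
fixed = {
    "type": "object",
    "properties": {
        "id": {"type": ["null", "integer"]},
        "title": {"type": ["null", "string"]},
        "project": {
            "type": "object",
            "properties": {
                "name": {"type": ["null", "string"]},
            },
        },
    },
}

print(find_misplaced_titles(broken))
print(find_misplaced_titles(fixed))
```

With the two sample schemas, the first call reports schema['properties']['project']['title'] and the second reports nothing. If that reading is right, renaming the field only hides the problem, and the fix would be to move "title" into the correct "properties" block.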