Received the following error, showing STRUCT is not supported:
Failed to pull com.github.kamu-cli.stargazers: 0: Internal error 1: This feature is not implemented: Unsupported SQL type Custom(ObjectName([Ident { value: "STRUCT", quote_style: None }]), ["id", "BIGINT", "login", "STRING"])
Currently the only workaround is to add a preprocessing step with jq, which is too complex. Could STRUCT be supported during the read phase in DataFusion?
Below is the dataset definition I worked with:
kind: DatasetSnapshot
version: 1
content:
  name: com.github.kamu-cli.stargazers
  kind: Root
  metadata:
    - kind: SetPollingSource
      fetch:
        kind: Url
        url: https://api.github.com/repos/kamu-data/kamu-cli/stargazers
        headers:
          - name: User-Agent
            value: kamu
          - name: Accept
            value: application/vnd.github.star+json
      read:
        kind: Json
        schema:
          - starred_at TIMESTAMP
          - user STRUCT(id BIGINT, login STRING)
      preprocess:
        kind: Sql
        engine: datafusion
        query: |
          SELECT
            starred_at as event_time,
            user.id as user_id,
            user.login as user_name
          FROM input
      merge:
        kind: Snapshot
        primaryKey:
          - event_time
          - user_id
    - kind: SetInfo
      description: Stars of the selected github repository.
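For reference, the jq workaround mentioned above might look roughly like the fragment below. This is only a sketch: it assumes the polling source accepts a `prepare` step of kind `Pipe` and a `NdJson` reader, and the flattened column names `user_id` and `user_name` are chosen here for illustration. The jq filter emits one flat object per stargazer, so no STRUCT type is needed in the read schema.

```yaml
# Sketch only: replaces the read/preprocess parts of the SetPollingSource above.
# The `prepare`/`Pipe` step and the `NdJson` reader are assumptions about the manifest format.
prepare:
  - kind: Pipe
    command:
      - jq
      - "-c"
      # Flatten each array element into a single-level object
      - '.[] | {starred_at, user_id: .user.id, user_name: .user.login}'
read:
  kind: NdJson
  schema:
    - starred_at TIMESTAMP
    - user_id BIGINT
    - user_name STRING
preprocess:
  kind: Sql
  engine: datafusion
  query: |
    SELECT
      starred_at as event_time,
      user_id,
      user_name
    FROM input
```

The `fetch` and `merge` sections would stay as in the original definition. This is the extra complexity the request is trying to avoid.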
Note that when using the EthereumLogs source, it is possible to end up with a root dataset containing nested data. Until nested structs are fully supported, we should detect this case and emit a clear warning.
I hit this issue while trying to pull a data source with nested fields, using the dataset definition above.