snowflakedb / snowflake-ingest-java

Java SDK for the Snowflake Ingest Service -
http://www.snowflake.net
Apache License 2.0

Do we need to send only structured or JSON (semi-structured) data when we use Snowflake streaming? #754

Closed prateekkohli21 closed 5 months ago

prateekkohli21 commented 5 months ago

Hi,

I have just started using Snowflake and have a basic query.

I am using the SDK to send streaming data to Snowflake directly from my Java application. I am able to send semi-structured JSON data directly, store it in a VARIANT column, and parse that JSON in Snowflake.

But will the SDK support Parquet or Avro binary data if I send it directly from my Java application to a VARIANT column, given that this data is in binary format? Or will we have to implement a connector that converts the Parquet/Avro binary data to a structured or JSON format before sending it to Snowflake's VARIANT column?

As per my understanding, Avro binary and Parquet data can be read from files that also contain their schema, so it does not make sense to send Avro binary or Parquet data directly to Snowflake, as it won't be able to parse it.

Please let me know if my understanding is correct.

Thanks

sfc-gh-lsembera commented 5 months ago

Hi, for VARIANT columns, the SDK only supports JSON. You will need to write a piece of code that converts your Parquet/Avro data to JSON.
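The conversion step described above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not part of the SDK: `RecordToJson` is a hypothetical helper, and it assumes the Parquet/Avro record has already been decoded by a reader library into plain Java values (`Map`, `Iterable`, `String`, numbers, booleans, `byte[]`/`ByteBuffer`). Encoding binary fields as Base64 strings is an assumption made here, since JSON has no binary type.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Map;

// Hypothetical helper (not part of snowflake-ingest-java): serializes an
// already-decoded record into a JSON string for a VARIANT column.
final class RecordToJson {

    static String toJson(Object value) {
        StringBuilder sb = new StringBuilder();
        write(value, sb);
        return sb.toString();
    }

    private static void write(Object v, StringBuilder sb) {
        if (v == null) {
            sb.append("null");
        } else if (v instanceof Map) {
            // Nested records become JSON objects; keys are stringified.
            sb.append('{');
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) v).entrySet()) {
                if (!first) sb.append(',');
                first = false;
                quote(String.valueOf(e.getKey()), sb);
                sb.append(':');
                write(e.getValue(), sb);
            }
            sb.append('}');
        } else if (v instanceof Iterable) {
            // Lists and other collections become JSON arrays.
            sb.append('[');
            boolean first = true;
            for (Object item : (Iterable<?>) v) {
                if (!first) sb.append(',');
                first = false;
                write(item, sb);
            }
            sb.append(']');
        } else if (v instanceof byte[]) {
            // Assumption: binary fields are carried as Base64 strings.
            quote(Base64.getEncoder().encodeToString((byte[]) v), sb);
        } else if (v instanceof ByteBuffer) {
            // Copy via duplicate() so the caller's buffer position is untouched.
            ByteBuffer buf = ((ByteBuffer) v).duplicate();
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            quote(Base64.getEncoder().encodeToString(bytes), sb);
        } else if ((v instanceof Double || v instanceof Float)
                && !Double.isFinite(((Number) v).doubleValue())) {
            // JSON has no NaN/Infinity literals, so emit null for those.
            sb.append("null");
        } else if (v instanceof Number || v instanceof Boolean) {
            sb.append(v);
        } else {
            // Strings, enums, and anything else fall back to a quoted string.
            quote(v.toString(), sb);
        }
    }

    private static void quote(String s, StringBuilder sb) {
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        sb.append('"');
    }
}
```

The string returned by `RecordToJson.toJson(record)` would then be used as the value for the VARIANT column in the row sent through the SDK, in the same way as the JSON data that already works in the original question. In practice, a JSON library or the reader library's own JSON encoder would likely be a better choice than hand-rolled serialization.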

cc @sfc-gh-xhuang

prateekkohli21 commented 5 months ago

Thanks @sfc-gh-lsembera.