apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

unable to load the data from avro file into Hudi table #11588

Open Pavan792reddy opened 1 week ago

Pavan792reddy commented 1 week ago


Describe the problem you faced


We are trying to load data into a Hudi table but are hitting the error shown in the stacktrace below.

Steps to reproduce the behavior:

1. df3.write.format("org.apache.hudi").
     option("hoodie.datasource.write.precombine.field", "email").
     option("hoodie.datasource.write.recordkey.field", "ssn").
     option("hoodie.datasource.write.partitionpath.field", "address").
     option("hoodie.database.name", "test_q6").
     option("hoodie.table.name", "bnr_dl_avro").
     option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
     option("hoodie.datasource.write.operation", "upsert").
     option("hoodie.datasource.write.hive_style_partitioning", "true").
     option("hoodie.datasource.meta.sync.enable", "true").
     option("hoodie.datasource.hive_sync.mode", "hms").
     option("hoodie.schema.on.read.enable", "false").
     option("hoodie.datasource.hive_sync.metastore.uris", "thrift://dp2-1-vpc-m:9083").
     option("checkpointLocation", "gs://test-dp-hudi/checkpoints_test").
     mode("Overwrite").
     save("gs://test-dp-hudi/bnr_dl_avro")

Expected behavior


Environment Description

Additional context


Stacktrace


WARN HoodieSparkSqlWriter$: hoodie table at gs://test-dp-hudi/bnr_dl_avro already exists. Deleting existing data & overwriting with new data.
java.lang.ClassCastException: class org.apache.spark.sql.types.StructType cannot be cast to class org.apache.spark.sql.types.MapType (org.apache.spark.sql.types.StructType and org.apache.spark.sql.types.MapType are in unnamed module of loader 'app')

Pavan792reddy commented 1 week ago

Below is the schema of the DataFrame that we are trying to load.
Schema:

root
|-- id: string (nullable = false)
|-- client_name: array (nullable = false)
| |-- element: string (containsNull = false)
|-- data_comment: string (nullable = false)
|-- company: struct (nullable = true)
| |-- foi: struct (nullable = true)
| | |-- account_num: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- address: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- phone: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- username: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- email: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- ip: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- ssn: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- password: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- card_num: array (nullable = false)
| | | |-- element: string (containsNull = false)
| |-- credentials: struct (nullable = true)
| | |-- username: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- password: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- hash: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- password_cleartext: boolean (nullable = false)
| | |-- salt: string (nullable = false)
| | |-- type: string (nullable = false)
| | |-- notes: string (nullable = false)
| |-- email: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- bitcoin: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- webmoney: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- mtcn: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- hash: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- base64str: string (nullable = false)
| |-- base64clear: string (nullable = false)
| |-- creditcard: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- phone: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- ssn: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- name: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- gender: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- dob: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- address: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- trackdata: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- fingerprint: string (nullable = false)
| |-- process_name: string (nullable = false)
| |-- machine_name: string (nullable = false)
| |-- machine_domain: string (nullable = false)
| |-- machine_user_name: string (nullable = false)
| |-- panel_comment: string (nullable = false)
| |-- first_seen: string (nullable = false)
| |-- last_seen: string (nullable = false)
| |-- wallet: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- payment_card: struct (nullable = true)
| | |-- card_number: string (nullable = false)
| | |-- cvv: string (nullable = false)
| | |-- exp: string (nullable = false)
| | |-- name: string (nullable = false)
| | |-- address: string (nullable = false)
| | |-- zip: string (nullable = false)
| | |-- city: string (nullable = false)
| | |-- state: string (nullable = false)
| | |-- country: string (nullable = false)
| | |-- phone: string (nullable = false)
| |-- bot: struct (nullable = true)
| | |-- bot_id: string (nullable = false)
| | |-- report_id: string (nullable = false)
| | |-- malware_type: string (nullable = false)
| | |-- source_name: string (nullable = false)
| | |-- report_date: string (nullable = false)
| | |-- memo: string (nullable = false)
| | |-- inject_url_pattern: string (nullable = false)
| | |-- inject_name: string (nullable = false)
| |-- header: struct (nullable = true)
| | |-- header: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- map: map (nullable = false)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = false)
| | |-- useragent: struct (nullable = true)
| | | |-- operating_system_name_version: string (nullable = false)
| | | |-- agent_name_version_major: string (nullable = false)
| | | |-- minor: string (nullable = false)
| | | |-- agent_version: string (nullable = false)
| | | |-- device_class: string (nullable = false)
| | | |-- operating_system_version: string (nullable = false)
| | | |-- layout_engine_name_version_major: string (nullable = false)
| | | |-- operating_system_class: string (nullable = false)
| | | |-- major: string (nullable = false)
| | | |-- device_cpu_bits: string (nullable = false)
| | | |-- agent_class: string (nullable = false)
| | | |-- layout_engine_version_major: string (nullable = false)
| | | |-- layout_engine_class: string (nullable = false)
| | | |-- layout_engine_name_version: string (nullable = false)
| | | |-- agent_name: string (nullable = false)
| | | |-- layout_engine_name: string (nullable = false)
| | | |-- device_cpu: string (nullable = false)
| | | |-- operating_system_name: string (nullable = false)
| | | |-- os: string (nullable = false)
| | | |-- device_brand: string (nullable = false)
| | | |-- raw: string (nullable = false)
| | | |-- layout_engine_version: string (nullable = false)
| | | |-- agent_name_version: string (nullable = false)
| | | |-- agent_version_major: string (nullable = false)
| | | |-- name: string (nullable = false)
| | | |-- os_name: string (nullable = false)
| | | |-- device: string (nullable = false)
| | | |-- device_name: string (nullable = false)
| | |-- cookies: map (nullable = false)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = false)
| | |-- cookie: string (nullable = false)
| | |-- authorization: struct (nullable = true)
| | | |-- type: string (nullable = false)
| | | |-- creds: string (nullable = false)
| | |-- referer_url: struct (nullable = true)
| | | |-- url: string (nullable = false)
| | | |-- host: string (nullable = false)
| | | |-- domain: string (nullable = false)
| | | |-- port: integer (nullable = false)
| | | |-- path: string (nullable = false)
| | | |-- proto: string (nullable = false)
| | | |-- params: string (nullable = false)
| | | |-- param_map: array (nullable = true)
| | | | |-- element: struct (containsNull = true)
| | | | | |-- member0: map (nullable = true)
| | | | | | |-- key: string
| | | | | | |-- value: struct (valueContainsNull = true)
| | | | | | | |-- member0: integer (nullable = true)
| | | | | | | |-- member1: long (nullable = true)
| | | | | | | |-- member2: string (nullable = true)
| | | | | | | |-- member3: boolean (nullable = true)
| | | | | | | |-- member4: float (nullable = true)
| | | | | | | |-- member5: double (nullable = true)
| | | | | |-- member1: integer (nullable = true)
| | | | | |-- member2: long (nullable = true)
| | | | | |-- member3: string (nullable = true)
| | | | | |-- member4: boolean (nullable = true)
| | | | | |-- member5: float (nullable = true)
| | | | | |-- member6: double (nullable = true)
| | | |-- query_params: map (nullable = false)
| | | | |-- key: string
| | | | |-- value: string (valueContainsNull = false)
| | | |-- username: string (nullable = false)
| | | |-- password: string (nullable = false)
| | | |-- verb: string (nullable = false)
| | | |-- ip: struct (nullable = true)
| | | | |-- addr: string (nullable = false)
| | | | |-- geoip: struct (nullable = true)
| | | | | |-- latitude: float (nullable = false)
| | | | | |-- longitude: float (nullable = false)
| | | | | |-- location: struct (nullable = true)
| | | | | | |-- lat: float (nullable = false)
| | | | | | |-- lon: float (nullable = false)
| | | | | |-- city_name: string (nullable = false)
| | | | | |-- continent_code: string (nullable = false)
| | | | | |-- country_code: string (nullable = false)
| | | | | |-- country_code2: string (nullable = false)
| | | | | |-- country_code3: string (nullable = false)
| | | | | |-- country_name: string (nullable = false)
| | | | | |-- dma_code: integer (nullable = false)
| | | | | |-- ip: string (nullable = false)
| | | | | |-- postal_code: string (nullable = false)
| | | | | |-- region_code: string (nullable = false)
| | | | | |-- region_name: string (nullable = false)
| | | | | |-- timezone: string (nullable = false)
| | | | | |-- asn: long (nullable = false)
| | | | | |-- asn_org: string (nullable = false)
| | | | | |-- isp: string (nullable = false)
| | | | | |-- organization_name: string (nullable = false)
| | | | | |-- connection_type: string (nullable = false)
| | | | |-- cidr: string (nullable = false)
| | | | |-- rdns: struct (nullable = true)
| | | | | |-- host: string (nullable = false)
| | | | | |-- domain: string (nullable = false)
| | | | | |-- sub_domain: string (nullable = false)
| | | | | |-- hostname: string (nullable = false)
| | | | | |-- time: timestamp (nullable = true)
| | | | |-- port: integer (nullable = false)
| | | | |-- threat_infra: struct (nullable = true)
| | | | | |-- ids: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | | |-- tags: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | |-- client: struct (nullable = true)
| | | | | |-- id: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | |-- infection: struct (nullable = true)
| | | | | |-- id: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | | |-- bot_id: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | | |-- malware: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | |-- direct_collect: struct (nullable = true)
| | | | | |-- id: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | | |-- source_names: array (nullable = false)
| | | | | | |-- element: string (containsNull = false)
| | | | |-- asn_class: struct (nullable = true)
| | | | | |-- id: long (nullable = false)
| | | | | |-- org_id: integer (nullable = false)
| | | | | |-- name: string (nullable = false)
| | | | | |-- aka: string (nullable = false)
| | | | | |-- name_long: string (nullable = false)
| | | | | |-- website: string (nullable = false)
| | | | | |-- asn: long (nullable = false)
| | | | | |-- looking_glass: string (nullable = false)
| | | | | |-- route_server: string (nullable = false)
| | | | | |-- irr_as_set: string (nullable = false)
| | | | | |-- info_type: string (nullable = false)
| | | | | |-- info_prefixes4: long (nullable = false)
| | | | | |-- info_prefixes6: long (nullable = false)
| | | | | |-- info_traffic: string (nullable = false)
| | | | | |-- info_ratio: string (nullable = false)
| | | | | |-- info_scope: string (nullable = false)
| | | | | |-- info_unicast: boolean (nullable = false)
| | | | | |-- info_multicast: boolean (nullable = false)
| | | | | |-- info_ipv6: boolean (nullable = false)
| | | | | |-- info_never_via_route_servers: boolean (nullable = false)
| | | | | |-- ix_count: long (nullable = false)
| | | | | |-- fac_count: long (nullable = false)
| | | | | |-- notes: string (nullable = false)
| | | | | |-- netixlan_updated: timestamp (nullable = true)
| | | | | |-- netfac_updated: timestamp (nullable = true)
| | | | | |-- poc_updated: timestamp (nullable = true)
| | | | | |-- policy_url: string (nullable = false)
| | | | | |-- policy_general: string (nullable = false)
| | | | | |-- policy_locations: string (nullable = false)
| | | | | |-- policy_ratio: boolean (nullable = false)
| | | | | |-- policy_contracts: string (nullable = false)
| | | | | |-- netfac_set: array (nullable = false)
| | | | | | |-- element: struct (containsNull = false)
| | | | | | | |-- id: long (nullable = false)
| | | | | | | |-- name: string (nullable = false)
| | | | | | | |-- city: string (nullable = false)
| | | | | | | |-- fac_id: long (nullable = false)
| | | | | | | |-- fac: string (nullable = false)
| | | | | | | |-- local_asn: long (nullable = false)
| | | | | | | |-- created: timestamp (nullable = true)
| | | | | | | |-- updated: timestamp (nullable = true)
| | | | | | | |-- status: string (nullable = false)
| | | | | |-- netixlan_set: array (nullable = false)
| | | | | | |-- element: struct (containsNull = false)
| | | | | | | |-- id: long (nullable = false)
| | | | | | | |-- ix_id: string (nullable = false)
| | | | | | | |-- name: string (nullable = false)
| | | | | | | |-- ixlan_id: string (nullable = false)
| | | | | | | |-- ixlan: string (nullable = false)
| | | | | | | |-- notes: string (nullable = false)
| | | | | | | |-- speed: long (nullable = false)
| | | | | | | |-- asn: long (nullable = false)
| | | | | | | |-- ipaddr4: string (nullable = false)
| | | | | | | |-- ipaddr6: string (nullable = false)
| | | | | | | |-- is_rs_peer: boolean (nullable = false)
| | | | | | | |-- operational: boolean (nullable = false)
| | | | | | | |-- created: timestamp (nullable = true)
| | | | | | | |-- updated: timestamp (nullable = true)
| | | | | | | |-- status: string (nullable = false)
| | | | | |-- poc_set: array (nullable = false)
| | | | | | |-- element: struct (containsNull = false)
| | | | | | | |-- id: long (nullable = false)
| | | | | | | |-- role: string (nullable = false)
| | | | | | | |-- visible: string (nullable = false)
| | | | | | | |-- name: string (nullable = false)
| | | | | | | |-- phone: string (nullable = false)
| | | | | | | |-- email: string (nullable = false)
| | | | | | | |-- url: string (nullable = false)
| | | | | | | |-- created: timestamp (nullable = true)
| | | | | | | |-- updated: timestamp (nullable = true)
| | | | | | | |-- status: string (nullable = false)
| | | | | |-- allow_ixp_update: boolean (nullable = false)
| | | | | |-- suggest: boolean (nullable = false)
| | | | | |-- status_dashboard: string (nullable = false)
| | | | | |-- rir_status: string (nullable = false)
| | | | | |-- rir_status_updated: timestamp (nullable = true)
| | | | | |-- created: timestamp (nullable = true)
| | | | | |-- updated: timestamp (nullable = true)
| | | | | |-- status: string (nullable = false)
| | | | |-- addr_v4: long (nullable = false)
| | | | |-- addr_v6_host: long (nullable = false)
| | | | |-- addr_v6_network: long (nullable = false)
| |-- session: struct (nullable = true)
| | |-- map: map (nullable = false)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = false)
| | |-- keys: array (nullable = false)
| | | |-- element: string (containsNull = false)
| | |-- keylog: string (nullable = false)
| | |-- session: string (nullable = false)
| | |-- json: string (nullable = false)
| | |-- xml: string (nullable = false)
| | |-- raw: string (nullable = false)
| |-- ip: struct (nullable = true)
| | |-- addr: string (nullable = false)
| | |-- geoip: struct (nullable = true)
| | | |-- latitude: float (nullable = false)
| | | |-- longitude: float (nullable = false)
| | | |-- location: struct (nullable = true)
| | | | |-- lat: float (nullable = false)
| | | | |-- lon: float (nullable = false)
| | | |-- city_name: string (nullable = false)
| | | |-- continent_code: string (nullable = false)
| | | |-- country_code: string (nullable = false)
| | | |-- country_code2: string (nullable = false)
| | | |-- country_code3: string (nullable = false)
| | | |-- country_name: string (nullable = false)
| | | |-- dma_code: integer (nullable = false)
| | | |-- ip: string (nullable = false)
| | | |-- postal_code: string (nullable = false)
| | | |-- region_code: string (nullable = false)
| | | |-- region_name: string (nullable = false)
| | | |-- timezone: string (nullable = false)
| | | |-- asn: long (nullable = false)
| | | |-- asn_org: string (nullable = false)
| | | |-- isp: string (nullable = false)
| | | |-- organization_name: string (nullable = false)
| | | |-- connection_type: string (nullable = false)
| | |-- cidr: string (nullable = false)
| | |-- rdns: struct (nullable = true)
| | | |-- host: string (nullable = false)
| | | |-- domain: string (nullable = false)
| | | |-- sub_domain: string (nullable = false)
| | | |-- hostname: string (nullable = false)
| | | |-- time: timestamp (nullable = true)
| | |-- port: integer (nullable = false)
| | |-- threat_infra: struct (nullable = true)
| | | |-- ids: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | | |-- tags: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | |-- client: struct (nullable = true)
| | | |-- id: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | |-- infection: struct (nullable = true)
| | | |-- id: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | | |-- bot_id: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | | |-- malware: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | |-- direct_collect: struct (nullable = true)
| | | |-- id: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | | |-- source_names: array (nullable = false)
| | | | |-- element: string (containsNull = false)
| | |-- asn_class: struct (nullable = true)
| | | |-- id: long (nullable = false)
| | | |-- org_id: integer (nullable = false)
| | | |-- name: string (nullable = false)
| | | |-- aka: string (nullable = false)
| | | |-- name_long: string (nullable = false)
| | | |-- website: string (nullable = false)
| | | |-- asn: long (nullable = false)
| | | |-- looking_glass: string (nullable = false)
| | | |-- route_server: string (nullable = false)
| | | |-- irr_as_set: string (nullable = false)
| | | |-- info_type: string (nullable = false)
| | | |-- info_prefixes4: long (nullable = false)
| | | |-- info_prefixes6: long (nullable = false)
| | | |-- info_traffic: string (nullable = false)
| | | |-- info_ratio: string (nullable = false)
| | | |-- info_scope: string (nullable = false)
| | | |-- info_unicast: boolean (nullable = false)
| | | |-- info_multicast: boolean (nullable = false)
| | | |-- info_ipv6: boolean (nullable = false)
| | | |-- info_never_via_route_servers: boolean (nullable = false)
| | | |-- ix_count: long (nullable = false)
| | | |-- fac_count: long (nullable = false)
| | | |-- notes: string (nullable = false)
| | | |-- netixlan_updated: timestamp (nullable = true)
| | | |-- netfac_updated: timestamp (nullable = true)
| | | |-- poc_updated: timestamp (nullable = true)
| | | |-- policy_url: string (nullable = false)
| | | |-- policy_general: string (nullable = false)
| | | |-- policy_locations: string (nullable = false)
| | | |-- policy_ratio: boolean (nullable = false)
| | | |-- policy_contracts: string (nullable = false)
| | | |-- netfac_set: array (nullable = false)
| | | | |-- element: struct (containsNull = false)
| | | | | |-- id: long (nullable = false)
| | | | | |-- name: string (nullable = false)
| | | | | |-- city: string (nullable = false)
| | | | | |-- fac_id: long (nullable = false)
| | | | | |-- fac: string (nullable = false)
| | | | | |-- local_asn: long (nullable = false)
| | | | | |-- created: timestamp (nullable = true)
| | | | | |-- updated: timestamp (nullable = true)
| | | | | |-- status: string (nullable = false)
| | | |-- netixlan_set: array (nullable = false)
| | | | |-- element: struct (containsNull = false)
| | | | | |-- id: long (nullable = false)
| | | | | |-- ix_id: string (nullable = false)
| | | | | |-- name: string (nullable = false)
| | | | | |-- ixlan_id: string (nullable = false)
| | | | | |-- ixlan: string (nullable = false)
| | | | | |-- notes: string (nullable = false)
| | | | | |-- speed: long (nullable = false)
| | | | | |-- asn: long (nullable = false)
| | | | | |-- ipaddr4: string (nullable = false)
| | | | | |-- ipaddr6: string (nullable = false)
| | | | | |-- is_rs_peer: boolean (nullable = false)
| | | | | |-- operational: boolean (nullable = false)
| | | | | |-- created: timestamp (nullable = true)
| | | | | |-- updated: timestamp (nullable = true)
| | | | | |-- status: string (nullable = false)
| | | |-- poc_set: array (nullable = false)
| | | | |-- element: struct (containsNull = false)
| | | | | |-- id: long (nullable = false)
| | | | | |-- role: string (nullable = false)
| | | | | |-- visible: string (nullable = false)
| | | | | |-- name: string (nullable = false)
| | | | | |-- phone: string (nullable = false)
| | | | | |-- email: string (nullable = false)
| | | | | |-- url: string (nullable = false)
| | | | | |-- created: timestamp (nullable = true)
| | | | | |-- updated: timestamp (nullable = true)
| | | | | |-- status: string (nullable = false)
| | | |-- allow_ixp_update: boolean (nullable = false)
| | | |-- suggest: boolean (nullable = false)
| | | |-- status_dashboard: string (nullable = false)
| | | |-- rir_status: string (nullable = false)
| | | |-- rir_status_updated: timestamp (nullable = true)
| | | |-- created: timestamp (nullable = true)
| | | |-- updated: timestamp (nullable = true)
| | | |-- status: string (nullable = false)
| | |-- addr_v4: long (nullable = false)
| | |-- addr_v6_host: long (nullable = false)
| | |-- addr_v6_network: long (nullable = false)
| |-- location: struct (nullable = true)
| | |-- geoip: struct (nullable = true)
| | | |-- latitude: float (nullable = false)
| | | |-- longitude: float (nullable = false)
| | | |-- location: struct (nullable = true)
| | | | |-- lat: float (nullable = false)
| | | | |-- lon: float (nullable = false)
| | | |-- city_name: string (nullable = false)
| | | |-- continent_code: string (nullable = false)
| | | |-- country_code: string (nullable = false)
| | | |-- country_code2: string (nullable = false)
| | | |-- country_code3: string (nullable = false)
| | | |-- country_name: string (nullable = false)
| | | |-- dma_code: integer (nullable = false)
| | | |-- ip: string (nullable = false)
| | | |-- postal_code: string (nullable = false)
| | | |-- region_code: string (nullable = false)
| | | |-- region_name: string (nullable = false)
| | | |-- timezone: string (nullable = false)
| | | |-- asn: long (nullable = false)
| | | |-- asn_org: string (nullable = false)
| | | |-- isp: string (nullable = false)
| | | |-- organization_name: string (nullable = false)
| | | |-- connection_type: string (nullable = false)
| | |-- addr: struct (nullable = true)
| | | |-- street: string (nullable = false)
| | | |-- city: string (nullable = false)
| | | |-- state: string (nullable = false)
| | | |-- zip: string (nullable = false)
| | | |-- country: string (nullable = false)
| |-- url: struct (nullable = true)
| | |-- url: string (nullable = false)
| | |-- host: string (nullable = false)
| | |-- domain: string (nullable = false)
| | |-- port: integer (nullable = false)
| | |-- path: string (nullable = false)
| | |-- proto: string (nullable = false)
| | |-- params: string (nullable = false)
| | |-- param_map: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- member0: map (nullable = true)
| | | | | |-- key: string
| | | | | |-- value: struct (valueContainsNull = true)
| | | | | | |-- member0: integer (nullable = true)
| | | | | | |-- member1: long (nullable = true)
| | | | | | |-- member2: string (nullable = true)
| | | | | | |-- member3: boolean (nullable = true)
| | | | | | |-- member4: float (nullable = true)
| | | | | | |-- member5: double (nullable = true)
| | | | |-- member1: integer (nullable = true)
| | | | |-- member2: long (nullable = true)
| | | | |-- member3: string (nullable = true)
| | | | |-- member4: boolean (nullable = true)
| | | | |-- member5: float (nullable = true)
| | | | |-- member6: double (nullable = true)
| | |-- query_params: map (nullable = false)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = false)
| | |-- username: string (nullable = false)
| | |-- password: string (nullable = false)
| | |-- verb: string (nullable = false)
| | |-- ip: struct (nullable = true)
| | | |-- addr: string (nullable = false)
| | | |-- geoip: struct (nullable = true)
| | | | |-- latitude: float (nullable = false)
| | | | |-- longitude: float (nullable = false)
| | | | |-- location: struct (nullable = true)
| | | | | |-- lat: float (nullable = false)
| | | | | |-- lon: float (nullable = false)
| | | | |-- city_name: string (nullable = false)
| | | | |-- continent_code: string (nullable = false)
| | | | |-- country_code: string (nullable = false)
| | | | |-- country_code2: string (nullable = false)
| | | | |-- country_code3: string (nullable = false)
| | | | |-- country_name: string (nullable = false)
| | | | |-- dma_code: integer (nullable = false)
| | | | |-- ip: string (nullable = false)
| | | | |-- postal_code: string (nullable = false)
| | | | |-- region_code: string (nullable = false)
| | | | |-- region_name: string (nullable = false)
| | | | |-- timezone: string (nullable = false)
| | | | |-- asn: long (nullable = false)
| | | | |-- asn_org: string (nullable = false)
| | | | |-- isp: string (nullable = false)
| | | | |-- organization_name: string (nullable = false)
| | | | |-- connection_type: string (nullable = false)
| | | |-- cidr: string (nullable = false)
| | | |-- rdns: struct (nullable = true)
| | | | |-- host: string (nullable = false)
| | | | |-- domain: string (nullable = false)
| | | | |-- sub_domain: string (nullable = false)
| | | | |-- hostname: string (nullable = false)
| | | | |-- time: timestamp (nullable = true)
| | | |-- port: integer (nullable = false)
| | | |-- threat_infra: struct (nullable = true)
| | | | |-- ids: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | | |-- tags: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | |-- client: struct (nullable = true)
| | | | |-- id: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | |-- infection: struct (nullable = true)
| | | | |-- id: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | | |-- bot_id: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | | |-- malware: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | |-- direct_collect: struct (nullable = true)
| | | | |-- id: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | | |-- source_names: array (nullable = false)
| | | | | |-- element: string (containsNull = false)
| | | |-- asn_class: struct (nullable = true)
| | | | |-- id: long (nullable = false)
| | | | |-- org_id: integer (nullable = false)
| | | | |-- name: string (nullable = false)
| | | | |-- aka: string (nullable = false)
| | | | |-- name_long: string (nullable = false)
| | | | |-- website: string (nullable = false)
| | | | |-- asn: long (nullable = false)
| | | | |-- looking_glass: string (nullable = false)
| | | | |-- route_server: string (nullable = false)
| | | | |-- irr_as_set: string (nullable = false)
| | | | |-- info_type: string (nullable = false)
| | | | |-- info_prefixes4: long (nullable = false)
| | | | |-- info_prefixes6: long (nullable = false)
| | | | |-- info_traffic: string (nullable = false)
| | | | |-- info_ratio: string (nullable = false)
| | | | |-- info_scope: string (nullable = false)
| | | | |-- info_unicast: boolean (nullable = false)
| | | | |-- info_multicast: boolean (nullable = false)
| | | | |-- info_ipv6: boolean (nullable = false)
| | | | |-- info_never_via_route_servers: boolean (nullable = false)
| | | | |-- ix_count: long (nullable = false)
| | | | |-- fac_count: long (nullable = false)
| | | | |-- notes: string (nullable = false)
| | | | |-- netixlan_updated: timestamp (nullable = true)
| | | | |-- netfac_updated: timestamp (nullable = true)
| | | | |-- poc_updated: timestamp (nullable = true)
| | | | |-- policy_url: string (nullable = false)
| | | | |-- policy_general: string (nullable = false)
| | | | |-- policy_locations: string (nullable = false)
| | | | |-- policy_ratio: boolean (nullable = false)
| | | | |-- policy_contracts: string (nullable = false)
| | | | |-- netfac_set: array (nullable = false)
| | | | | |-- element: struct (containsNull = false)
| | | | | | |-- id: long (nullable = false)
| | | | | | |-- name: string (nullable = false)
| | | | | | |-- city: string (nullable = false)
| | | | | | |-- fac_id: long (nullable = false)
| | | | | | |-- fac: string (nullable = false)
| | | | | | |-- local_asn: long (nullable = false)
| | | | | | |-- created: timestamp (nullable = true)
| | | | | | |-- updated: timestamp (nullable = true)
| | | | | | |-- status: string (nullable = false)
| | | | |-- netixlan_set: array (nullable = false)
| | | | | |-- element: struct (containsNull = false)
| | | | | | |-- id: long (nullable = false)
| | | | | | |-- ix_id: string (nullable = false)
| | | | | | |-- name: string (nullable = false)
| | | | | | |-- ixlan_id: string (nullable = false)
| | | | | | |-- ixlan: string (nullable = false)
| | | | | | |-- notes: string (nullable = false)
| | | | | | |-- speed: long (nullable = false)
| | | | | | |-- asn: long (nullable = false)
| | | | | | |-- ipaddr4: string (nullable = false)
| | | | | | |-- ipaddr6: string (nullable = false)
| | | | | | |-- is_rs_peer: boolean (nullable = false)
| | | | | | |-- operational: boolean (nullable = false)
| | | | | | |-- created: timestamp (nullable = true)
| | | | | | |-- updated: timestamp (nullable = true)
| | | | | | |-- status: string (nullable = false)
| | | | |-- poc_set: array (nullable = false)
| | | | | |-- element: struct (containsNull = false)
| | | | | | |-- id: long (nullable = false)
| | | | | | |-- role: string (nullable = false)
| | | | | | |-- visible: string (nullable = false)
| | | | | | |-- name: string (nullable = false)
| | | | | | |-- phone: string (nullable = false)
| | | | | | |-- email: string (nullable = false)
| | | | | | |-- url: string (nullable = false)
| | | | | | |-- created: timestamp (nullable = true)
| | | | | | |-- updated: timestamp (nullable = true)
| | | | | | |-- status: string (nullable = false)
| | | | |-- allow_ixp_update: boolean (nullable = false)
| | | | |-- suggest: boolean (nullable = false)
| | | | |-- status_dashboard: string (nullable = false)
| | | | |-- rir_status: string (nullable = false)
| | | | |-- rir_status_updated: timestamp (nullable = true)
| | | | |-- created: timestamp (nullable = true)
| | | | |-- updated: timestamp (nullable = true)
| | | | |-- status: string (nullable = false)
| | | |-- addr_v4: long (nullable = false)
| | | |-- addr_v6_host: long (nullable = false)
| | | |-- addr_v6_network: long (nullable = false)
| |-- clipboard: string (nullable = false)
| |-- keylog: string (nullable = false)
| |-- keylog_email: array (nullable = false)
| | |-- element: string (containsNull = false)
|-- doc: string (nullable = false)
|-- collection_info: struct (nullable = true)
| |-- type: string (nullable = false)
| |-- report_type: string (nullable = false)
| |-- record_id: string (nullable = false)
| |-- bot_id: string (nullable = false)
| |-- other_ids: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- ingest_time: timestamp (nullable = true)
| |-- collection_time: timestamp (nullable = true)
| |-- source: string (nullable = false)
| |-- sub_source: string (nullable = false)
| |-- parent_source: string (nullable = false)
| |-- source_name: string (nullable = false)
| |-- associated_source_names: array (nullable = false)
| | |-- element: string (containsNull = false)
| |-- associated_sources: array (nullable = false)
| | |-- element: struct (containsNull = false)
| | | |-- id: string (nullable = false)
| | | |-- descriptor_name: string (nullable = false)
| | | |-- class_name: string (nullable = false)
| | | |-- index_name: string (nullable = false)
| |-- gate_url: array (nullable = false)
| | |-- element: struct (containsNull = false)
| | | |-- url: string (nullable = false)
| | | |-- host: string (nullable = false)
| | | |-- domain: string (nullable = false)
| | | |-- port: integer (nullable = false)
| | | |-- path: string (nullable = false)
| | | |-- proto: string (nullable = false)
| | | |-- params: string (nullable = false)
| | | |-- param_map: array (nullable = true)
| | | | |-- element: struct (containsNull = true)
| | | | | |-- member0: map (nullable = true)
| | | | | | |-- key: string
| | | | | | |-- value: struct (valueContainsNull = true)
| | | | | | | |-- member0: integer (nullable = true)
| | | | | | | |-- member1: long (nullable = true)
| | | | | | | |-- member2: string (nullable = true)
| | | | | | | |-- member3: boolean (nullable = true)
| | | | | | | |-- member4: float (nullable = true)
| | | | | | | |-- member5: double (nullable = true)
| | | | | |-- member1: integer (nullable = true)
| | | | | |-- member2: long (nullable = true)
| | | | | |-- member3: string (nullable = true)
| | | | | |-- member4: boolean (nullable = true)
| | | | | |-- member5: float (nullable = true)
| | | | | |-- member6: double (nullable = true)
| | | |-- query_params: map (nullable = false)
| | | | |-- key: string
| | | | |-- value: string (valueContainsNull = false)
| | | |-- username: string (nullable = false)
| | | |-- password: string (nullable = false)
| | | |-- verb: string (nullable = false)
| | | |-- ip: struct (nullable = true)
| | | | |-- addr: string (nullable = false)
| | | | |-- geoip: struct (nullable = true)
| | | | | |-- latitude: float (nullable = false)
| | | | | |-- longitude: float (nullable = false)
| | | | | |-- location: struct (nullable = true)
| | | | | | |-- lat: float (nullable = false)
| | | | | | |-- lon: float (nullable = false)
| | | | | |-- city_name: string (nullable = false)
| | | | | |-- continent_code: string (nullable = false)
| | | | | |-- 
country_code: string (nullable = false) | | | | | |-- country_code2: string (nullable = false) | | | | | |-- country_code3: string (nullable = false) | | | | | |-- country_name: string (nullable = false) | | | | | |-- dma_code: integer (nullable = false) | | | | | |-- ip: string (nullable = false) | | | | | |-- postal_code: string (nullable = false) | | | | | |-- region_code: string (nullable = false) | | | | | |-- region_name: string (nullable = false) | | | | | |-- timezone: string (nullable = false) | | | | | |-- asn: long (nullable = false) | | | | | |-- asn_org: string (nullable = false) | | | | | |-- isp: string (nullable = false) | | | | | |-- organization_name: string (nullable = false) | | | | | |-- connection_type: string (nullable = false) | | | | |-- cidr: string (nullable = false) | | | | |-- rdns: struct (nullable = true) | | | | | |-- host: string (nullable = false) | | | | | |-- domain: string (nullable = false) | | | | | |-- sub_domain: string (nullable = false) | | | | | |-- hostname: string (nullable = false) | | | | | |-- time: timestamp (nullable = true) | | | | |-- port: integer (nullable = false) | | | | |-- threat_infra: struct (nullable = true) | | | | | |-- ids: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | | |-- tags: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | |-- client: struct (nullable = true) | | | | | |-- id: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | |-- infection: struct (nullable = true) | | | | | |-- id: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | | |-- bot_id: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | | |-- malware: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | |-- direct_collect: struct (nullable = true) | | | | | |-- id: array (nullable = false) | | | | | | |-- 
element: string (containsNull = false) | | | | | |-- source_names: array (nullable = false) | | | | | | |-- element: string (containsNull = false) | | | | |-- asn_class: struct (nullable = true) | | | | | |-- id: long (nullable = false) | | | | | |-- org_id: integer (nullable = false) | | | | | |-- name: string (nullable = false) | | | | | |-- aka: string (nullable = false) | | | | | |-- name_long: string (nullable = false) | | | | | |-- website: string (nullable = false) | | | | | |-- asn: long (nullable = false) | | | | | |-- looking_glass: string (nullable = false) | | | | | |-- route_server: string (nullable = false) | | | | | |-- irr_as_set: string (nullable = false) | | | | | |-- info_type: string (nullable = false) | | | | | |-- info_prefixes4: long (nullable = false) | | | | | |-- info_prefixes6: long (nullable = false) | | | | | |-- info_traffic: string (nullable = false) | | | | | |-- info_ratio: string (nullable = false) | | | | | |-- info_scope: string (nullable = false) | | | | | |-- info_unicast: boolean (nullable = false) | | | | | |-- info_multicast: boolean (nullable = false) | | | | | |-- info_ipv6: boolean (nullable = false) | | | | | |-- info_never_via_route_servers: boolean (nullable = false) | | | | | |-- ix_count: long (nullable = false) | | | | | |-- fac_count: long (nullable = false) | | | | | |-- notes: string (nullable = false) | | | | | |-- netixlan_updated: timestamp (nullable = true) | | | | | |-- netfac_updated: timestamp (nullable = true) | | | | | |-- poc_updated: timestamp (nullable = true) | | | | | |-- policy_url: string (nullable = false) | | | | | |-- policy_general: string (nullable = false) | | | | | |-- policy_locations: string (nullable = false) | | | | | |-- policy_ratio: boolean (nullable = false) | | | | | |-- policy_contracts: string (nullable = false) | | | | | |-- netfac_set: array (nullable = false) | | | | | | |-- element: struct (containsNull = false) | | | | | | | |-- id: long (nullable = false) | | | | | | | |-- 
name: string (nullable = false) | | | | | | | |-- city: string (nullable = false) | | | | | | | |-- fac_id: long (nullable = false) | | | | | | | |-- fac: string (nullable = false) | | | | | | | |-- local_asn: long (nullable = false) | | | | | | | |-- created: timestamp (nullable = true) | | | | | | | |-- updated: timestamp (nullable = true) | | | | | | | |-- status: string (nullable = false) | | | | | |-- netixlan_set: array (nullable = false) | | | | | | |-- element: struct (containsNull = false) | | | | | | | |-- id: long (nullable = false) | | | | | | | |-- ix_id: string (nullable = false) | | | | | | | |-- name: string (nullable = false) | | | | | | | |-- ixlan_id: string (nullable = false) | | | | | | | |-- ixlan: string (nullable = false) | | | | | | | |-- notes: string (nullable = false) | | | | | | | |-- speed: long (nullable = false) | | | | | | | |-- asn: long (nullable = false) | | | | | | | |-- ipaddr4: string (nullable = false) | | | | | | | |-- ipaddr6: string (nullable = false) | | | | | | | |-- is_rs_peer: boolean (nullable = false) | | | | | | | |-- operational: boolean (nullable = false) | | | | | | | |-- created: timestamp (nullable = true) | | | | | | | |-- updated: timestamp (nullable = true) | | | | | | | |-- status: string (nullable = false) | | | | | |-- poc_set: array (nullable = false) | | | | | | |-- element: struct (containsNull = false) | | | | | | | |-- id: long (nullable = false) | | | | | | | |-- role: string (nullable = false) | | | | | | | |-- visible: string (nullable = false) | | | | | | | |-- name: string (nullable = false) | | | | | | | |-- phone: string (nullable = false) | | | | | | | |-- email: string (nullable = false) | | | | | | | |-- url: string (nullable = false) | | | | | | | |-- created: timestamp (nullable = true) | | | | | | | |-- updated: timestamp (nullable = true) | | | | | | | |-- status: string (nullable = false) | | | | | |-- allow_ixp_update: boolean (nullable = false) | | | | | |-- suggest: boolean 
(nullable = false) | | | | | |-- status_dashboard: string (nullable = false) | | | | | |-- rir_status: string (nullable = false) | | | | | |-- rir_status_updated: timestamp (nullable = true) | | | | | |-- created: timestamp (nullable = true) | | | | | |-- updated: timestamp (nullable = true) | | | | | |-- status: string (nullable = false) | | | | |-- addr_v4: long (nullable = false) | | | | |-- addr_v6_host: long (nullable = false) | | | | |-- addr_v6_network: long (nullable = false) | |-- malware_type: string (nullable = false) | |-- data_source_url: string (nullable = false) | |-- data_source_ip: string (nullable = false) | |-- comments: string (nullable = false) | |-- raw_source_file: string (nullable = false) | |-- load_id: string (nullable = false) | |-- collection_metadata: map (nullable = false) | | |-- key: string | | |-- value: string (valueContainsNull = false) | |-- collection_data: array (nullable = true) | | |-- element: struct (containsNull = true) | | | |-- member0: map (nullable = true) | | | | |-- key: string | | | | |-- value: struct (valueContainsNull = true) | | | | | |-- member0: integer (nullable = true) | | | | | |-- member1: long (nullable = true) | | | | | |-- member2: string (nullable = true) | | | | | |-- member3: boolean (nullable = true) | | | | | |-- member4: float (nullable = true) | | | | | |-- member5: double (nullable = true) | | | |-- member1: integer (nullable = true) | | | |-- member2: long (nullable = true) | | | |-- member3: string (nullable = true) | | | |-- member4: boolean (nullable = true) | | | |-- member5: float (nullable = true) | | | |-- member6: double (nullable = true) | |-- is_sensitive: boolean (nullable = false) | |-- parent_ref: struct (nullable = true) | | |-- id: string (nullable = false) | | |-- descriptor_name: string (nullable = false) | | |-- class_name: string (nullable = false) | | |-- index_name: string (nullable = false) | |-- target_id: string (nullable = false) |-- tags: array (nullable = false) | |-- 
element: string (containsNull = false) |-- updated_at: timestamp (nullable = true) |-- key: binary (nullable = true) |-- topic: string (nullable = true) |-- messageId: binary (nullable = true) |-- publishTime: timestamp (nullable = true) |-- eventTime: timestamp (nullable = true) |-- messageProperties: map (nullable = true) | |-- key: string | |-- value: string (valueContainsNull = true)

danny0405 commented 1 week ago

java.lang.ClassCastException: class org.apache.spark.sql.types.StructType cannot be cast to class org.apache.spark.sql.types.MapType

It looks like the schema of the input DataFrame is inconsistent with the schema the table was created with. Did you check that?
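
One way to check for this kind of mismatch is to compare the two schemas field by field and report every path where one side is a struct and the other a map (or any other differing type). The sketch below is a minimal, standard-library-only illustration, not Hudi's own validation logic. It assumes both schemas are available as Spark's JSON schema dictionaries, for example from `df.schema.jsonValue()` in PySpark. The helper names `kind` and `diff_schemas` and the sample schemas are hypothetical; only the field names `gate_url`, `param_map`, and `member0` come from the schema printed above.

```python
from typing import Any, List


def kind(dtype: Any) -> str:
    """Return 'struct', 'map', 'array', or the primitive type name."""
    return dtype["type"] if isinstance(dtype, dict) else dtype


def diff_schemas(incoming: Any, existing: Any, path: str = "") -> List[str]:
    """Collect the paths where the two schemas disagree on a field's type."""
    a, b = kind(incoming), kind(existing)
    if a != b:
        return [f"{path or '<root>'}: incoming={a}, existing={b}"]
    mismatches: List[str] = []
    if a == "struct":
        # Only fields present on both sides are compared here.
        theirs = {f["name"]: f["type"] for f in existing["fields"]}
        for f in incoming["fields"]:
            if f["name"] in theirs:
                child = f"{path}.{f['name']}" if path else f["name"]
                mismatches += diff_schemas(f["type"], theirs[f["name"]], child)
    elif a == "array":
        mismatches += diff_schemas(
            incoming["elementType"], existing["elementType"], path + "[]"
        )
    elif a == "map":
        mismatches += diff_schemas(
            incoming["valueType"], existing["valueType"], path + "{}"
        )
    return mismatches


def field(name: str, dtype: Any) -> dict:
    """Build one struct field entry in Spark's JSON schema layout."""
    return {"name": name, "type": dtype, "nullable": True, "metadata": {}}


def array_of(dtype: Any) -> dict:
    return {"type": "array", "elementType": dtype, "containsNull": True}


# Hypothetical sample data: param_map is a struct on one side, a map on the other.
incoming = {"type": "struct", "fields": [field("gate_url", array_of(
    {"type": "struct", "fields": [field("param_map", {
        "type": "struct", "fields": [field("member0", "integer")]})]}))]}
existing = {"type": "struct", "fields": [field("gate_url", array_of(
    {"type": "struct", "fields": [field("param_map", {
        "type": "map", "keyType": "string", "valueType": "string",
        "valueContainsNull": True})]}))]}

for line in diff_schemas(incoming, existing):
    print(line)  # prints: gate_url[].param_map: incoming=struct, existing=map
```

Fields that exist on only one side are skipped in this sketch; it reports type disagreements only, which is the case the `ClassCastException` above points to.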

ad1happy2go commented 12 hours ago

@Pavan792reddy Were you able to resolve it?