Closed: @vineelavelagapudi closed this issue 6 years ago
Hi @vineelavelagapudi
As step 1 in https://github.com/hortonworks-spark/shc/wiki/2.-Native-Avro-Support shows, when writing data you should use "type":"binary".
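A minimal sketch of what such a catalog entry looks like (the table and column-family names here are placeholders, not from this thread):

```scala
// Sketch: an SHC catalog for writing Avro data. The Avro-encoded column
// is declared with "type":"binary" because the serialized record is
// stored as raw bytes in HBase. "avrotable" and "cf1" are placeholders.
def binaryCatalog = s"""{
  "table": {"namespace": "default", "name": "avrotable"},
  "rowkey": "key",
  "columns": {
    "col0": {"cf": "rowkey", "col": "key", "type": "string"},
    "col1": {"cf": "cf1", "col": "col1", "type": "binary"}
  }
}""".stripMargin
```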
Thank you @weiqingy for your response but still getting the same error. I tried in scala val schemaString = s"""{"namespace": "doctors.avro", "type": "record", "name": "user", "fields": [
{"name":"first_name", "type":"string"},
{"name":"last_name", "type":"string"},
{"name":"number", "type":"int"}
] }""".stripMargin
def catalog = s"""{
  "table": {"namespace": "default", "name": "trip_avro"},
  "rowkey": "key1",
  "columns": {
    "col1": {"cf": "rowkey", "col": "key1", "type": "int"},
    "col2": {"cf": "sales", "col": "col2", "type": "binary"}
  }
}""".stripMargin
def avroCatalog = s"""{
  "table": {"namespace": "default", "name": "trip_avro", "tableCoder": "avro"},
  "rowkey": "key1",
  "columns": {
    "col1": {"cf": "rowkey", "col": "key1", "type": "int"},
    "col2": {"cf": "sales", "col": "col2", "avro": "avroSchema"}
  }
}""".stripMargin
import org.apache.avro.Schema
val avroSchema: Schema = { val p = new Schema.Parser p.parse(schemaString) }
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog
val df = spark.read.format("com.databricks.spark.avro")
  .options(Map(
    "avroSchema" -> schemaString,
    HBaseTableCatalog.tableCatalog -> catalog,
    HBaseTableCatalog.tableCatalog -> avroCatalog))
  .load("path-of-file")

df.write
  .options(Map(
    "avroSchema" -> schemaString,
    HBaseTableCatalog.tableCatalog -> catalog,
    HBaseTableCatalog.tableCatalog -> avroCatalog,
    "newTable" -> "5"))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()
java.lang.ClassNotFoundException: avro
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.spark.sql.execution.datasources.hbase.types.SHCDataTypeFactory$.create(SHCDataType.scala:99)
at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog.
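For comparison, the pattern on the SHC Avro wiki passes a single catalog (the one mapping the column via "avro":"avroSchema") plus the schema string, and does not set "tableCoder" to "avro"; the stack trace above suggests SHCDataTypeFactory is trying to load "avro" as a coder class. A sketch, reusing the definitions above (the `avroCatalogFixed` name is invented for illustration, and this assumes the SHC and spark-avro packages are on the classpath):

```scala
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Guess: same catalog as avroCatalog, but with the "tableCoder":"avro"
// entry removed so the table coder stays at its default.
def avroCatalogFixed = s"""{
  "table": {"namespace": "default", "name": "trip_avro"},
  "rowkey": "key1",
  "columns": {
    "col1": {"cf": "rowkey", "col": "key1", "type": "int"},
    "col2": {"cf": "sales", "col": "col2", "avro": "avroSchema"}
  }
}""".stripMargin

df.write
  .options(Map(
    "avroSchema" -> schemaString,                       // key matches "avro":"avroSchema"
    HBaseTableCatalog.tableCatalog -> avroCatalogFixed, // one catalog, not two
    HBaseTableCatalog.newTable -> "5"))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()
```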
df = spark.read.format("com.databricks.spark.avro")
  .option("schema1", avroSchema)
  .load("path_of_file")

avroSchema = {"namespace": "doctors.avro", "type": "record", "name": "user", "fields": [

hbaseschema3 = {
  "table": {"namespace": "default", "name": "trip_avro", "tableCoder": "avro"},
  "rowkey": "key1",
  "columns": {
    "col1": {"cf": "rowkey", "col": "key1", "type": "int"},
    "col2": {"cf": "sales", "col": "fields", "avro": "avroSchema"}
  }
}

df.write.option("catalog", hbaseschema3)
  .option("newtable", "5")
  .option("avroSchema", avroSchema)
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()
I tried the code above, following https://github.com/hortonworks-spark/shc/wiki/2.-Native-Avro-Support, but I am still getting the error 'java.lang.ClassNotFoundException: avro'. Can anyone give a solution?