springml / spark-salesforce

Spark data source for Salesforce
Apache License 2.0

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Invalid UTF-8 start byte 0x92 #25

Open chetankumart opened 6 years ago

chetankumart commented 6 years ago

Hi, I'm getting an error when using the 'Reading Salesforce Object' API. I also tried setting the charset explicitly with .option("charsetName", "UTF-8") in the call below, but that did not help when running from spark-shell on Windows. The same code works in the Eclipse Spark IDE.

API Used

val sfDF = spark.
  read.
  format("com.springml.spark.salesforce").
  option("username", "your_salesforce_username").
  option("password", "your_salesforce_password_with_security_token").
  option("soql", soql).
  option("version", "37.0").
  load()

Exception

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Invalid UTF-8 start byte 0x92
 at [Source: [B@12a470dd; line: 1, column: 19852] (through reference chain: com.springml.salesforce.wave.model.SOQLResult["records"]->java.util.ArrayList[17])
    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:210)
    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:189)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:266)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:217)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:520)
    at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:95)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:258)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:125)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2819)
    at com.springml.salesforce.wave.impl.ForceAPIImpl.query(ForceAPIImpl.java:124)
    at com.springml.salesforce.wave.impl.ForceAPIImpl.query(ForceAPIImpl.java:36)
    at com.springml.spark.salesforce.DatasetRelation.querySF(DatasetRelation.scala:117)
    at com.springml.spark.salesforce.DatasetRelation.read(DatasetRelation.scala:59)
    at com.springml.spark.salesforce.DatasetRelation.<init>(DatasetRelation.scala:51)
    at com.springml.spark.salesforce.DefaultSource.createRelation(DefaultSource.scala:92)
    at com.springml.spark.salesforce.DefaultSource.createRelation(DefaultSource.scala:54)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
    at sfdc.idp.integration.SFDCConnect$.main(SFDCConnect.scala:55)
    at sfdc.idp.integration.SFDCConnect.main(SFDCConnect.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0x92
 at [Source: [B@12a470dd; line: 1, column: 19852]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3467)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:3461)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2488)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2414)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:285)
    at com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserialize(UntypedObjectDeserializer.java:514)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringMap(MapDeserializer.java:495)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:341)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:26)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:245)
    ... 29 more
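
The report above says the read fails from spark-shell on Windows but works in Eclipse. One possible explanation, which is an assumption and not something confirmed in this thread, is that the two environments run with different JVM default charsets. The sketch below only reports which default the current JVM uses; the object name `CharsetCheck` and its helpers are made up for illustration and are not part of spark-salesforce.

```scala
import java.nio.charset.Charset

// Hypothetical helper, not part of spark-salesforce.
object CharsetCheck {
  // Canonical name of the charset the JVM falls back to when code
  // converts between bytes and String without naming a charset.
  def defaultCharsetName: String = Charset.defaultCharset().name()

  // True when the given charset name (or alias) resolves to UTF-8.
  def isUtf8(name: String): Boolean =
    Charset.forName(name).name() == "UTF-8"

  def main(args: Array[String]): Unit = {
    println(s"file.encoding property: ${System.getProperty("file.encoding")}")
    println(s"default charset: $defaultCharsetName")
    println(s"default is UTF-8: ${isUtf8(defaultCharsetName)}")
  }
}
```

The same expressions can be pasted into spark-shell and into the Eclipse run to compare the two. If they differ, starting spark-shell with --driver-java-options "-Dfile.encoding=UTF-8" is one thing to try; whether that resolves this particular error is untested here.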

samuel-pt commented 6 years ago

@chetankumart what is the charset of the data being fetched from Salesforce?

chetankumart commented 6 years ago

@samuel-pt I believe it is UTF-8 charset.

I see a special character in the data of one field that looks like a single quote (see the attached image).
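
A quote-like character is consistent with the byte in the exception: 0x92 is the right single quotation mark (U+2019) in Windows-1252, and a lone 0x92 can never start a UTF-8 sequence. The sketch below illustrates this with a strict decoder; `ByteProbe` and `strictDecode` are hypothetical names, and the assumption that the offending byte is Windows-1252 encoded is a guess based on the description, not something the thread confirms.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, Charset, CodingErrorAction, StandardCharsets}

// Hypothetical helper for inspecting suspicious bytes.
object ByteProbe {
  // Strict decode: returns None on malformed or unmappable input
  // instead of silently substituting a replacement character.
  def strictDecode(bytes: Array[Byte], charset: Charset): Option[String] = {
    val decoder = charset.newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)
      .onUnmappableCharacter(CodingErrorAction.REPORT)
    try Some(decoder.decode(ByteBuffer.wrap(bytes)).toString)
    catch { case _: CharacterCodingException => None }
  }

  def main(args: Array[String]): Unit = {
    val suspect = Array(0x92.toByte)
    // Assumption: the JVM ships the windows-1252 charset (common JDKs do).
    val cp1252 = Charset.forName("windows-1252")
    println(s"as UTF-8:        ${strictDecode(suspect, StandardCharsets.UTF_8)}")
    println(s"as windows-1252: ${strictDecode(suspect, cp1252)}")
  }
}
```

Running the same probe on the byte reported in an exception (for example 0xfc with ISO-8859-1) can help narrow down which encoding the data actually arrived in.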

chetankumart commented 6 years ago

@samuel-pt do you have any update?

joykrishna commented 6 years ago

Hi,

We are also facing the same issue when reading data from a Salesforce object. In our case, I believe the source encoding is ISO_8859_1. We see the error below when trying to load the data using SOQL:

18/08/07 10:54:52 WARN ForceAPIImpl: Error while executing salesforce query
com.fasterxml.jackson.databind.JsonMappingException: Invalid UTF-8 start byte 0xfc
 at [Source: [B@52ff99cd; line: 1, column: 21275] (through reference chain: com.springml.salesforce.wave.model.SOQLResult["records"]->java.util.ArrayList[8])
    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:210)
    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:189)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:266)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:217)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:520)
    at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:95)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:258)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:125)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2819)
    at com.springml.salesforce.wave.impl.ForceAPIImpl.query(ForceAPIImpl.java:124)
    at com.springml.salesforce.wave.impl.ForceAPIImpl.query(ForceAPIImpl.java:36)
    at com.springml.spark.salesforce.DatasetRelation.querySF(DatasetRelation.scala:117)
    at com.springml.spark.salesforce.DatasetRelation.read(DatasetRelation.scala:59)
    at com.springml.spark.salesforce.DatasetRelation.<init>(DatasetRelation.scala:51)
    at com.springml.spark.salesforce.DefaultSource.createRelation(DefaultSource.scala:92)
    at com.springml.spark.salesforce.DefaultSource.createRelation(DefaultSource.scala:54)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
    at com.cisco.sample.util.DataLoadUtil.fetchCSOneDataUsingDFReader(DataLoadUtil.java:50)
    at com.cisco.sample.util.DataLoadUtil.main(DataLoadUtil.java:58)
Caused by: com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xfc
 at [Source: [B@52ff99cd; line: 1, column: 21275]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3467)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:3461)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2488)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2414)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:285)
    at com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserialize(UntypedObjectDeserializer.java:514)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringMap(MapDeserializer.java:495)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:341)
    at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:26)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:245)
    ... 21 more

Please note that we are not able to change the source encoding to UTF-8. Kindly suggest any workarounds for this issue.

Thanks, Jaya Krishna
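
For the ISO_8859_1 case described above, one general workaround is to re-encode the raw bytes to UTF-8 before they reach a UTF-8-only parser. The sketch below shows only that transcoding step. It assumes you have access to the raw response bytes and know their real charset; nothing in this thread shows that the data source exposes such a hook, so `Transcode` and `toUtf8` are hypothetical names for illustration.

```scala
import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical helper, not part of spark-salesforce.
object Transcode {
  // Decode bytes using their actual source charset, then re-encode as UTF-8.
  def toUtf8(raw: Array[Byte], source: Charset): Array[Byte] =
    new String(raw, source).getBytes(StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // "Mü" encoded as ISO-8859-1: the ü is the single byte 0xfc.
    val raw = Array('M'.toByte, 0xfc.toByte)
    val utf8 = toUtf8(raw, StandardCharsets.ISO_8859_1)
    // Print the re-encoded bytes in hex for inspection.
    println(utf8.map(b => f"${b & 0xff}%02x").mkString(" "))
  }
}
```

Transcoding with the wrong source charset produces valid UTF-8 that contains the wrong characters, so the source encoding needs to be confirmed first.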

GobinathIntellectyx commented 3 years ago

I am also facing the same issue while trying to connect to Salesforce. Please suggest a workaround for this issue as soon as possible.