Azure / spark-cdm-connector

MIT License

Error while reading CDM Entity previously created from ADF #65

Closed oxsmose closed 3 years ago

oxsmose commented 3 years ago

We get the error java.util.NoSuchElementException: No value found for 'Unknown' when we try to read an entity that was previously created by ADF.

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    170             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    171         else:
--> 172             return self._df(self._jreader.load())
    173
    174     @since(1.4)

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258
   1259         for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
     61     def deco(*a, **kw):
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:
     65             s = e.java_exception.toString()

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o5124.load.
: java.util.NoSuchElementException: No value found for 'Unknown'
    at scala.Enumeration.withName(Enumeration.scala:124)
    at com.microsoft.cdm.utils.CDMModelReader$$anonfun$2.apply(CDMModelReader.scala:72)
    at com.microsoft.cdm.utils.CDMModelReader$$anonfun$2.apply(CDMModelReader.scala:40)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.microsoft.cdm.utils.CDMModelReader.getStructType(CDMModelReader.scala:40)
    at com.microsoft.cdm.utils.CDMModelReader.getSchema(CDMModelReader.scala:98)
    at com.microsoft.cdm.read.CDMDataSourceReader$$anonfun$readSchema$1.apply(CDMDataSourceReader.scala:45)
    at com.microsoft.cdm.read.CDMDataSourceReader$$anonfun$readSchema$1.apply(CDMDataSourceReader.scala:45)
    at com.microsoft.cdm.log.SparkCDMLogger$.logEventToKustoForPerf(SparkCDMLogger.scala:43)
    at com.microsoft.cdm.read.CDMDataSourceReader.readSchema(CDMDataSourceReader.scala:44)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:175)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:290)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:203)
    at sun.reflect.GeneratedMethodAccessor339.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
    at py4j.Gateway.invoke(Gateway.java:295)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:251)
    at java.lang.Thread.run(Thread.java:748)

bissont commented 3 years ago

Hello, I believe this occurs because ADF assumes an attribute of type "Unknown" is a string. The Spark-CDM-Connector does not make that assumption.
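
To illustrate the difference described above, here is a rough Python sketch, not the connector's actual code. The names CdmDataType, strict_spark_type and lenient_spark_type are hypothetical, and the type list is an assumed subset. The strict lookup fails on a name it does not know, much as scala.Enumeration.withName does in the stack trace, while the lenient lookup falls back to a string column.

```python
from enum import Enum


class CdmDataType(Enum):
    """Hypothetical subset of CDM data type names mapped to Spark SQL type names."""
    String = "string"
    Int64 = "bigint"
    Double = "double"
    Boolean = "boolean"
    DateTime = "timestamp"


def strict_spark_type(cdm_type_name: str) -> str:
    # Strict lookup: an unlisted name such as "Unknown" raises KeyError,
    # comparable to the NoSuchElementException raised by Enumeration.withName.
    return CdmDataType[cdm_type_name].value


def lenient_spark_type(cdm_type_name: str, default: str = "string") -> str:
    # Lenient lookup: treat any unlisted name as a string column,
    # which is the assumption ADF is described as making for "Unknown".
    try:
        return CdmDataType[cdm_type_name].value
    except KeyError:
        return default
```

With this sketch, strict_spark_type("Unknown") raises KeyError, whereas lenient_spark_type("Unknown") returns "string".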

bissont commented 3 years ago

I believe this is fixed in .19. Can you verify that the fix works for you?

oxsmose commented 3 years ago

I will check, thanks

bitsofinfo commented 3 years ago

@oxsmose did the fix work for you?