Azure-Samples / cdm-azure-data-services-integration

Tutorials and sample code for integrating CDM folders with Azure Data Services
MIT License

java.util.NoSuchElementException: No value found for 'date' #20

Closed datalord123 closed 4 years ago

datalord123 commented 4 years ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run read-write-demo-wide-world-importers.py

Try to create the dataframe for "Sales Orders" and/or "Sales Customers".

Any log messages given by the failure

```
Py4JJavaError                             Traceback (most recent call last)
in
      4         .option("appId", appID)
      5         .option("appKey", appKey)
----> 6         .option("tenantId", tenantID)
      7         .load())

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    170             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    171         else:
--> 172             return self._df(self._jreader.load())
    173 
    174     @since(1.4)

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
     61     def deco(*a, **kw):
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:
     65             s = e.java_exception.toString()

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o285.load.
: java.util.NoSuchElementException: No value found for 'date'
	at scala.Enumeration.withName(Enumeration.scala:124)
	at com.microsoft.cdm.utils.CDMModel$$anonfun$schema$1.apply(CDMModel.scala:34)
	at com.microsoft.cdm.utils.CDMModel$$anonfun$schema$1.apply(CDMModel.scala:30)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1334)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1334)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1334)
	at com.microsoft.cdm.utils.CDMModel.schema(CDMModel.scala:35)
	at com.microsoft.cdm.read.CDMDataSourceReader.readSchema(CDMDataSourceReader.scala:28)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:175)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:290)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:203)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Thread.java:748)
```
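The trace shows the failure happens in `CDMModel.schema` at `scala.Enumeration.withName`, i.e. while mapping each attribute's CDM `dataType` string to a Spark type, before any data is read. `Enumeration.withName` throws `NoSuchElementException` for any name that is not a member of the enumeration, which suggests the connector's type table simply has no entry for plain `date`. A minimal Python analogue (all names hypothetical, the exact supported set is an assumption) of that lookup:

```python
# Hypothetical analogue of the failing lookup in CDMModel.schema: the CDM
# dataType string is resolved against a fixed table, and an unknown name
# (here plain "date") raises, mimicking Scala's Enumeration.withName.
CDM_TO_SPARK_TYPE = {
    "string": "StringType",
    "int64": "LongType",
    "double": "DoubleType",
    "boolean": "BooleanType",
    "decimal": "DecimalType",
    "dateTime": "TimestampType",
    # note: no entry for plain "date" -- the type the schema resolution
    # trips over in this issue
}

def to_spark_type(cdm_type: str) -> str:
    """Resolve a CDM dataType name, raising on unknown names."""
    try:
        return CDM_TO_SPARK_TYPE[cdm_type]
    except KeyError:
        raise LookupError(f"No value found for '{cdm_type}'")
```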

Expected/desired behavior

The data gets parsed correctly and the notebook runs without errors.

OS and Version?

Windows 10

Versions

DBR 5.5 Conda Beta Spark 2.4.3, Scala 2.11

Mention any other details that might be useful

I tried changing the datetime columns in the entity datasets to just date because I thought it might be a translation issue from the CDM model, but refreshing after doing that didn't change anything. I'm at a complete loss as to why this is happening, and why it only affects those two dataframes.

Every other dataframe in the query runs perfectly. It's literally just those two entities, "Sales Orders" and "Sales Customers".

Does anybody know what the fix is?


Thanks! We'll be in touch soon.

datalord123 commented 4 years ago

The issue is that Databricks doesn't read in Date types. In Power BI, those two tables include both dateTime and date columns as types. All of the date columns have to be dateTimes for it to work.
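Building on that finding, a hedged sketch for spotting the offending columns up front: scan the CDM folder's model.json for attributes whose `dataType` is plain `date`, so you know which columns to re-export as dateTime before reading. The `entities`/`attributes`/`dataType` keys follow the CDM model.json format; the function name and sample document are illustrative:

```python
import json

def find_date_attributes(model):
    """Return (entity, attribute) pairs whose CDM dataType is plain 'date'."""
    hits = []
    for entity in model.get("entities", []):
        for attr in entity.get("attributes", []):
            if attr.get("dataType") == "date":
                hits.append((entity["name"], attr["name"]))
    return hits

# Illustrative model.json fragment (entity and column names are made up):
sample = json.loads("""
{
  "entities": [
    {
      "name": "Sales Orders",
      "attributes": [
        {"name": "OrderDate", "dataType": "date"},
        {"name": "LastEditedWhen", "dataType": "dateTime"}
      ]
    }
  ]
}
""")

# Any pair reported here is a column to convert to dateTime at the source.
print(find_date_attributes(sample))
```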