### Did you read the pinned issues and search the error message?
Yes, but I didn't find the answer.
### Summary of issue
We have several tables to ingest using the notebook, and they are read in parallel. Some of the tables fail every time, and different tables fail on different runs. Rerunning works, but the job fails again the next time. There was no problem before; some tables started failing a few days ago, without any modification.

The issue occurs when reading in parallel using the same manifestPath; there is no parallel write operation.

Cluster DBR version: 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)

The error message shows "AnalysisException: Manifest doesn't exist: model.json":

AnalysisException                          Traceback (most recent call last)
<command> in <module>
----> 1 df = (spark.read.format("com.microsoft.cdm")
2 .option("storage", storagePath)
3 .option("manifestPath", sourceFileSystem + "/model.json")
4 .option("entity", entity)
5 .option("appId", appId)
/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
208 return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
209 else:
--> 210 return self._df(self._jreader.load())
211
212 def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
1302
1303 answer = self.gateway_client.send_command(command)
-> 1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
121 # Hide where the exception came from that shows a non-Pythonic
122 # JVM exception message.
--> 123 raise converted from None
124 else:
125 raise
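For reference, the parallel read pattern looks roughly like the sketch below. This is a minimal illustration rather than our exact job: the `read_entity` helper, the entity names, and the thread-based fan-out are stand-ins for how the notebook runs the reads concurrently, while `storagePath`, `sourceFileSystem`, and `appId` are the same variables shown in the traceback above.

```python
from concurrent.futures import ThreadPoolExecutor

def read_entity(entity):
    # spark, storagePath, sourceFileSystem, and appId are provided by the
    # notebook environment, as in the traceback above.
    return (spark.read.format("com.microsoft.cdm")
            .option("storage", storagePath)
            .option("manifestPath", sourceFileSystem + "/model.json")
            .option("entity", entity)
            .option("appId", appId)  # remaining credential options omitted
            .load())

# Placeholder entity names; every table is a separate entity read from the
# same model.json manifest.
entities = ["entityA", "entityB", "entityC"]

# The reads run concurrently; on each run a different subset of them fails
# with "AnalysisException: Manifest doesn't exist: model.json".
with ThreadPoolExecutor(max_workers=len(entities)) as pool:
    dfs = dict(zip(entities, pool.map(read_entity, entities)))
```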
### Error stack trace
_No response_
### Platform name
Azure Databricks
### Spark version
3.1.2
### CDM jar version
1.19.2
### What is the format of the data you are trying to read/write?
.csv