Azure / feast-azure

Azure plugins for Feast (FEAture STore)

offline_to_online_ingestion_job fails adding ingestion JAR to distributed cache #46

Closed: andrijaperovic closed this issue 2 years ago

andrijaperovic commented 2 years ago

Encountering this error in the Livy logs of Synapse when running the offline_to_online_ingestion job:

Exception in thread "main" java.lang.IllegalArgumentException: Attempt to add (abfss://feastpoc@feastpoc.dfs.core.windows.net/feast/feast-ingestion-spark-latest.jar) multiple times to the distributed cache.
    at org.apache.spark.deploy.yarn.Client.$anonfun$prepareLocalResources$23(Client.scala:633)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at org.apache.spark.deploy.yarn.Client.$anonfun$prepareLocalResources$22(Client.scala:623)
    at org.apache.spark.deploy.yarn.Client.$anonfun$prepareLocalResources$22$adapted(Client.scala:622)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:622)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:913)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:204)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1261)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1677)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:956)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:204)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1044)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1053)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
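
For context, YARN's Client.prepareLocalResources rejects any resource URI that it is asked to stage more than once, so the same abfss:// JAR path is presumably being passed both as the main application file and again in the extra jars list. A minimal sketch of deduplicating before submission, assuming the duplicate really is a repeated URI (the variable names here are hypothetical, not the launcher's actual internals):

    # Hypothetical illustration: the same abfss:// URI appearing both as the
    # main file and in the extra jars list trips YARN's distributed-cache check.
    main_file = "abfss://feastpoc@feastpoc.dfs.core.windows.net/feast/feast-ingestion-spark-latest.jar"
    extra_jars = [
        main_file,  # duplicate entry that would be added to the cache twice
    ]
    # Dropping entries equal to the main file avoids staging the JAR twice.
    deduped_jars = [jar for jar in extra_jars if jar != main_file]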

Setting the FEAST_SPARK_INGESTION_JAR environment variable to a SAS-based URL (https://feastdrivingpoc.blob.core.windows.net/feastjar/feast-ingestion-spark-latest.jar plus a SAS signature), since public access is not allowed on the storage account by default, fails with:

    job = feast_spark.Client(client).start_offline_to_online_ingestion(
  File "/Users/i868602/.local/lib/python3.8/site-packages/feast_spark/client.py", line 299, in start_offline_to_online_ingestion
    return start_offline_to_online_ingestion(
  File "/Users/i868602/.local/lib/python3.8/site-packages/feast_spark/pyspark/launcher.py", line 352, in start_offline_to_online_ingestion
    return launcher.offline_to_online_ingestion(
  File "/Users/i868602/.local/lib/python3.8/site-packages/feast_spark/pyspark/launchers/synapse/synapse.py", line 240, in offline_to_online_ingestion
    main_file = self._datalake.upload_file(ingestion_job_params.get_main_file_path())
  File "/Users/i868602/.local/lib/python3.8/site-packages/feast_spark/pyspark/launchers/synapse/synapse_utils.py", line 229, in upload_file
    with urllib.request.urlopen(local_file) as f:
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 641, in http_response
    response = self.parent.error(
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/local/anaconda3/envs/feast_poc/lib/python3.8/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 409: Public access is not permitted on this storage account.
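
The 409 comes from synapse_utils.upload_file fetching the JAR with a plain urllib.request.urlopen, which only succeeds if the URL is directly downloadable. A minimal sketch of what the environment variable would need to look like, assuming the SAS token has to ride along in the URL's query string ("<sas-token>" below is a placeholder, not a real signature):

    import os

    # Hedged sketch: urllib.request.urlopen(local_file) in upload_file does a
    # plain HTTP GET, so the SAS token must be embedded in the URL itself.
    # "<sas-token>" is a placeholder; a real SAS query string typically
    # begins with "sv=".
    os.environ["FEAST_SPARK_INGESTION_JAR"] = (
        "https://feastdrivingpoc.blob.core.windows.net/feastjar/"
        "feast-ingestion-spark-latest.jar?<sas-token>"
    )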

Looks like this might be more of a Spark problem to do with duplicate JARs in the Spark2 directory; a quick check like the sketch below could confirm that.
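
One way to test that hypothesis, assuming shell access to the Spark pool nodes, would be to scan the Spark installation tree for JAR names that appear in more than one place (/opt/spark2 below is a placeholder for wherever the Spark2 distribution actually lives):

    from collections import Counter
    from pathlib import Path

    # Rough sketch: count JAR base names across the Spark install tree;
    # /opt/spark2 is a placeholder path, not a confirmed Synapse location.
    names = Counter(p.name for p in Path("/opt/spark2").rglob("*.jar"))
    duplicates = {name: n for name, n in names.items() if n > 1}
    print(duplicates)  # any entry here is a JAR present in two locations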

andrijaperovic commented 2 years ago

@rramani @snowmanmsft