Open thambi1981 opened 1 year ago
Not sure I understand your question correctly, but the execution plan name (aka application name, or job name) is provided by an agent in the optional string property 'name', and then stored under the same property in the 'executionPlan' collection in ArangoDB.
On the UI, when talking about the execution event list, which is formed from the items stored in the 'progress' collection, the application name is displayed in the leftmost column and is taken from the respective JSON property returned by the Consumer REST API.
In the database, the application name is additionally stored under the 'execPlanDetails' property of the 'progress' items. That is a pure optimization to avoid extra traversals over the 'progressOf' edge.
Everything under the 'extra' property is a black box for Spline. The 'extra' property is there to store any additional metadata that might be used for custom user queries. Spline itself doesn't know the meaning of what is stored under 'extra' and doesn't touch it at all. The most we could do is display it somewhere on the UI, and probably search in it. But it certainly isn't needed for displaying the application name, because there is a reserved property for that, as explained above.
Not sure if I answered your question.
I'm not sure about the question either, but we are planning to use the Airflow task name as the Spark app name to achieve this. Basically, we will be calling Spark like this:
spark = SparkSession.builder.appName(airflow_task_name).getOrCreate()
You should be able to pass the name of the Airflow task to your module though.
Hope this helps.
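A rough sketch of that idea, in case it is useful. It relies on the AIRFLOW_CTX_DAG_ID and AIRFLOW_CTX_TASK_ID environment variables that Airflow exports to task processes. Whether those variables actually reach the Spark driver depends on how the job is launched, so treat that as an assumption. The helper names build_app_name and create_session are hypothetical.

```python
from typing import Mapping


def build_app_name(env: Mapping[str, str], default: str = "spark-job") -> str:
    """Compose a Spark app name from Airflow context variables, if present.

    Falls back to `default` when neither variable is set, e.g. when the job
    is started outside of Airflow.
    """
    dag_id = env.get("AIRFLOW_CTX_DAG_ID")
    task_id = env.get("AIRFLOW_CTX_TASK_ID")
    parts = [p for p in (dag_id, task_id) if p]
    return ".".join(parts) if parts else default


def create_session(app_name: str):
    """Create (or reuse) a SparkSession with the given application name."""
    # Imported lazily so build_app_name can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).getOrCreate()


# Typical use inside the Spark job (requires PySpark):
#   import os
#   spark = create_session(build_app_name(os.environ))
```

With this, the scheduler's DAG and task names end up in the application name that Spline already stores and displays, so no extra field is needed for that part.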
Background [Optional]
We have a requirement to capture the name of the scheduler job (Airflow, Automic, or any other scheduler) that submitted the Spark job. Right now we capture the Spark application name, which shows up in the 'progress' collection under the 'applicationName' key. Every Spark job is submitted by a scheduler (Airflow or Automic). We would like to capture this information as the top-level source and then show the Spark 'applicationName'. I do see the 'extra' key in the 'progress' document, but I don't see any reserved field that is displayed in the UI. It's not only the Automic/Airflow job name: in the future we would also like to add further information as we progress. Please give some insight into the right way to show this non-Spark information. We would also like to search the UI by this job name. The reason for this requirement is to identify which scheduler job a Spark job was submitted from. Adding this feature would encourage wider adoption.
Question
Please give some insight into the right way to show non-Spark, job-level information and how to make that job searchable.