Currently, the number of stages in a DAG computation graph is extracted whenever a PySpark pipeline's describe method is called. For large computation graphs and pipelines with many stages, this can take an unreasonably long time. A parameter such as dag_stage_count should be added to activate this computation. By default, it should be deactivated.
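A minimal sketch of how such an opt-in flag could look. The Pipeline class, its constructor, and the stage_counter callable are hypothetical stand-ins for illustration and do not reflect the actual pipeline implementation; only the describe method and the dag_stage_count parameter name come from the request above.

```python
from typing import Callable, Dict


class Pipeline:
    """Hypothetical stand-in for the pipeline class; not the real API."""

    def __init__(self, name: str, stage_counter: Callable[[], int]) -> None:
        self.name = name
        # Assumption: the potentially slow DAG stage extraction is
        # wrapped in a callable so it only runs when requested.
        self._stage_counter = stage_counter

    def describe(self, dag_stage_count: bool = False) -> Dict[str, object]:
        """Return a description; count DAG stages only on request."""
        description: Dict[str, object] = {"name": self.name}
        if dag_stage_count:
            # The expensive computation is skipped unless explicitly enabled.
            description["dag_stage_count"] = self._stage_counter()
        return description


calls = []


def slow_stage_counter() -> int:
    # Placeholder for the real extraction; records each invocation.
    calls.append(1)
    return 42


pipeline = Pipeline("example", slow_stage_counter)
print(pipeline.describe())                      # stage count not computed
print(pipeline.describe(dag_stage_count=True))  # stage count computed
```

Defaulting the flag to False would keep describe fast for existing callers, while anyone who needs the stage count could enable it explicitly.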