NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Profiling tool should work on partial event logs #360

Open tgravescs opened 1 year ago

tgravescs commented 1 year ago

Is your feature request related to a problem? Please describe. For clusters that run 24/7, the event logs can be huge, so loading everything into the Profiling tool is impossible. It would be nice if the tool could work with partial event logs that cover only a smaller period of time, such as an hour.

This is usually combined with event log rolling, for example every hour or so.

tgravescs commented 1 year ago

Note, if I try this now I get:


```
23/06/01 13:22:31 WARN Profiler: Exception occurred processing file: eventlog-2023-06-01--12-00
java.lang.NullPointerException
        at com.nvidia.spark.rapids.tool.profiling.CollectInformation.$anonfun$getAppInfo$1(CollectInformation.scala:38)
        at scala.collection.immutable.List.map(List.scala:293)
        at com.nvidia.spark.rapids.tool.profiling.CollectInformation.getAppInfo(CollectInformation.scala:36)
        at com.nvidia.spark.rapids.tool.profiling.Profiler.com$nvidia$spark$rapids$tool$profiling$Profiler$$processApps(Profiler.scala:288)
        at com.nvidia.spark.rapids.tool.profiling.Profiler$ProfileProcessThread$1.run(Profiler.scala:230)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
```

amahussein commented 1 year ago

I expect that to be tricky.

There are some issues that could be related to that as well: