qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark
http://sparklens.qubole.com
Apache License 2.0

sparklens datadirectory not found #49

Closed normalscene closed 4 years ago

normalscene commented 4 years ago

Hi There,

We are running Sparklens alongside the application itself, and we also want to use the offline data directory to generate the JSON file. However, no directory named /tmp/sparklens exists after the application has finished.

We can't work out why this is happening, or whether we are missing some configuration. Could you please give us some pointers?

Thanks, Gaurav

normalscene commented 4 years ago

Hello @iamrohit & @mayurdb ,

Could anyone help us spot the issue? We would be very grateful. Thanks in advance!

mayurdb commented 4 years ago

@normalscene which deploy mode are you using? If it's cluster mode, the result won't be available on the local filesystem, because the driver is not running locally. It would be best to set spark.sparklens.data.dir to an S3 location.

Also, it would be great if you could let us know a bit more about your organization, use-cases, and gaps that you see in Sparklens we can improve upon.

normalscene commented 4 years ago

@mayurdb

We don't have S3 buckets, but let me check whether it accepts a gs bucket (I hope it does).

I work at figmd, a healthcare data analytics company. We are currently trying out Sparklens, specifically to check our clusters' performance and resource utilization. As we go along, I will be sure to let you know if we spot any issues or gaps in Sparklens.

normalscene commented 4 years ago

We were able to resolve the issue just by giving a gs bucket path as spark.sparklens.data.dir; no other config change was required. Closing the issue.
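
For anyone landing here with the same problem, the fix above can be sketched roughly as follows. This is only an illustration, not the exact command used in this thread: the bucket name `my-bucket`, the jar name `my_app.jar`, and the helper `sparklens_submit_args` are hypothetical. The listener class is the one I recall from the Sparklens README, and the sketch assumes the cluster can already write to `gs://` paths.

```python
import shlex


def sparklens_submit_args(app, data_dir, deploy_mode="cluster"):
    """Build a spark-submit argument list that points Sparklens at data_dir.

    Hypothetical helper for illustration; only spark.sparklens.data.dir
    is taken from this thread.
    """
    return [
        "spark-submit",
        "--deploy-mode", deploy_mode,
        # Listener class as given in the Sparklens README (assumed unchanged).
        "--conf", "spark.extraListeners=com.qubole.sparklens.QuboleJobListener",
        # In cluster mode the driver runs on a cluster node, so a shared
        # store (S3, GCS, ...) is used instead of the local /tmp/sparklens.
        "--conf", f"spark.sparklens.data.dir={data_dir}",
        app,
    ]


# Placeholder jar and bucket names; replace with your own.
args = sparklens_submit_args("my_app.jar", "gs://my-bucket/sparklens")
print(shlex.join(args))
```

The printed command can be pasted into a shell. Any Sparklens package or jar options your setup needs would have to be added as well.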