qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark
http://sparklens.qubole.com
Apache License 2.0

Email report generation is not working #59

Open mdmohsin2005khan opened 4 years ago

mdmohsin2005khan commented 4 years ago

I have seen 2 issues with email report generation:

  1. An email address containing a dot (.) in the name part is treated as invalid, for example firstname.lastname@gmail.com. My office email follows this pattern, and when I tried to generate the email report, the YARN logs reported the email as invalid. I tested the regex at https://github.com/qubole/sparklens/blob/7fa57b9606abaace2ff7459028a6bf5a68fd5fa9/src/main/scala/com/qubole/sparklens/helper/EmailReportHelper.scala#L15 on https://regex101.com/ with the address above, and the full match is only lastname@gmail.com.
  2. I also tried a regular email id (with no dots in the name part), and even then I did not receive any email report.

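
The symptom in the first item can be reproduced with a small Python sketch. The patterns below are hypothetical and are not the regex from EmailReportHelper.scala; they only assume that the name part of the failing pattern does not allow a dot. With that assumption, an unanchored search finds just the tail of the address, and a whole-string check rejects it.

```python
import re

# Hypothetical pattern for illustration only; NOT the regex used by Sparklens.
# The name part omits '.', so a dotted name cannot be matched in full.
NO_DOT_NAME = re.compile(r"[A-Za-z0-9_%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Same hypothetical pattern with '.' allowed in the name part.
DOT_NAME = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def first_match(pattern, text):
    """Return the first substring matched anywhere in text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


def is_valid(pattern, text):
    """Accept text only if the pattern matches the whole string."""
    return pattern.fullmatch(text) is not None


address = "firstname.lastname@gmail.com"
print(first_match(NO_DOT_NAME, address))  # lastname@gmail.com
print(is_valid(NO_DOT_NAME, address))     # False
print(is_valid(DOT_NAME, address))        # True
```

The partial match from the search is the same kind of result regex101 shows, while the whole-string check is what would cause a validator to reject the address.
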
bharathjs93 commented 4 years ago

Same for me.

mayurdb commented 4 years ago

Hi, thanks for reporting the issues.

I have created the PR for the fix: https://github.com/qubole/sparklens/pull/60

It would be great if you could confirm that the new regex covers these cases. Since many users are probably affected by this, we will do a new release in the coming days. In the meantime, once the PR is merged, you can build the package from source: sbt clean assembly

bharathjs93 commented 4 years ago

I see another issue where I'm unable to save Sparklens JSON file to S3. It gets saved only to HDFS.

mdmohsin2005khan commented 4 years ago

@bharathjs93

I tried using s3a instead of just s3 as the URI scheme and it worked for me: Sparklens wrote the metrics to the S3 location.

Example: --conf spark.sparklens.data.dir=s3a://bucket-name/prefixes/sparklens/