target / data-validator

A tool to validate data, built around Apache Spark.

Attempt to send email should be retried if it fails #70

Open dbseraf1 opened 2 years ago

dbseraf1 commented 2 years ago

Currently, if sending email fails because the email server is temporarily offline or overloaded, the only option is to rerun the whole validation. This can be very expensive, and may require manual intervention if the program is running as part of an automated workflow.

It would be better if the program detected the error in sending email and did its own wait-and-retry loop. This would be pretty cheap and much better than failing.
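A minimal sketch of such a wait-and-retry loop, in Scala since that is the project's language. The names `RetryConfig`, `maxAttempts`, `initialDelayMs`, `backoffFactor`, and `withRetry` are hypothetical and not part of data-validator; the action being retried stands in for whatever call actually sends the email.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object RetrySketch {

  /** Hypothetical retry settings; these fields are assumptions, not existing config keys. */
  final case class RetryConfig(
      maxAttempts: Int = 3,
      initialDelayMs: Long = 1000L,
      backoffFactor: Double = 2.0
  )

  /** Runs `action` until it succeeds or `maxAttempts` attempts have been made.
    * Waits between attempts, multiplying the delay by `backoffFactor` each time.
    * `sleep` is injectable so the loop can be exercised without real waiting.
    */
  def withRetry[T](
      cfg: RetryConfig,
      sleep: Long => Unit = (ms: Long) => Thread.sleep(ms)
  )(action: () => T): Try[T] = {
    @tailrec
    def loop(attempt: Int, delayMs: Long): Try[T] =
      Try(action()) match {
        case s @ Success(_) => s
        case Failure(_) if attempt < cfg.maxAttempts =>
          // Non-fatal failure with attempts remaining: wait, then try again.
          sleep(delayMs)
          loop(attempt + 1, (delayMs * cfg.backoffFactor).toLong)
        case f @ Failure(_) => f // Out of attempts: return the last failure.
      }
    loop(1, cfg.initialDelayMs)
  }
}
```

Wrapping only the send step this way would leave the validation results untouched, so a transient mail server outage costs a few seconds of waiting rather than a full rerun.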

colindean commented 2 years ago

Great suggestion.

Quick notes for myself or whoever may pick this up:

Add config for the retry here:

https://github.com/target/data-validator/blob/c17ec6a92bc8f7dadab88a1a6ff471295a7efd0d/src/main/scala/com/target/data_validator/Emailer.scala#L11-L18

Consume the config and handle the retry here:

https://github.com/target/data-validator/blob/c17ec6a92bc8f7dadab88a1a6ff471295a7efd0d/src/main/scala/com/target/data_validator/Emailer.scala#L109-L126

And document it in README.md here:

https://github.com/target/data-validator/blob/c17ec6a92bc8f7dadab88a1a6ff471295a7efd0d/README.md?plain=1#L74-L83

dougb commented 2 years ago

@dbseraf1 You can save the output to a file or HDFS so you don't lose the results of the run if it fails to send the email.

If you can't wait for this issue to be resolved, you could also use the Pipe output option to specify a program to run which can read the jsonReport from stdin and send an email.