Logging enhancement - file cleanup and email on action

geneorama commented 7 years ago

The script in the crontab runs every hour and puts results into ~/wnv-pred-logs

ModelPath=/app/<service_user>/WNV_model/
15 * * * * cd $ModelPath && ./R/run_all.sh &> ~/wnv-pred-logs/run_$(date --rfc-3339=seconds).log

We need

Automatic file cleanup.
To make the logs more useful. The data portal is probably updated once a week, so 167 / 168 times the log has no meaningful information because the data didn't change.

Some ideas to make the logs more useful:

Use the R script to write the log file rather than the crontab; this would make it easier to tell if an action was triggered due to a change in the data.
Use different filenames if the data changed and something happened
Email when the data changes
Email when the process completes successfully
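
For the automatic file cleanup, a minimal sketch of what the R script could do at the start of each run. The function name `clean_old_logs`, the 14-day retention period, and the filename pattern are assumptions for illustration; only the log directory comes from the crontab entry above.

```r
# Hypothetical helper: delete log files older than `max_age_days`.
# The 14-day default and the filename pattern are assumptions.
clean_old_logs <- function(log_dir = "~/wnv-pred-logs",
                           max_age_days = 14,
                           pattern = "^run_.*\\.log$") {
  files <- list.files(log_dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) return(invisible(character(0)))
  # Age of each file in days, based on its last modification time
  age_days <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "days"))
  old <- files[age_days > max_age_days]
  file.remove(old)
  invisible(old)  # return the paths that were removed
}
```

Calling `clean_old_logs()` once per run would keep roughly two weeks of hourly logs; the retention period is a judgment call.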

@tomschenkjr What do you think about this list / the distribution of results? Should we set up an email that is similar to the developers email?

tomschenkjr commented 7 years ago

Assigning this to the Operationalize milestone and reopening that milestone.

tomschenkjr commented 7 years ago

@geneorama - right now, let's keep the list to you. We'll change it to a broader list down the road.

geneorama commented 7 years ago

Added the email capability. Right now it's emailing me when "nothing happens" and when "something happens". I'm going to turn off the "nothing happens" email tomorrow if it works all night.

geneorama commented 7 years ago

I've updated the scripts to

Only notify if there's an update
Provide the path of the file that's sending the email (it would be annoying to get email from an unidentified source, if we hook this up to a bulk email)

I added code that would identify the full path including filename, but it doesn't work when the code is called from crontab. So, I just used getwd to at least announce the directory of the script in the email.
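
One possible way to get the script path under cron, sketched under the assumption that the script is launched with Rscript (how run_all.sh actually invokes R is not shown here). Rscript passes the script to R as a `--file=` argument, which `commandArgs()` exposes. The helper name `script_path` is hypothetical, and it falls back to `getwd()` as described above.

```r
# Hypothetical helper: recover the script path from the "--file=" argument
# that Rscript adds to the command line; fall back to the working directory.
script_path <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) return(getwd())  # e.g. interactive session
  normalizePath(sub("^--file=", "", file_arg[1]), mustWork = FALSE)
}

thisfile <- script_path()
```

This would not help if the script is started some other way (for example `R -f`), which is why the fallback stays in place.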

tomschenkjr commented 7 years ago

You'll want the "nothing happens" email as well, but differentiate it in the subject line. Otherwise, it'll be really easy to miss when an issue arises that leads to "nothing happens". The distinction in the subject line will let you control how to monitor it using inbox rules.

geneorama commented 7 years ago

ok, it's updated. Also created a rule in Outlook to automatically move the file.

Wasn't a big change, just had to uncomment the last line that was already there. I made the subjects conditional when I was doing the cleanup yesterday.

https://github.com/Chicago/WNV_model/blob/rversion/R/run_all.R#L35

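# Build the message body; "\n" separates the three lines of the email.
msg <- paste0("Email from ", thisfile, "\n",
              "No change in ", url, "\n",
              "Predictions not updated")
subj <- "WNV PREDICTIONS - NO UPDATE"
# shQuote() quotes the body and subject for the shell, replacing the hand-written single quotes.
system(paste("echo", shQuote(msg), "| mail -s", shQuote(subj), "gene.leynes@cityofchicago.org"))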

Testing in 1 minute...

geneorama commented 7 years ago

It's working.

tomschenkjr commented 7 years ago

Reopening the issue given what we discussed. For documentation purposes, you'll want the script to run so that one of two outcomes is apparent:

Failure - the script failed to run to completion.
Success - the script ran to completion. This email reports success whether or not data was uploaded; it depends only on whether the script completed.

geneorama commented 7 years ago

ok, working on that now.

geneorama commented 7 years ago

I used try statements to create conditional email messages. The run script is pretty complicated now, but it seems to work. The update / no update indicates whether the data changed on the data portal. The success / failure indicates if the result of the try inherits the class "try-error" (R's roundabout way of letting us know that an error happened).

Possible email subjects:

WNV PREDICTIONS - FAILURE - NO UPDATE
WNV PREDICTIONS - SUCCESS - NO UPDATE
WNV PREDICTIONS - SUCCESS - UPDATED
WNV PREDICTIONS - FAILURE - UPDATED
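
A minimal sketch of how the four subjects can be derived from two independent flags. This is not the code in run_all.R; the helper name `email_subject` and its arguments are assumptions for illustration.

```r
# Hypothetical helper: build the subject from the try() result and the
# "did the data change" flag.
email_subject <- function(result, data_changed) {
  status <- if (inherits(result, "try-error")) "FAILURE" else "SUCCESS"
  update <- if (data_changed) "UPDATED" else "NO UPDATE"
  paste("WNV PREDICTIONS", status, update, sep = " - ")
}

# try() returns the value on success, or an object of class "try-error".
ok  <- try(sqrt(4), silent = TRUE)
bad <- try(stop("simulated failure"), silent = TRUE)

email_subject(ok, data_changed = FALSE)   # "WNV PREDICTIONS - SUCCESS - NO UPDATE"
email_subject(bad, data_changed = TRUE)   # "WNV PREDICTIONS - FAILURE - UPDATED"
```

Keeping the status and the update flag as separate parts of the subject is what makes inbox rules on either part possible.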

Examples of body of email:

Example with no update. Note that in this case none of the dependent scripts were called and so none of them failed. The error happened in a function in the body of "run_all.R", and the failure is indicated in the subject line only.

Email from C:/Users/375492/GitHub/WNV/WNV_model/run_all.R
No change in https://data.cityofchicago.org/api/views/jqe8-8r6s/rows.csv?accessType=DOWNLOAD
Predictions not updated

Example with an update:

Email from C:/Users/375492/GitHub/WNV/WNV_model/run_all.R
Data changed on https://data.cityofchicago.org/api/views/jqe8-8r6s/rows.csv?accessType=DOWNLOAD
WNV Prediction update triggered

SUCCESS - R/10_calculate_idtable.R
FAIL - R/21_create_features.R
SUCCESS - R/40_upload_predictions_ROracle.R

Each script is prefixed with its status.
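
The per-script status lines could be produced along these lines. This is a sketch under assumptions, not the actual run_all.R: the helper name `run_scripts` is hypothetical, and each script is sourced into its own environment purely for illustration.

```r
# Hypothetical helper: source each script inside try() and return one
# "SUCCESS - path" or "FAIL - path" line per script.
run_scripts <- function(scripts) {
  vapply(scripts, function(path) {
    result <- try(source(path, local = new.env()), silent = TRUE)
    status <- if (inherits(result, "try-error")) "FAIL" else "SUCCESS"
    paste(status, "-", path)
  }, character(1), USE.NAMES = FALSE)
}

# Usage: append the status lines to the email body.
# body <- paste(c(msg, "", run_scripts(c("R/10_calculate_idtable.R",
#                                        "R/21_create_features.R"))),
#               collapse = "\n")
```

Note that this version keeps going after a failure, which matches the example above where a later script reports SUCCESS after an earlier FAIL.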

geneorama commented 7 years ago

You can see the monstrosity here: https://github.com/Chicago/WNV_model/blob/rversion/R/run_all.R

tomschenkjr commented 7 years ago

Go ahead and add me and Nick to the updates for now.

geneorama commented 7 years ago

While adding Nick and Tom to the email list I noticed an issue this morning:

The R/run_all.R creates some variables (like the email message). These variables were getting deleted by the scripts that create the features and retrain the model, because I was starting those scripts with rm(list=ls()). I fixed this issue, and added the other emails to the R/run_all.R script.
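
For reference, one way to protect the caller's variables without editing the dependent scripts is to source each one into its own environment, so that `rm(list=ls())` only clears that environment. This is a sketch of an alternative, not the fix that was applied; `run_isolated` is a hypothetical name.

```r
# Hypothetical helper: evaluate a script in a fresh environment so that a
# leading rm(list = ls()) cannot delete variables created by the caller.
run_isolated <- function(path) {
  env <- new.env(parent = globalenv())
  source(path, local = env)
  invisible(env)  # objects the script created are available in `env`
}

# Demonstration with a throwaway script that starts with rm(list = ls())
demo_script <- tempfile(fileext = ".R")
writeLines(c("rm(list = ls())", "x <- 42"), demo_script)
msg <- "Email from run_all.R"
env <- run_isolated(demo_script)
exists("msg")  # TRUE: the caller's variable survived
```

The trade-off is that anything a later step needs from the script has to be read out of the returned environment.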

The script runs at 15 past the hour, so I'll be able to see if it's working shortly.

geneorama commented 7 years ago

BTW, the email script was working when there was no update, because none of the feature generating / etc scripts got called. I'm going to test it once as normal with no update, then force it to update (by deleting the "digest", thus making it appear as if the data changed) and make sure the email works when the scripts run.
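
To illustrate why deleting the stored digest forces an update, here is a minimal sketch of a digest check. It uses `tools::md5sum` from base R as a stand-in; the hashing method, the helper name `data_changed`, and the file names are all assumptions, not what the project necessarily uses.

```r
# Hypothetical helper: TRUE when the data file's MD5 differs from the stored
# digest, or when no digest is stored. Writes the new digest when it changes.
data_changed <- function(data_file, digest_file) {
  new_digest <- unname(tools::md5sum(data_file))
  old_digest <- if (file.exists(digest_file)) {
    readLines(digest_file, n = 1)
  } else {
    character(0)  # no stored digest: treat the data as changed
  }
  changed <- !identical(new_digest, old_digest)
  if (changed) writeLines(new_digest, digest_file)
  changed
}
```

With a check like this, removing the digest file makes the next run behave as if the data portal had been updated.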

I really should put the scripts into functions now that the development is stable.

tomschenkjr commented 7 years ago

Can you provide an update on this?

tomschenkjr commented 7 years ago

@geneorama - also, please provide an update on the remaining steps you're taking with this task and any further modifications you're hoping to make before closing the issue.

Also keep in mind that this milestone is 22 days overdue.

geneorama commented 7 years ago

If you're happy with the notifications then we're done. I figured we'd close the issue in the next analytics meeting.

tomschenkjr commented 7 years ago

There were two notes indicating future work:

"I'm going to test it once as normal with no update, then force it to update (by deleting the "digest", thus making it appear as if the data changed) and make sure the email works when the scripts run."
"I really should put the scripts into functions now that the development is stable."

geneorama commented 7 years ago

yes, the updates are working. I did the test but CDPH also updated the data Tuesday night. Both emails worked correctly for me (and I'm assuming you got them too?).

Maybe it would be good to put the scripts into functions, but I'm not sure if it matters or if it's just a "nice to have". It is definitely not a requirement for the operationalization milestone and definitely not related to logging.

tomschenkjr commented 7 years ago

I’ve been getting the success emails, both for update and no-update.

Are you confident that the script will work if the update fails?

(Sent as an email reply to the notification of Gene Leynes's comment, dated Thursday, June 22, 2017 10:53 AM.)

geneorama commented 7 years ago

I'm not completely sure what you mean by "[if] the update fails". I think these notifications are pretty good, but they're not infallible.

I put a stop into the model script and the logic worked as I expected, so any exception in the script will generate an email that indicates a failure. Also, the email will tell you which script generated the exception.

If something really bad happens like the analytics server is down, then we'll stop getting "success" emails.

I didn't program to every conceivable scenario, but I did program to try and anticipate things that seem likely. If something weird happens, like CDPH starts updating a different table for TRAPS and never tells us, then things could appear to work for a while and there would be no explicit error, but we'd miss the locations of new traps.

tomschenkjr commented 7 years ago

What I mean by “if the update fails” is pretty basic: your program runs, hits something unexpected, and sends an email alert about it.

(Sent as an email reply to the notification of Gene Leynes's comment, dated Thursday, June 22, 2017 11:13 AM.)

Chicago / west-nile-virus-predictions

Logging enhancement - file cleanup and email on action #31