jgoerzen / simplesnap

A simple and powerful way to send ZFS snapshots across a network
GNU General Public License v3.0

Premature exit when a zfs receive has an error #1

Open phuybre opened 10 years ago

phuybre commented 10 years ago

If zfs receive fails, simplesnap stops immediately and exits with exit code 100. The error is reported only in the log. Suggested improvement: store the error and continue with the next snapshot, then report the error via logerror (which sends an email) once everything is done.
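The suggested behaviour could look roughly like the sketch below. It is not simplesnap's actual code: `replicate_one` and `replicate_all` are hypothetical names, the dataset names are made up, and `replicate_one` only simulates a failing receive so that the loop can be run without ZFS.

```shell
# Sketch only: hypothetical helpers, not taken from simplesnap.

# Stand-in for the real "zfs send | zfs receive" step.
# It fails for one made-up dataset name so the loop can be demonstrated.
replicate_one() {
    [ "$1" != "tank/broken" ]
}

# Process every dataset, remember failures, and report them at the end.
replicate_all() {
    failed=""
    for ds in "$@"; do
        if replicate_one "$ds"; then
            echo "ok: $ds"
        else
            failed="$failed $ds"    # store the error and keep going
        fi
    done
    if [ -n "$failed" ]; then
        # In simplesnap this is where logerror would presumably be called.
        echo "replication failed for:$failed" >&2
        return 100
    fi
}
```

With this shape, `replicate_all tank/a tank/broken tank/b` still processes `tank/b` after the failure and only then returns 100.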

jgoerzen commented 10 years ago

Are you running simplesnap from cron? If not, in what situation did you not see the error?

It is true that the error is logged only to the log, but the exit 100 should trigger cron to generate an email anyhow.

It would be possible to add a message to stderr reporting the zfs receive exit code.

John
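A stderr message of that kind might be sketched as follows. `run_receive` is a hypothetical wrapper, not a function in simplesnap. It runs whatever command it is given, prints that command's exit code to stderr on failure, and returns 100 to match the exit code mentioned above. Since cron typically mails any output a job produces, a line on stderr would end up in the cron mail.

```shell
# Sketch only: run_receive is a hypothetical wrapper, not part of simplesnap.
run_receive() {
    "$@"                      # e.g. zfs receive <filesystem>
    receive_rc=$?
    if [ "$receive_rc" -ne 0 ]; then
        # Report the underlying exit code on stderr, not only in the log.
        echo "zfs receive exited with code $receive_rc" >&2
        return 100
    fi
}
```

It would sit at the receiving end of the pipeline, wrapping the existing `zfs receive` invocation.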

On 06/21/2014 01:12 AM, phuybre wrote:

> If zfs receive has an error, simplesnap stops immediately and exits with errorcode 100. [...]

phuybre commented 10 years ago

From cron. I still have to check why the zfs receive failed. It was probably caused by some heavy pruning of all the replicated snapshots on the destination machine.

Emails are sent for the "could not obtain lock...." message, so the basic error reporting is fine.

But the end result was that the filesystems scheduled after the faulty filesystem weren't replicated, which was a bit surprising.