FileDownload: write .tmp files at destination and only rename to final pfn after successful validation

nikmagini commented 9 years ago

Hi,

This is a suggestion from Brian for the FileDownload agent, to avoid issues with file overwrite and to prevent incomplete files from appearing in the namespace too early. Change the FileDownload agent to:

Copy SiteA:/store/foo -> SiteB:/store/foo.tmp
Rename ("srmmv") SiteB:/store/foo.tmp -> SiteB:/store/foo

Note that in case of transfer failure, on the retry the FileDownload agent will need to clean up any .tmp file left behind by a previous attempt.

Any .tmp file left behind after that can be cleaned up as dark data during regular consistency check campaigns
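
For illustration only, the copy, validate and rename sequence proposed above could be sketched as follows. This is a minimal sketch on a local filesystem, not PhEDEx code: `download_via_tmp` and `sha256_of` are hypothetical names, `shutil.copyfile` stands in for the real transfer, `os.replace` stands in for "srmmv", and the SHA-256 comparison is an assumed stand-in for whatever validation the site actually runs.

```python
import hashlib
import os
import shutil


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_via_tmp(source, destination, expected_sha256):
    """Copy source to destination + '.tmp', validate, then rename into place.

    Hypothetical helper: the local copy and rename are assumptions standing
    in for the remote transfer and the "srmmv" step.
    """
    tmp_path = destination + ".tmp"

    # Clean up a .tmp file possibly left behind by a previous failed attempt.
    if os.path.exists(tmp_path):
        os.remove(tmp_path)

    shutil.copyfile(source, tmp_path)

    # Validate before the file becomes visible under its final name.
    if sha256_of(tmp_path) != expected_sha256:
        os.remove(tmp_path)
        raise ValueError("validation failed for " + destination)

    # On POSIX this rename is atomic when both paths share a filesystem,
    # so any existing destination is only replaced by a validated copy.
    os.replace(tmp_path, destination)
```

In this sketch a failed validation leaves any existing file at the destination untouched, and the final name never points at a partially written file.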

bbockelm commented 9 years ago

To be clear, you probably want to use a randomized filename, or prefix it with ".", to avoid collisions with "real" files.
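
As a rough sketch of that naming idea, a temporary name can combine both options: a leading "." and a random component. The helper name `temp_name_for` and the exact pattern are assumptions made for illustration, not an existing PhEDEx convention.

```python
import os
import uuid


def temp_name_for(final_path):
    """Build a hidden, randomized temporary path in the same directory.

    Hypothetical helper: the ".<name>.<random>.tmp" pattern is an assumption.
    """
    directory, basename = os.path.split(final_path)
    # The leading "." keeps the name apart from "real" files; the random
    # part keeps separate attempts from colliding with each other.
    return os.path.join(directory, ".{}.{}.tmp".format(basename, uuid.uuid4().hex))
```

One consequence of a random component is that a retry cannot predict the name left behind by an earlier attempt, so cleanup would have to match the pattern rather than a fixed `.tmp` path.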

TonyWildish commented 9 years ago

Is there a real use-case for this? Do we have a problem that really needs to be solved like this?

bbockelm commented 9 years ago

I think the two cases are:

Avoiding deleting the existing file until you know there's a fully-downloaded-and-verified new copy locally.
Preventing files from appearing in the CMS namespace (for things like overflow and other Xrootd access) until they have been verified. Right now, downloads-in-progress are accessible to jobs.

TonyWildish commented 9 years ago

For the first case, presumably we are downloading because the existing copy is no good, so I don't see the need to preserve the original copy?

For the second case, an alternative is for xrootd to not serve a file that was modified in the last N seconds. 20-30 seconds ought to be enough.
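
To make the "modified in the last N seconds" idea concrete, the check itself could look like the sketch below. It is not xrootd configuration or an xrootd API; `is_settled` and its parameters are hypothetical, and it only illustrates the modification-time comparison a server-side hook might perform.

```python
import os
import time


def is_settled(path, quiet_seconds=30.0, now=None):
    """Return True if path was last modified at least quiet_seconds ago.

    Hypothetical helper: 'now' can be passed in to make the check testable.
    """
    current = time.time() if now is None else now
    return current - os.stat(path).st_mtime >= quiet_seconds
```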

So I don't see strong use-cases here. I'm loath to do something like this for a couple of reasons:

1) it will inevitably leave crud in the system. Instead of leaving a single corrupt file it can leave multiple corrupt copies of that file, which can mean more dark data to clean out

2) it doubles the number of times PhEDEx has to touch the SE. Even if it's only for a rename operation, I prefer not to do that.

So I still need convincing...?

bbockelm commented 9 years ago

For the first case, that assumption led to data loss!

For the second case, why do you assume overflow accesses via Xrootd? The job may be accessing local files via POSIX.

TonyWildish commented 9 years ago

For 1: yes, I can believe that. But why? Why was the file being downloaded by PhEDEx if it was already there and good? Why didn't PhEDEx know that? Was there a bug in the sites' FileDownloadVerify? If so, let's fix that! Without numbers to convince me that this is something PhEDEx needs to tackle, I don't see the point. Let's hear what the real reason is for this supposed data loss before hacking the code; there's probably something else that should be fixed instead.

For 2, if the read is happening via POSIX then something (CRAB, or whatever tool is being used) thinks the file is there while PhEDEx thinks it isn't. So how did that job discover the file? I assumed it would be xrootd because that's the easy way to find files by local lookup without having to know which SE to look on first, otherwise the user is going to be specifically looking up files at that site in the first place. So if it's not xrootd, you're in a smaller corner-case than otherwise.

Put it this way: the problem you describe happens only if:

the tool opens the file after PhEDEx has started writing it, but before it has finished.
the file was good in the first case, before PhEDEx started writing it again, so the job would not have crashed
this in turn requires that the file was good but PhEDEx thinks it's bad, which cannot happen unless there are bugs elsewhere in the system.

That's a very small window of in-opportunity, and you have done nothing to convince me that it's frequent enough to be important, or that this is the right fix for the problem. Given that it can only arise because of errors elsewhere in the first place, I prefer that we address those instead.

If you have other arguments, and numbers to go with them, I'm happy to hear them. Otherwise it sounds like something to take up with the sites whose FileDownloadVerify scripts would appear to be broken.

dmwm / PHEDEX

FileDownload: write .tmp files at destination and only rename to final pfn after successful validation #984