dagolden / Path-Tiny

File path utility
41 stars 59 forks source link

Misleading error message for spew in non-existent dir #290

Closed johannessen closed 5 months ago

johannessen commented 6 months ago

When I tell Path::Tiny 0.144 (on Perl v5.39.9) to spew or spew_raw a file into a directory that hasn't been created yet, the error message misleadingly states the wrong file name:

perl -MPath::Tiny -e 'path("dir_does_not_exist/file")->spew("1")'

Expected:

Error sysopen on 'dir_does_not_exist/file': No such file or directory at -e line 1.

Actual:

Error sysopen on 'dir_does_not_exist/file4668830033674': No such file or directory at -e line 1.
ap commented 6 months ago

Only thing is that that is in fact the name of the file Path::Tiny actually did try to open. If it had succeeded in creating that file and writing the contents to it then it would afterwards have renamed that file to dir_does_not_exist/file. It does this dance because if it didn’t, and some other process was looking for dir_does_not_exist/file, then the other process might open it before Path::Tiny was finished writing to the file. But the OS filesystem layer promises that renaming a file will not be performed before the file is fully flushed to disk, so Path::Tiny does this dance in order to cash in that guarantee.

But now there is a problem with the error message: what you think is misleading is actually what’s really going on – and if the error message named the file you passed, rather than the one Path::Tiny actually tried to open, then that would in fact be misleading. But of course you’re right that the error is confusing if you’re not aware of what’s going on.

@xdg: maybe the answer here is to mention both in the error message? (Something along the lines of Error sysopen on 'path/to/file39823948743' (temporary name for 'path/to/file'). Could get longwinded though… hmm.)

karenetheridge commented 6 months ago

How about we check if the directory exists first, and then the Error can be "Error: target directory 'dir_does_not_exist' does not exist" ? That might be racey, but for 99.9999% of cases we can avoid the confusing error.

johannessen commented 6 months ago

I do remember reading about the renaming (which of course is good practice) in the docs. Thanks for reminding me; now at least I understand what’s going on. The fact is, even though I did read it in the past, I didn’t think of it when I encountered this issue just then.

Perhaps I would have thought of it if the temporary name hinted at the fact it’s temporary. Doesn’t have to be something in /tmp/..., but what about a name like dir_does_not_exist/file-tmp4668830033674?

ap commented 6 months ago

How about we check if the directory exists first, and then the Error can be "Error: target directory 'dir_does_not_exist' does not exist" ? That might be racey, but for 99.9999% of cases we can avoid the confusing error.

It’s an option, but I’m -1 on reporting that instead of the actual error. If I had to debug an error I’d want to know the specific operation that failed and the actual error it got, not just the error-handling code’s guess at the reason for the error.

And at that point the error starts to get rather wordy. It’s also extra code to handle just one particular case. (It might not even be the immediate parent directory that’s missing, but one further up the path. Do you try to handle that? What about permissions issues? Etc.) I’m curious how @xdg feels about it but personally I would be disinclined.

Perhaps I would have thought of it if the temporary name hinted at the fact it’s temporary. Doesn’t have to be something in /tmp/..., but what about a name like dir_does_not_exist/file-tmp4668830033674?

Not sure. Remember that path names are limited in length; if you look at the _replacement_path method, it already has code to check whether there’s even enough room for the original filename, and it will omit that part and use only the random string if it is forced to. Now on one hand, that code already exists and it wouldn’t be a stretch to extend it to first try to include a tmp prefix after the original filename, then try with just the original filename, and only then use just the random string. But OTOH, that still won’t help with diagnosing issues in cases where the code is forced to omit the tmp prefix.

(As for /tmp/, it actually can’t go there for a number of reasons. One is that it might be on a different filesystem, in which case renaming is not even possible. (Even on the same filesystem, the guarantees regarding renames across directories aren’t necessarily as solid as within the same directory.) More importantly, if you put the temporary file in the same directory then the right permissions are checked up front, whereas otherwise you only get them in the rename step, and depending on filesystem you might not get exactly the same effects either (e.g. if the target directory has any sticky bits – the created file might end up with different permissions and/or ownership than if you created it in the target directory).)

@xdg: idle thought after typing this all up: maybe the fact that the code goes to some lengths to place the file in the target directory and only change the final path component means that the error message could omit the directories for the original path, so more like Error sysopen on 'path/to/file39823948743' (using temporary filename for 'file'). Then again the code does first call _resolve_symlinks, so the path passed to the syscall might also have a different path part than what the caller passed in – so maybe in that case (only!) mention the full original path (e.g. Error sysopen on 'path/to/file39823948743' (resolved path with temporary filename for 'symlinkeddir/to/file'))? Basically what I’m trying is to make the error message as truthful as possible.

johannessen commented 6 months ago

Being truthful sounds good to me!

Another idle thought: Error sysopen on 'path/to/file39823948743' (temp file)? Not as informative, but shorter than stating both file names.

ap commented 6 months ago

Or maybe “temporary name”?

Anyway, let’s see what @xdg thinks.

xdg commented 5 months ago

Hi. Catching up here. Would something like this be what we want?

$ perl -Ilib -MPath::Tiny -e 'path("dir_does_not_exist/file")->spew("1")'
Error spew on 'dir_does_not_exist/file': error opening temp file 'dir_does_not_exist/file42313595292579' for atomic write: No such file or directory at -e line 1.
ap commented 5 months ago

Yes, nice. LGTM

johannessen commented 5 months ago

0.145-TRIAL looks good to me so far. Thanks!

xdg commented 5 months ago

Thanks for the feedback! I have a calendar reminder to ship stable next week after I check CPAN Testers.