"No success message": success regex isn't reliable indicator of "success"

mattpr commented 10 months ago

Describe the bug

Stripping metadata from PDF files during a "build" process, I'm hitting an error due to a regex mismatch.

Error: No success message: 0 image files updated

The regex (/1 image files? (?:created|updated)/i) only matches when exiftool reports that 1 image file was created or updated.

However, because we are stripping tags that may or may not be there, the output from exiftool may look like the following, which makes exiftool-vendored blow up:

% exiftool -author= "path/to/some.pdf"
    0 image files updated
    1 image files unchanged
% echo $?
0

...the above looks like success to me. exiftool didn't do anything because some.pdf did not have an Author tag.

Unfortunately this lib just throws an error in these cases. Sure I can catch it and ignore it...but if exiftool actually failed (missing binary, input file doesn't exist, etc)...I want to know and parsing error messages from your lib feels like a brittle way to do this.

% exiftool -author= "path/to/does-not-exist.pdf"
Error: File not found - path/to/does-not-exist.pdf
    0 image files updated
    1 files weren't updated due to errors
% echo $?
1

I think it would be better to rely on exit code from exiftool rather than trying to parse stderr/stdout, and to surface stderr/stdout only when you get a non-zero exit.

But the quick fix would be to add unchanged to the regex.
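
For illustration, the quick fix could look roughly like the sketch below. It only adds `unchanged` to the alternation quoted above; the `looksSuccessful` name is made up for this example and is not part of exiftool-vendored.

```typescript
// Hypothetical widened pattern: the original regex plus "unchanged".
// This is a sketch of the quick fix, not the library's actual code.
const successRE = /1 image files? (?:created|updated|unchanged)/i

// Hypothetical helper: true when exiftool's stdout has a matching summary line.
function looksSuccessful(stdout: string): boolean {
  return successRE.test(stdout)
}
```

With this pattern the no-op output above matches, while the "File not found" output still does not, because neither of its summary lines reads "1 image files ...".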

mceachen commented 10 months ago

Thanks for bringing this issue up!

> I think it would be better to rely on exit code from exiftool

For performance reasons, I run exiftool in "batch" mode, so there's no exit code to check.

The errors (like file not found) are easy to handle, as ExifTool will emit them to stderr. I think it's defensible to throw an error if exiftool writes anything to stderr.
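
A minimal sketch of that stderr rule, assuming the text exiftool wrote to stderr for one command has already been collected into a string. The `throwIfStderr` name is hypothetical and this is not the library's implementation.

```typescript
// Hypothetical helper: treat any non-blank stderr text as a failure.
// Assumes `stderr` holds everything exiftool wrote to stderr for one command.
function throwIfStderr(stderr: string): void {
  const message = stderr.trim()
  if (message.length > 0) {
    throw new Error(message)
  }
}
```

An empty or whitespace-only string passes silently, so a run that only prints summary lines to stdout would not throw under this rule.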

> but if exiftool actually failed (missing binary, input file doesn't exist, etc)...I want to know and parsing error messages from your lib feels like a brittle way to do this.

Totally agreed. So, I could:

1. throw a custom error with some field you could check to see why it failed, or
2. only throw if ExifTool emits to stderr, and update ExifTool.write to return a Promise<something> that indicates the operation was a no-op.

I think I prefer 2. It's a breaking change, but I think it's better than what I've got currently.

So: what's something look like?

1. {state: "created" | "updated" | "unchanged"}
2. {noop: boolean}
3. {created?: number; updated?: number; unchanged?: number}
4. something else (suggestions welcome)

Option 1 distinguishes creates from updates, but checking a boolean (option 2) might be a bit simpler than messing with a string enum.

Option 3 requires consumers to query a bunch of fields, but supports writes to filenames with globs, like:

exiftool -Author= foo*.xmp
    1 image files updated
    6 image files unchanged
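
To make the glob case concrete, here is a rough sketch of how summary lines like these could be folded into per-state counts. `SummaryCounts` and `parseSummary` are names invented for this example, and the line pattern is an assumption based only on the outputs shown in this thread.

```typescript
// Hypothetical shape for the tallies; names are illustrative only.
interface SummaryCounts {
  created: number
  updated: number
  unchanged: number
}

// Sum the "N image files <state>" lines found in exiftool's stdout.
// Lines that don't fit the pattern (for example error summaries) are skipped.
function parseSummary(stdout: string): SummaryCounts {
  const counts: SummaryCounts = { created: 0, updated: 0, unchanged: 0 }
  const lineRE = /^\s*(\d+) image files? (created|updated|unchanged)\s*$/i
  for (const line of stdout.split(/\r?\n/)) {
    const m = lineRE.exec(line)
    if (m == null) continue
    const key = m[2].toLowerCase() as keyof SummaryCounts
    counts[key] += parseInt(m[1], 10)
  }
  return counts
}
```

For the glob output above, this would report one updated file and six unchanged files, which a single enum or boolean cannot express.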

mceachen commented 10 months ago

OK, I'm implementing Option 3:

export interface WriteTaskResult {
  created: number
  updated: number
  unchanged: number
  warnings?: string[]
}
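
For what it's worth, a caller could recover the "was this a no-op?" answer from that shape with a small check like the one below. The interface is copied from above; `isNoop` is a hypothetical helper for illustration, not something the library is known to export.

```typescript
// Copied from the interface above so the sketch is self-contained.
interface WriteTaskResult {
  created: number
  updated: number
  unchanged: number
  warnings?: string[]
}

// Hypothetical helper: a write is a no-op when nothing was created or updated.
function isNoop(result: WriteTaskResult): boolean {
  return result.created === 0 && result.updated === 0
}
```

In the PDF case from the original report, the result would presumably carry `unchanged: 1` with the other counts at zero, so `isNoop` returns true and nothing needs to be caught.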

mattpr commented 10 months ago

Hey! Thanks for the fast response and turnaround on a patch. I appreciate it!

photostructure / exiftool-vendored.js

"No success message": success regex isn't reliable indicator of "success" #162