google / magika

Detect file content types with deep learning
https://google.github.io/magika/
Apache License 2.0

`magika -r /nonexistant` exits with 0 #780

Open reyammer opened 2 hours ago

reyammer commented 2 hours ago

Consider:

$ uv run magika -r tests_data
tests_data: No such file or directory (os error 2) (error)
$ echo $?
0

This currently exits with 0. I would have expected a non-zero exit code (this is one of the assumptions of the GitHub workflow test suite).

It's unclear what the cleanest behavior is. If at least one file gets predicted without errors, we could still return 0; but it's weird to return "no problems" when not a single file has been successfully scanned. Maybe we return non-zero if no files have been scanned, and zero otherwise?

Thoughts @ia0?

ia0 commented 2 hours ago

Nice catch. I see multiple options which can be controlled by a flag:

I would personally be fine with only the first option and no flag.

reyammer commented 2 hours ago

I like the first option.

To be specific, we could do the following: in case of errors with one or more files, I would keep going anyway and scan everything we can, then exit with 1 at the very end to signal "something went wrong with at least one file", where "went wrong" means something like a permission error, a file that does not exist, and the like.
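
As a rough illustration of that behavior, here is a minimal Python sketch. It is not magika's actual code: `scan_file` and `scan_all` are hypothetical names, the "detection" step is a placeholder that only reads the first bytes of the file, and recursion (`-r`) is left out. It only shows the control flow: record each per-file error, keep scanning, and derive the exit code at the end.

```python
from pathlib import Path


def scan_file(path: Path) -> str:
    """Hypothetical stand-in for the real detection step."""
    with path.open("rb") as f:
        head = f.read(16)
    return "empty" if not head else "data"


def scan_all(paths: list[str]) -> int:
    """Scan every path, report per-file errors, and return the exit code."""
    had_error = False
    for raw in paths:
        try:
            label = scan_file(Path(raw))
            print(f"{raw}: {label}")
        except OSError as err:
            # Missing file, permission denied, and the like:
            # report the problem and keep going with the remaining paths.
            print(f"{raw}: {err.strerror or err} (error)")
            had_error = True
    # Non-zero only at the very end, if at least one path failed.
    return 1 if had_error else 0
```

A command-line entry point could then end with something like `sys.exit(scan_all(sys.argv[1:]))`, so that a run over a single nonexistent path prints the error line and exits with 1 instead of 0.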