mcandre / toys

code demos for newbies
https://github.com/mcandre/toys

Defend against spaces in UNIX find results #488

Closed. mcandre closed this issue 5 months ago

mcandre commented 5 months ago

Plain xargs blows up on large data sets.

xargs -n 1 is slow.

xargs -n 100 is vulnerable to spaces: it splits input on whitespace, so a path containing a space becomes several arguments.

xargs -0 -n 100 isn't strictly POSIX compatible.

Shell for loops over find output exhibit many of the same problems.
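
For reference, a minimal sketch of the xargs -0 -n 100 variant mentioned above. It assumes a find and xargs that support -print0 and -0 (GNU and BSD implementations do). The scratch directory and file names are hypothetical, and mktemp -d is itself an assumption about the host system.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.txt" "$demo_dir/b c.txt"

# -print0 ends each path with a NUL byte; -0 makes xargs split on NUL only,
# so the space in "b c.txt" stays inside one argument.
# -n 100 passes at most 100 paths to each invocation of ls.
listing="$(find "$demo_dir" -type f -print0 | xargs -0 -n 100 ls -1)"
printf '%s\n' "$listing"

rm -rf "$demo_dir"
```

Dropping -print0 and -0 from the same pipeline would hand ls the pieces "b" and "c.txt" as separate arguments.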

There is an IFS ... read -r ... snippet that does the work reliably and portably. I have to look up the syntax again. It's an ugly block of code that won't work in scripting contexts that limit syntax to very simple, exec-style commands.
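
One common shape of that idiom, as a rough sketch. This is not necessarily the exact snippet meant here. The scratch directory, the file names, and the wc -c body are hypothetical stand-ins for real work, and mktemp -d is assumed to be available.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir="$(mktemp -d)"
touch "$demo_dir/plain.txt" "$demo_dir/has space.txt"

# IFS= preserves leading and trailing whitespace; -r keeps backslashes literal.
# Each line of find output is read as one path, so spaces are safe.
# Paths that contain newlines would still be split.
results="$(
    find "$demo_dir" -type f -name '*.txt' |
    while IFS= read -r path; do
        # wc reads from its own redirect, so it does not eat the find output.
        printf 'size=%s path=%s\n' "$(wc -c <"$path" | tr -d ' ')" "$path"
    done
)"
printf '%s\n' "$results"

rm -rf "$demo_dir"
```

The loop runs in a subshell because it sits on the right side of a pipe, so variables set inside it do not survive. Capturing the output, as above, is one way around that.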

mcandre commented 5 months ago

https://stackoverflow.com/a/36375034

Like, zoinks, Scoob.

mcandre commented 5 months ago

That IFS read snippet will be a chore to integrate into larger scripting systems that wrap shell, like CI/CD configurations.

However, -print0 / -0 would make post-processing the results with tools like grep more difficult.

mcandre commented 5 months ago

We don't always produce the input data to xargs with find. Sometimes we use a completely different application, like the stank shell script search tool.

Do we implement a -print0 flag there? What about tools that we do not maintain, that are likely to scoff at adding the feature?

mcandre commented 5 months ago

One nice thing about sending the data without null delimiters is that it is easy to tweak the results with grep. This is really useful for filtering out junk files from assorted applications.

Without this ability, we have to wait through longer development cycles for any narrowing of results to be supported upstream.
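
A small sketch of that kind of filtering, combining newline-delimited find output with grep -v and a read loop. The junk patterns (.DS_Store, editor backups ending in ~) and the file names are hypothetical examples, and mktemp -d is assumed to be available.

```shell
# Hypothetical scratch tree, for illustration only.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/src"
touch "$demo_dir/src/main script.sh" "$demo_dir/src/.DS_Store" "$demo_dir/src/notes.sh~"

# grep -v drops junk lines before the paths reach the loop.
# Each -e adds one pattern; a line matching any of them is removed.
kept="$(
    find "$demo_dir" -type f |
    grep -v -e '/\.DS_Store$' -e '~$' |
    while IFS= read -r path; do
        # ${path##*/} strips everything up to the last slash (POSIX expansion).
        printf '%s\n' "${path##*/}"
    done
)"
printf '%s\n' "$kept"

rm -rf "$demo_dir"
```

The same grep stage works after any tool that prints one path per line, not only find.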

mcandre commented 5 months ago

We should also adopt -execdir.
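
A hedged sketch of what -execdir usage could look like, assuming a find that supports it (GNU and BSD implementations do). The scratch directory and file names are hypothetical, and basename stands in for whatever command would really run.

```shell
# Hypothetical scratch tree, for illustration only.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/sub dir"
touch "$demo_dir/sub dir/has space.txt"

# -execdir runs the command from the directory that holds each match,
# so {} is the file's base name (GNU find prefixes it with ./).
# The path is passed as a single argument, so spaces are safe.
# GNU find refuses -execdir when PATH contains relative or empty entries.
names="$(find "$demo_dir" -type f -name '*.txt' -execdir basename {} \;)"
printf '%s\n' "$names"

rm -rf "$demo_dir"
```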

mcandre commented 5 months ago

Briefly experimented with sorting find results, but there are few portable options. BSD find's -s flag means something completely different in GNU find, and is unsupported in POSIX find.

Not that POSIX find has received -execdir yet.
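
On the sorting point, one option that stays within POSIX utilities is to pipe newline-delimited results through sort. It shares the newline limitation discussed earlier in this thread, so treat it as a sketch under that assumption. The scratch directory and file names are hypothetical, and mktemp -d is assumed to be available.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir="$(mktemp -d)"
touch "$demo_dir/b.txt" "$demo_dir/a b.txt" "$demo_dir/c.txt"

# LC_ALL=C forces byte-wise ordering, so results do not vary with the locale.
# Running find from inside the directory keeps the printed paths relative.
sorted="$(cd "$demo_dir" && find . -type f | LC_ALL=C sort)"
printf '%s\n' "$sorted"

rm -rf "$demo_dir"
```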