RJVB / afsctool

This is a version of "brkirch"'s afsctool utility that allows end-users to leverage HFS+ compression.
https://brkirch.wordpress.com/afsctool
GNU General Public License v3.0

-t and -i options usage #57

Open kapitainsky opened 2 years ago

kapitainsky commented 2 years ago

I have spent a long time trying to figure out how to compress files while excluding specific extensions, and I cannot figure out how to do it.

Let's say I want to compress all files except JPG,jpg,JPEG,jpeg

Now when I try:

afsctool -c -i -t JPG,jpg,JPEG,jpeg -v -T LZFSE .

it still compresses these files.

I know I can use the -s option, but the objective here is to speed up compressing many TB of data where many files are simply not compressible, so it is much better to exclude them from the start.

I have played with the -t option but genuinely can't figure out how to use it to get the expected results, for neither exclusion nor inclusion.

Has anybody used these options successfully and could give me a hint about what I am doing wrong?

kapitainsky commented 2 years ago

A simple attempt to list the compression status of txt files only:

afsctool -t txt .

leads to the program entering a never-ending loop and printing the info below forever:

File content type: public.plain-text
File extension(s): txt
Number of HFS+/APFS compressed files: 1

Number of HFS+/APFS compressed files: 4
/disk/txt: No such file or directory
/disk/.:

It looks like the argument parser gets it wrong.

RJVB commented 2 years ago

I did not implement the -t option and don't think I even messed with the code except maybe to let it work while compressing too (and/or to add the explanation on the 2nd line).

I'm not at my Mac these days but looking at the code it seems that the selection feature only works when you point afsctool to folder(s), not if you give a list of individual files to compress (IMHO that makes sense).

There's an undocumented "ALL" content type which should allow you to test the feature; -i -t ALL should mean no files of the target folder will be compressed.

kapitainsky commented 2 years ago

Thank you for the quick reply, but I am not sure I understand.

I have just tried:

afsctool -c -t txt .

My understanding is that this should compress all txt files in the current folder.

It does not. It enters a never-ending loop, displaying again and again:

/disk/-t: No such file or directory
/disk/txt: No such file or directory

kapitainsky commented 2 years ago

To be precise, I think it works (the txt file was compressed). It looks like -t is parsed correctly, but then, because you allow multiple folders to be specified, your code also treats -t and txt as if they belong to the list of folders.

kapitainsky commented 2 years ago

The same happens when I try to invert the selection:

afsctool -c -i -t txt .

Result: looping with the message:

Totals of file content types
Number of HFS+/APFS compressed files: 0

Number of HFS+/APFS compressed files: 1
/disk/-i: No such file or directory
/disk/-t: No such file or directory
/disk/txt: No such file or directory
/disk/.:

Bottom line is that the parsing of the -i and -t options is broken.

RJVB commented 2 years ago

This is why I put in plenty of conditional code to be able to test everything but the (de)compression on Linux... I can reproduce your issue, and will try to fix it soonish.

It seems it's only the -t parsing that goes wrong somewhere, though.

BTW, -t takes a single argument only; it has to be specified multiple times if you want to declare multiple file types/extensions.
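As a rough sketch of what "specified multiple times" means for the command from the original question: the snippet below expands a list of extensions into one -t pair each and prints the resulting command line instead of running it. `build_type_args` is a hypothetical helper made up for this illustration, not part of afsctool, and whether the exclusion then behaves as expected still depends on the -t parsing problem discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of afsctool): emit one " -t <ext>" pair
# per argument, because -t accepts only a single type/extension.
build_type_args() {
    for ext in "$@"; do
        printf ' -t %s' "$ext"
    done
}

# The extensions are the ones from the original question.
type_args=$(build_type_args JPG jpg JPEG jpeg)

# Print the command instead of executing it, so it can be reviewed first.
echo "afsctool -c -i$type_args -v -T LZFSE ."
```

This prints `afsctool -c -i -t JPG -t jpg -t JPEG -t jpeg -v -T LZFSE .`, which can then be run by hand.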

In the meantime, please use an external solution (find?). Or trust that the kind of files you want to exclude will be rejected quickly because a chunk that "compresses" to a larger size is encountered very quickly (so don't use the -L option nor LZVN compression).

kapitainsky commented 2 years ago

This is why I put in plenty of conditional code to be able to test everything but the (de)compression on Linux... I can reproduce your issue, and will try to fix it soonish.

This is great news! Thank you.

-t takes a single argument only; it has to be specified multiple times if you want to declare multiple file types/extensions.

This is fine, as long as it is documented. But this is a detail.

In the meantime, please use an external solution (find?). Or trust that the kind of files you want to exclude will be rejected quickly because a chunk that "compresses" to a larger size is encountered very quickly (so don't use the -L option nor LZVN compression).

yes I could do this with 'find' but then I lose parallel processing - so it becomes real slow. In my case I want to process a dataset of many TB which can contain hundreds of thousands of files. I know that many of these files can't be compressed anyway, so I want to speed up the whole process by excluding them: the obvious suspects like jpg, zip, gz etc.

kapitainsky commented 2 years ago

And BTW, afsctool is fantastic - an extremely useful tool.

RJVB commented 1 year ago

yes I could do this with 'find' but then I lose parallel processing - so it becomes real slow.

No! If you activate parallel processing with -j or -J, the files specified on the command line will be added to a queue (regardless of whether you specify the folders they are in, or individual files). That queue is then emptied by worker threads.
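One possible reading of this suggestion, sketched under assumptions rather than confirmed in this thread: let find do the filtering and pass many file names to each afsctool invocation, so that the worker threads share one queue. The `list_candidates` helper, the list of excluded extensions, and the `-J4` spelling of the worker count are all assumptions made for illustration. The afsctool call is skipped when the tool is not installed.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under directory $1, skipping
# extensions that are usually already compressed (case-insensitive).
# Output is NUL-separated so unusual file names survive the pipe.
list_candidates() {
    find "$1" -type f \
        ! -iname '*.jpg' ! -iname '*.jpeg' \
        ! -iname '*.zip' ! -iname '*.gz' \
        -print0
}

# Pass many files to each afsctool run so its worker threads can drain
# one queue. "-J4" (four workers) is an assumed form of the -J option.
if command -v afsctool >/dev/null 2>&1; then
    list_candidates . | xargs -0 afsctool -c -J4 -T LZFSE
fi
```

xargs batches as many file names as fit on one command line, so a very large tree results in a few afsctool runs rather than one per file. The `{} +` form of `find -exec` batches in the same way, whereas `{} \;` starts one process per file and leaves the worker threads with a single file each.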

kapitainsky commented 1 year ago

yes I could do this with 'find' but then I lose parallel processing - so it becomes real slow.

No! If you activate parallel processing with -j or -J, the files specified on the command line will be added to a queue (regardless of whether you specify the folders they are in, or individual files). That queue is then emptied by worker threads.

OK, so it is a workaround of sorts. Thank you for the clarification.

But clearly you are suggesting something other than:

find pattern -exec afsctool

??