hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0
275 stars 27 forks source link

Fix parse error when filename contains ']' #181

Open wesley800 opened 1 year ago

wesley800 commented 1 year ago

As title says.

BTW I also changed all strstr aiming for prefix matching to strncmp. I wonder if all these changed positions needs prefix matching, or precise matching? If the latter, could we just use strcmp instead?

hasse69 commented 1 year ago

Thanks for the PR. However, even though I can understand the patch related to ']' in a file name, I lack a proper example of when it does not work in order for me to test and confirm the change you suggest. Also, I am a bit reluctant to change the use of strstr() unless you can properly motivate such a change with whatever benefit it would bring to the table. Not saying it is wrong, just that the need is not instantly appearing to me. And to answer your question, exact matching is very likely not applicable where strstr() is used. If that would be the case I am rather certain strcmp() would have been the natural choice already.

wesley800 commented 1 year ago

A quick example, of which the non-patched version would complains ERAR_MISSING_PASSWORD.

echo -e '[[enc].rar]\npassword="test"' > conf
dd if=/dev/random of=rd.bin bs=4096 count=1
rar a -hptest '[enc].rar' rd.bin
mkdir -p mnt
rar2fs --config=$PWD/conf '[enc].rar' mnt

As for strstr, to be honest I can't follow all the code path well. e.g., I'm not very sure if s is always either empty or a prefix of rar here (https://github.com/hasse69/rar2fs/blob/master/src/rar2fs.c#L484). If so, the correctness would be obvious. I also want to be sure if the hash table is by design matching only the prefix of key, maybe for split rar files? And then the main purpose is to enhance the readability of the code, making it clear that prefix matching here is intended. Another reason is to avoid overhead of strstr, which AFAIK takes additional space and k(m+n) time for building KMP search array, while strncmp+strlen takes no additional space and only ~2n time, where n is the length of the "prefix", m the "original string", and k a factor around ~10. Anyway I'm also not confident if any strstr lies on FUSE file accessing code path, or all of them are just used during initialization. If the latter, the performance boost would be nearly meaningless.

hasse69 commented 1 year ago

Thanks for the short reproducer. I will try it and check your patch.

The 'strstr' example you provided is an obvious substring/prefix match. The input file name is an absolute path while the config file works mount point relative.

I will check your other 'strstr' replacements but I am pretty sure the purpose is (or used to be at least) to match on a possible sub-path and not for a guaranteed exact match. I hardly think you would see any real performance boost though by replacing 'strstr' with 'strlen' considering the simple use-case. Thus it would basically boil down to readability only and as such, highly subjective. But I can maybe agree that it should appear somewhat easier to understand the logic (due to lack of comments) if a function is used that by name reveal the intent.

wesley800 commented 1 year ago

But I can maybe agree that it should appear somewhat easier to understand the logic (due to lack of comments) if a function is used that by name reveal the intent.

Exactly, and simply adding a comment seems a better choice, considering the "It just works so don't touch anything not necessary" rule. Besides I also tried to compare the performance of these two methods and found almost no difference (https://gist.github.com/wesley800/a5473034f99ec1dc2bec41b6baa1f3cc).

hasse69 commented 1 year ago

Can you please file an issue report since this is an obvious bug in the configuration file parser and link to the pull request.