mxmlnkn / ratarmount

Access large archives as a filesystem efficiently, e.g., TAR, RAR, ZIP, GZ, BZ2, XZ, ZSTD archives
MIT License

swallows directories with same name #142

Closed chripo closed 2 months ago

chripo commented 2 months ago

Thanks for the project!

I've run into problems mounting directories with the same name. The cause appears to be that each path is reduced to its last folder name: https://github.com/mxmlnkn/ratarmount/blob/e89578d71fe493e6b26efac5190926d6049f65e8/ratarmount.py#L600

mkdir -p foo/mnt
cd foo
for i in a b c; do mkdir -p $i/files; echo $i > $i/files/$i; done
ls */files/*
# a/files/a  b/files/b  c/files/c
ratarmount ./a/files ./b/files ./c/files ./mnt/
ls  ./mnt/
# a

I think reducing mountSources to a simple de-duplicated list would simplify the code, but it would require a larger refactoring of SubvolumesMountSource.
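To illustrate the collision, here is a minimal sketch. It assumes the mount sources end up in a dict keyed by the last path component; it is not the actual ratarmount code, and the variable names are made up for the example. It also shows what a de-duplicated list of the same paths would look like.

```python
import os

paths = ["./a/files", "./b/files", "./c/files", "./a/files"]

# Assumption: sources are stored in a dict keyed by the last folder name.
# All paths share the key "files", so only one entry remains.
# Which entry survives in ratarmount itself is not modeled here.
by_last_folder = {os.path.basename(os.path.normpath(p)): p for p in paths}

# Alternative: an order-preserving, de-duplicated list keeps every distinct path.
deduplicated = list(dict.fromkeys(paths))

print(len(by_last_folder))
print(deduplicated)
```

The dict ends up with a single key, while the list keeps all three distinct directories and drops only the repeated `./a/files`.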

mxmlnkn commented 2 months ago

Hi there,

thanks for the detailed bug report! I can reproduce the problem. My expectation would also have been that all three files should be visible. Your linked code location looks like the culprit. I can't remember why I didn't think about colliding keys when writing that code. I'm pretty sure this bug got introduced when adding SubvolumesMountSource.

I don't fully understand your conclusions. Your example would use UnionMountSource, not SubvolumesMountSource. The latter is used when you specify the --disable-union-mount option.

I think as a quick fix it should be fine to reintroduce a simple list to be given to UnionMountSource. I'm not sure how to handle SubvolumesMountSource. It was not intended for mounting multiple archives with the same name... I guess the names could be made unique by adding suffixes, but that also has detriments.
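The suffix idea could look roughly like the following sketch. The helper `unique_names` and the `.N` suffix format are assumptions made for illustration, not anything ratarmount provides.

```python
import os


def unique_names(paths):
    """Return one name per path, adding '.N' suffixes to repeated folder names."""
    used = set()
    names = []
    for path in paths:
        base = os.path.basename(os.path.normpath(path))
        name, counter = base, 1
        # Keep incrementing the suffix until the name is not taken yet.
        while name in used:
            counter += 1
            name = f"{base}.{counter}"
        used.add(name)
        names.append(name)
    return names


print(unique_names(["./a/files", "./b/files", "./c/files"]))
```

One drawback of this particular scheme is that the resulting names depend on the order of the command line arguments, so the same directory can show up under a different name in another invocation.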

I don't see what I would refactor in SubvolumesMountSource. It rather needs a redesign of its arguments once the above conundrum is resolved. What refactoring did you have in mind?

chripo commented 2 months ago

My thought is to simplify the code by refactoring mountSources into a list. But then it gets passed into SubvolumesMountSource, which uses the dict as a lookup. At the moment I don't understand all the details of SubvolumesMountSource, but I think it would also help to simplify its internals.

chripo commented 2 months ago

You are right, in this case it gets passed into UnionMountSource. But SubvolumesMountSource would still need to be adjusted.

chripo commented 2 months ago

SubvolumesMountSource could build its own LUT/dict in its constructor.
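A rough sketch of that idea, using a hypothetical stand-in class rather than the real SubvolumesMountSource: the caller passes a plain list of (path, source) pairs and the constructor derives the lookup table itself. Raising on duplicate names is only a placeholder policy, since the thread leaves open how collisions should be handled.

```python
import os


class SubvolumeLookupSketch:
    """Hypothetical stand-in, not ratarmount's SubvolumesMountSource."""

    def __init__(self, sources):
        # sources: list of (path, mount_source) pairs supplied by the caller.
        self._by_name = {}
        for path, source in sources:
            name = os.path.basename(os.path.normpath(path))
            if name in self._by_name:
                # Placeholder policy: refuse colliding names instead of
                # silently overwriting an earlier entry.
                raise ValueError(f"Duplicate subvolume name: {name!r}")
            self._by_name[name] = source

    def lookup(self, name):
        """Return the source mounted under the given name, or None."""
        return self._by_name.get(name)


volumes = SubvolumeLookupSketch([("./a.tar", "source A"), ("./b.tar", "source B")])
print(volumes.lookup("a.tar"))
```

With this split, the calling code only deals with a list, and the naming policy lives in one place.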