The way `ls` updates dircache entries for recursive listings (which may contain files across multiple parent directories) needs to be improved.
The current implementation does not add every subdirectory from the recursive listing to the cache individually; it only adds the first one encountered.
Original discussion:
I think this goes in the direction of my forgotten debugging statement in the original PR, which was collecting all the directories that were part of the list_objects response.
We currently have this, which I believe is not enough:
# assumes that the returned info is name-sorted.
pp = self._parent(info[0]["name"])
Rather, we need to add every parent directory contained in the response to the cache (not just that of the first item), as you mentioned above. That's an easy enough fix, but it should be accompanied by a corresponding test.
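A minimal sketch of what the fix could look like, assuming the listing is a list of info dicts with a `"name"` key and that the dircache maps a directory path to the list of entries directly inside it. The helpers `group_by_parent` and `update_dircache` are hypothetical names for illustration, and `posixpath.dirname` stands in for `self._parent`:

```python
import posixpath
from collections import defaultdict


def group_by_parent(infos):
    """Group listing entries by the parent directory of their "name" field."""
    grouped = defaultdict(list)
    for entry in infos:
        # Stand-in for self._parent(); strips a trailing slash first so that
        # directory-like names resolve to their containing directory.
        parent = posixpath.dirname(entry["name"].rstrip("/"))
        grouped[parent].append(entry)
    return dict(grouped)


def update_dircache(dircache, infos):
    """Add one cache entry per parent directory found in the listing."""
    for parent, entries in group_by_parent(infos).items():
        # Assumption: overwriting an existing entry is acceptable here.
        dircache[parent] = entries


# Hypothetical recursive listing spanning three parent directories.
infos = [
    {"name": "repo/main/a.txt"},
    {"name": "repo/main/dir/b.txt"},
    {"name": "repo/main/dir/c.txt"},
    {"name": "repo/main/other/d.txt"},
]
cache = {}
update_dircache(cache, infos)
print(sorted(cache))  # ['repo/main', 'repo/main/dir', 'repo/main/other']
```

Unlike the `info[0]` approach, this does not depend on the response being name-sorted. One open question for the real implementation: if the recursive response only contains objects and no directory entries, the cached listing for `repo/main` would lack its subdirectories, so a test should probably cover a subsequent non-recursive `ls` on a parent directory as well.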
_Originally posted by @AdrianoKF in https://github.com/aai-institute/lakefs-spec/pull/198#discussion_r1420246607_