Closed chriscz closed 7 years ago
I am interested in solving this problem
Hi @toondegryse,
The way I would go about it is abstracting all the file / directory methods to a single class. Then making two implementations of the class:
Implementing the second one is what will be the tricky part, as you must keep track of how the real filesystem has changed.
Using a dictionaries and sets may be useful here. Here's my idea of the base class:
class Filesystem(object):
def walk(path):
"""
Same behaviour as os.walk
"""
pass
def move(path, destination):
"""
Same behaviour as the two separate move_file and move_directory functions
"""
pass
def open(path, mode='r'):
"""
Returns an open filehandle for the file. This needn't be implemented
"""
pass
def list(path):
"""
Returns a tuple (dirs, files) for the given directory path
"""
pass
def exists(path):
"""
Does this path exist?
"""
def is_file(path):
"""
Is the given path a file?
"""
def is_dir(path):
"""
Is the given path a directory?
"""
Hi Chriscz,
There was no need to abstract the folder operations in order to add the -n feature. I've added the feature on https://github.com/toondegryse/pysorter. Please have a look before I pull it without abstracting the folder operations the Filesystem class, which I could still do.
Thanks, Toon
Hi Toon,
I'm don't think that's going to work, since imagine the scenario where you do a recursive sort and a directory would have been moved. During the dry-run, the directory would not be moved, and so the recursive traversal will enter the directory.
There is an alternative to abstracting, you would need to keep track of all the directories that have been moved, and exclude them from os.walk
. This is a simpler solution than the above.
You will need to write testcases that cover all these scenarios. I'd also recommend looking at the comments on PR #20.
Whoops, clicked close by accident.
Basically we can keep your solution, just slightly modify it, addressing the changes I suggested in PR #20 and removing the directories from os.walk
(you can repurpose self.no_recurse
). This should be a simple fix.
Great, I will have a look!
Hi Chris,
Could you clarify what you mean be removing the directories? Is following output what you would like to see or not?
move asong.mp3 --> /output/audio/asong.mp3 move meconflict.jpg --> /output/images/jpg/meconflict.jpg move conflictsong.mp3 --> /output/audio/conflictsong.mp3
I have implemented the changes suggested in PR #20 and added a test in test_sorter.py
Thanks! Toon
So consider the following scenario. The file / folder structure:
tosort/folder/movie.mp4
tosort/song.mp3
tosort/anotherdir/song.mp3
The rules
# filetypes.py
('[^|/]folder/$', 'directories/'),
('\.mp3$', 'music/'),
('\.mp4$', 'movie/')
The command: pysorter -r -p -l -f filetypes.py -d output
-l
) here-p
)The log we see on screen (skip is included here for clarity)
move folder --> output/directories/folder
move song.mp3 --> output/music/song.mp3
skip anotherdir/song.mp3
However if I understood your implementation correctly, then executing the command:
pysorter -r -p -n -f filetypes.py -d output
I would see:
move folder --> output/directories/folder
move folder/movie.mp4 --> output/movie/movie.mp4
move song.mp3 --> output/music/song.mp3
move anotherdir/song.mp3 --> output/music/song.mp3
There are two problems with this log,
song.mp3
)folder
(which should have been moved)Hi Chris,
With PR #24 in mind, I have added a dictionary collecting the destination paths so I can show the message move or skip for each file in the terminal. Please have a look.
Thanks, Toon
Hey Toon,
I still don't think it addresses the issue. The code added to move_file and move_dir should be in sorter.py as mentioned in my comments on #20, also, that skip will never be reached, because of the logic inside sorter.py.
Lastly global variables are a very bad idea, more reason to lift that code to where it's important. I.e. sorter.py
Please create a pull request so that I may respond to individual changes.
Hey Toon,
Why don't you move the code to sorter.py
.
Everything you're doing can be done in sorter.py
.
You won't need to pass print_changes
and keep global variables.
#inside PySorter.__init__()
self.files = {}
self.teller = 0
#inside PySorter.process()
if not self.do_print_changes:
if fs.is_file(path):
fs.move_file(path, dst)
else:
fs.move_dir(path, dst)
else:
if dst in self.files.itervalues():
print("skip {}".format(src))
else:
self.files[self.teller] = dst
self.teller += 1
print("move {} --> {}".format(src,dst))
I've gone ahead and made these changes myself
Behaviour
When the dry-run flag
-n
is given then no changes should be made to the filesystem. Instead a printout of all the changes should occur.For a clean implementation this will require wrapping
os.walk
and someos.path
functions, possible in a thin wrapper class.