chriscz / pysorter

A command line utility for organizing files and directories according to regex patterns.
Mozilla Public License 2.0
44 stars 17 forks source link

Implement a dry-run flag #14

Closed chriscz closed 7 years ago

chriscz commented 8 years ago

Behaviour

When the dry-run flag -n is given then no changes should be made to the filesystem. Instead a printout of all the changes should occur.

For a clean implementation this will require wrapping os.walk and some os.path functions, possible in a thin wrapper class.

toondegryse commented 8 years ago

I am interested in solving this problem

chriscz commented 8 years ago

Hi @toondegryse,

The way I would go about it is abstracting all the file / directory methods to a single class. Then making two implementations of the class:

  1. one for working on the actual filesystem,
  2. and one that has a fake view of the actual filesystem (it would keep track of any changes that have been made to the filesystem, instead of really applying the changes. A type of journaling mechanism)

Implementing the second one is what will be the tricky part, as you must keep track of how the real filesystem has changed.

Using a dictionaries and sets may be useful here. Here's my idea of the base class:

class Filesystem(object):
    def walk(path):
        """
            Same behaviour as os.walk
        """
        pass

    def move(path, destination):
        """
            Same behaviour as the two separate move_file and move_directory functions
        """
        pass

    def open(path, mode='r'): 
        """
            Returns an open filehandle for the file. This needn't be implemented 
        """
        pass

    def list(path): 
        """
            Returns a tuple (dirs, files) for the given directory path
        """
        pass

    def exists(path):
        """
            Does this path exist?
        """

    def is_file(path):
        """
            Is the given path a file?
        """

    def is_dir(path):
        """
            Is the given path a directory?
        """
toondegryse commented 8 years ago

Hi Chriscz,

There was no need to abstract the folder operations in order to add the -n feature. I've added the feature on https://github.com/toondegryse/pysorter. Please have a look before I pull it without abstracting the folder operations the Filesystem class, which I could still do.

Thanks, Toon

chriscz commented 8 years ago

Hi Toon,

I'm don't think that's going to work, since imagine the scenario where you do a recursive sort and a directory would have been moved. During the dry-run, the directory would not be moved, and so the recursive traversal will enter the directory.

There is an alternative to abstracting, you would need to keep track of all the directories that have been moved, and exclude them from os.walk. This is a simpler solution than the above.

You will need to write testcases that cover all these scenarios. I'd also recommend looking at the comments on PR #20.

chriscz commented 8 years ago

Whoops, clicked close by accident. Basically we can keep your solution, just slightly modify it, addressing the changes I suggested in PR #20 and removing the directories from os.walk (you can repurpose self.no_recurse). This should be a simple fix.

toondegryse commented 8 years ago

Great, I will have a look!

toondegryse commented 8 years ago

Hi Chris,

Could you clarify what you mean be removing the directories? Is following output what you would like to see or not?

move asong.mp3 --> /output/audio/asong.mp3 move meconflict.jpg --> /output/images/jpg/meconflict.jpg move conflictsong.mp3 --> /output/audio/conflictsong.mp3

I have implemented the changes suggested in PR #20 and added a test in test_sorter.py

Thanks! Toon

chriscz commented 8 years ago

So consider the following scenario. The file / folder structure:

tosort/folder/movie.mp4
tosort/song.mp3
tosort/anotherdir/song.mp3

The rules

# filetypes.py
('[^|/]folder/$', 'directories/'),
('\.mp3$', 'music/'),
('\.mp4$', 'movie/')

The command: pysorter -r -p -l -f filetypes.py -d output

The log we see on screen (skip is included here for clarity)

move folder --> output/directories/folder
move song.mp3 --> output/music/song.mp3
skip anotherdir/song.mp3

However if I understood your implementation correctly, then executing the command: pysorter -r -p -n -f filetypes.py -d output

I would see:

move folder --> output/directories/folder
move folder/movie.mp4 --> output/movie/movie.mp4
move song.mp3 --> output/music/song.mp3
move anotherdir/song.mp3 --> output/music/song.mp3

There are two problems with this log,

  1. It shows that a file would be overridden (song.mp3)
  2. It recursively enters folder (which should have been moved)
toondegryse commented 8 years ago

Hi Chris,

With PR #24 in mind, I have added a dictionary collecting the destination paths so I can show the message move or skip for each file in the terminal. Please have a look.

Thanks, Toon

chriscz commented 8 years ago

Hey Toon,

I still don't think it addresses the issue. The code added to move_file and move_dir should be in sorter.py as mentioned in my comments on #20, also, that skip will never be reached, because of the logic inside sorter.py.

Lastly global variables are a very bad idea, more reason to lift that code to where it's important. I.e. sorter.py

Please create a pull request so that I may respond to individual changes.

zeerorg commented 8 years ago

Hey Toon, Why don't you move the code to sorter.py. Everything you're doing can be done in sorter.py. You won't need to pass print_changes and keep global variables.

#inside PySorter.__init__()
self.files = {}
self.teller = 0

#inside PySorter.process()
    if not self.do_print_changes:
        if fs.is_file(path):
            fs.move_file(path, dst)
        else:
            fs.move_dir(path, dst)

    else:
        if dst in self.files.itervalues():
            print("skip {}".format(src))
        else:
            self.files[self.teller] = dst
            self.teller += 1
            print("move {} --> {}".format(src,dst))
chriscz commented 7 years ago

I've gone ahead and made these changes myself