letorbi / synoindexwatcher

An automated media index updater for Synology DiskStations.
GNU General Public License v3.0

.SYNOPPSDB necessary? #14

Closed pmcl77 closed 4 years ago

pmcl77 commented 4 years ago

Hi

I managed to exclude .sync directories (those are from resilio sync that I use to sync and backup my photos to the Diskstation) by adding this piece of code in the appropriate section:

# exclude .sync directories
if os.path.basename(filepath) == b".sync":
    return False

However, when adding photos, I notice in the log that files like the following are being indexed:

2019-11-16 22:17:09,950 INFO synoindex -a /volume2/homes/phil/photo/.SYNOPPSDB-journal
2019-11-16 22:17:10,000 INFO synoindex -a /volume2/homes/phil/photo/.SYNOPPSDB

So I am not sure whether these could simply be excluded as well, since they are not media files themselves. Or is indexing them necessary?

thanks

pmcl77 commented 4 years ago

For now I have added another few lines to exclude those .synoppsdb files from being indexed:

if os.path.basename(filepath).startswith('.'):
    return False

So is_allowed_path looks like this now:

def is_allowed_path(filepath, is_dir = True):
    # Don't check the extension for directories
    if not is_dir:
        ext = os.path.splitext(filepath)[1][1:].lower()
        if ext in excluded_exts:
            return False
    if os.path.basename(filepath).startswith('.'):
        return False
    if os.path.basename(filepath) == b"@eaDir":
        return False
    # exclude .sync directories
    if os.path.basename(filepath) == b".sync":
        return False
    return True

PS: sorry, apparently the newlines are not recognized when using the code block here.

letorbi commented 4 years ago

Thanks for the fix! Would you mind creating a pull request? That way your work would actually be attributed to you in the git history.

PS: I've fixed the newline-issue in your previous post ;)

pmcl77 commented 4 years ago

Hi

I experimented a little and made some more changes. I also included the whitelist from the original script again.

I noticed that using basename only works when the current path ends exactly with that name, so I tried find() instead, because for hidden folders like @eaDir the name will be somewhere in the middle of the path string for each file.

def is_allowed_path(filepath, is_dir = True):
    # Don't check the extension for directories
    if not is_dir:
        ext = os.path.splitext(filepath)[1][1:].lower()
        if ext in excluded_exts:
            return False
        if ext not in allowed_exts:
            return False
    if os.path.basename(filepath).startswith('.'):
        return False
    if not filepath.find("/@eaDir")  == -1:
        return False
    # exclude .sync directories
    if not filepath.find("/.sync")  == -1:
        return False
    if not filepath.find("/.stfolder")  == -1:
        return False
    #if os.path.basename(filepath) == b".sync":
    #    return False
    return True

Maybe there is a sleeker way to exclude all hidden files and folders with just one line. I tried if not filepath.find("/.") == -1 but I think it did not work for some reason.

I just saw I still have this one in there, would have to test if that one was working: if os.path.basename(filepath).startswith('.'):

As it is now I think I might be doing the same thing multiple times!?
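One way to express "exclude everything hidden" in a single check is to look at every component of the path instead of searching for substrings. The following is only a minimal sketch, not the project's code: `has_hidden_component` is a hypothetical helper, and it assumes paths are passed as `bytes` with `/` separators, as the `b".sync"` comparison above suggests.

```python
def has_hidden_component(filepath, prefixes=(b".", b"@")):
    """Return True if any component of filepath starts with one of prefixes.

    Hypothetical helper; assumes filepath is a bytes path using "/" separators.
    """
    parts = filepath.split(b"/")
    # Skip empty components (leading "/" or doubled slashes) and "." / "..".
    return any(
        part.startswith(prefixes)
        for part in parts
        if part not in (b"", b".", b"..")
    )


print(has_hidden_component(b"/volume2/homes/phil/photo/.SYNOPPSDB"))
print(has_hidden_component(b"/volume2/photo/IMG_1.jpg"))
```

Checking whole components avoids matching a `.` or `@` that merely appears in the middle of a file name.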

letorbi commented 4 years ago

Just checking the result of basename() should be enough, because is_allowed_path() should be called for every folder and file in the watched directory-tree. So if ./foo/bar/unwanted has returned false, the path ./foo/bar/unwanted/baz should not be watched at all.
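The pruning behaviour described here can be illustrated with `os.walk`. This is a sketch of the idea only and not how inotifyrecursive is implemented; `allow` and `watchable_dirs` are hypothetical names, and plain `str` paths are used for brevity.

```python
import os
import tempfile


def allow(path):
    """Hypothetical filter: reject names starting with '.' or '@'."""
    return not os.path.basename(path).startswith((".", "@"))


def watchable_dirs(root, allow):
    """Yield directories under root, never descending into rejected ones."""
    for dirpath, dirnames, _filenames in os.walk(root):
        # Editing dirnames in place tells os.walk to skip those subtrees.
        dirnames[:] = [d for d in dirnames if allow(os.path.join(dirpath, d))]
        yield dirpath


# Build a small tree: foo/bar/@eaDir/baz should never be visited.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "foo", "bar", "@eaDir", "baz"))
os.makedirs(os.path.join(root, "foo", "bar", "album"))

found = sorted(os.path.relpath(d, root) for d in watchable_dirs(root, allow))
print(found)
```

Because the rejected directory is dropped before the walk descends, the filter only ever needs to look at the last path component.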

Regarding the whitelist: I removed it intentionally, because I think for an inexperienced user it is better to have some "false positives" in the media list than to miss some wanted media files. But my plan is to re-implement a whitelist once Synoindex Watcher has a config file (see #11).

Thus my untested suggestion for a solution would be:

def is_allowed_path(filepath, is_dir = True):
    name = os.path.basename(filepath)
    # Don't watch hidden files and folders
    if name.startswith(b'.'):
        return False
    # Don't watch special files and folders
    if name.startswith(b'@'):
        return False
    # Don't check the extension for directories
    if not is_dir:
        ext = os.path.splitext(filepath)[1][1:].lower()
        if ext in excluded_exts:
            return False
    return True

This should block all files and folder starting with . (hidden) or @ (special). Would be great if you could test if this works as expected.

If ./foo/bar/unwanted/baz is watched, this would actually be a bug in the inotifyrecursive package. Please file a bug there. Inotifyrecursive is also maintained by me and its main purpose is to act as a support package for Synoindex Watcher.

pmcl77 commented 4 years ago

Thanks, that's way sleeker and I think (so far) it is working.

I quickly synced some photos to my photo folder and it indexed the image correctly, and no @eaDir showed up in the log. There was a warning that .SYNOPPSDB is not an allowed path, so that is fine as well. I am just curious why there is no warning for the @eaDir folder: even manually copying a file there does not trigger the warning. Any idea why this could be? (The copy in @eaDir was not indexed either, so it seems to be working.)

I will check the logs after a few days to see if anything unwanted is there.

I understand your point about only having a black-list. I will probably keep the whitelist as I usually know what filetypes I send to my photo folder (I only use it for photo station because I use resilio sync to upload the photos from my mobile and computer and it does not trigger indexing).

letorbi commented 4 years ago

Well, I've just tested the code and actually there seems to be a bug in inotifyrecursive, because ./foo/bar/unwanted/baz is actually watched. I've filed a bug-report for that.

letorbi commented 4 years ago

I've implemented my solution and also fixed the bug in the inotifyrecursive package. Thanks again for reporting in general and especially for the hint about the strange debug-logs.

pmcl77 commented 4 years ago

Cool, do I have to update inotify somehow?

letorbi commented 4 years ago

AFAIK just updating Synoindex Watcher via:

sudo python -m pip install --upgrade https://github.com/letorbi/synoindexwatcher/archive/master.zip

should be enough, because pip updates the required dependencies as well.

pmcl77 commented 4 years ago

I stopped it, updated it, restarted... tested and got the following in the log

2019-11-21 23:09:21,534 INFO Process received SIGTERM signal
2019-11-21 23:10:07,801 ERROR An uncaught exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 121, in <module>
    start()
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 91, in start
    inotify.add_watch_recursive(b"/volume1/music", mask, is_allowed_path)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 111, in add_watch_recursive
    return self.__add_watch_recursive(path, mask, filter, name, -1, False)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 74, in __add_watch_recursive
    wd = inotify_simple.INotify.add_watch(self, path, mask | flags.IGNORED | flags.CREATE | flags.MOVED_FROM | flags.MOVED_TO)
  File "/usr/lib/python2.7/site-packages/inotify_simple/inotify_simple.py", line 109, in add_watch
    return _libc_call(_libc.inotify_add_watch, self.fd, path, mask)
  File "/usr/lib/python2.7/site-packages/inotify_simple/inotify_simple.py", line 73, in _libc_call
    raise OSError(errno, os.strerror(errno))
OSError: [Errno 2] No such file or directory

After I corrected the media path (I use volume2 and 3) I still got an error:

2019-11-21 23:17:48,710 ERROR An uncaught exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 129, in <module>
    start()
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 98, in start
    inotify.add_watch_recursive(b"/volume3/photo", mask, is_allowed_path)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 111, in add_watch_recursive
    return self.__add_watch_recursive(path, mask, filter, name, -1, False)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 70, in __add_watch_recursive
    if filter != None and not filter(name, parent, True):
TypeError: is_allowed_path() takes at most 2 arguments (3 given)

Strangely enough, I cannot see that is_allowed_path() is called anywhere with 3 params.

just in case, those are my add_watches:

    inotify.add_watch_recursive(b"/volume3/photo", mask, is_allowed_path)
    inotify.add_watch_recursive(b"/volume2/homes/phil/photo", mask, is_allowed_path)
letorbi commented 4 years ago

This is strange, because the current implementation of is_allowed_path() actually takes three arguments and not two as the error message says. Are you sure that you're using the latest version of the synoindexwatcher package?

The inotifyrecursive package seems to be the latest version; otherwise the line if filter != None and not filter(name, parent, True): would not exist at all.

Could you run python2 -m pip list | grep -E "inotifyrecursive|synoindexwatcher" and paste the output here?
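For interpreters running Python 3.8 or newer, roughly the same information can be read from within Python. This is a sketch under that assumption (it does not apply to the Python 2.7 installation shown in this thread, where the pip command above remains the way to go); `describe` is a hypothetical helper name.

```python
import importlib.util
from importlib import metadata


def describe(package):
    """Return (installed version, file location) for a top-level package.

    Either value is None when it cannot be determined.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    spec = importlib.util.find_spec(package)
    location = spec.origin if spec is not None else None
    return version, location


for name in ("inotifyrecursive", "synoindexwatcher"):
    print(name, *describe(name))
```

The location is useful here because it shows which copy of the package the interpreter would actually import.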

letorbi commented 4 years ago

BTW: is_allowed_path() is used as a filter function, which is called from inotifyrecursive (that's why it is passed as the third argument of the inotify.add_watch_recursive() calls).

So the erroneous line if filter != None and not filter(name, parent, True): actually tries to call is_allowed_path(name, parent, True).
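The mismatch can be reproduced in isolation. In this sketch `call_filter` merely stands in for the inotifyrecursive line quoted above; the parameter names of both filter functions are assumptions chosen for illustration, not the packages' actual signatures.

```python
def call_filter(filter_fn, name, parent):
    """Stand-in for the quoted call site: filter(name, parent, True)."""
    return filter_fn is None or filter_fn(name, parent, True)


def old_style_filter(filepath, is_dir=True):
    """Two-parameter filter, like the versions posted earlier in this thread."""
    return not filepath.startswith((b".", b"@"))


def new_style_filter(name, parent, is_dir):
    """Three-parameter filter matching the call site (hypothetical names)."""
    return not name.startswith((b".", b"@"))


# Works: the filter accepts the three positional arguments it is given.
print(call_filter(new_style_filter, b"photo", -1))

# Fails: a two-parameter filter receives three positional arguments.
try:
    call_filter(old_style_filter, b"photo", -1)
except TypeError as exc:
    print("TypeError:", exc)
```

So the error does not require any direct three-argument call in the user's own script; the extra arguments come from the library that invokes the filter.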

pmcl77 commented 4 years ago

Well, I installed it with the commands provided. Remember, I also posted that I had issues with pip, but in the end I could install it. So unless you have changed the number of parameters within the last week, are the installation instructions pointing to an old version?

I tried to reinstall but now I get tons of errors when trying to ensure pip.

I downloaded from here to my pc and checked the main where it does take three params indeed... can I just copy over the main or will I miss something?

Or how can I get the installation to run smoothly?

pmcl77 commented 4 years ago

just did...


root@syno:~# wget https://bootstrap.pypa.io/get-pip.py
--2019-11-24 20:15:35--  https://bootstrap.pypa.io/get-pip.py
Resolving bootstrap.pypa.io... 2a04:4e42:9::175, 151.101.36.175
Connecting to bootstrap.pypa.io|2a04:4e42:9::175|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1775835 (1.7M) [text/x-python]
Saving to: 'get-pip.py'

get-pip.py          100%[===================>]   1.69M  10.3MB/s    in 0.2s

2019-11-24 20:15:36 (10.3 MB/s) - 'get-pip.py' saved [1775835/1775835]

root@syno:~# python get-pip.py
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting pip
  Using cached https://files.pythonhosted.org/packages/00/b6/9cfa56b4081ad13874b0c6f96af8ce16cfbc1cb06bedf8e9164ce5551ec1/pip-19.3.1-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 19.3.1
    Uninstalling pip-19.3.1:
      Successfully uninstalled pip-19.3.1
Successfully installed pip-19.3.1
root@syno:~# sudo python -m pip install --upgrade https://github.com/letorbi/synoindexwatcher/archive/master.zip
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting https://github.com/letorbi/synoindexwatcher/archive/master.zip
  Downloading https://github.com/letorbi/synoindexwatcher/archive/master.zip
     / 122kB 1.0MB/s
Requirement already satisfied, skipping upgrade: inotifyrecursive>=0.2.2 in /usr/lib/python2.7/site-packages (from synoindexwatcher==0.7.2) (0.2.2)
Requirement already satisfied, skipping upgrade: enum in /usr/lib/python2.7/site-packages (from inotifyrecursive>=0.2.2->synoindexwatcher==0.7.2) (0.4.7)
Requirement already satisfied, skipping upgrade: inotify-simple in /usr/lib/python2.7/site-packages (from inotifyrecursive>=0.2.2->synoindexwatcher==0.7.2) (1.1.8)
Requirement already satisfied, skipping upgrade: setuptools in /usr/lib/python2.7/site-packages (from enum->inotifyrecursive>=0.2.2->synoindexwatcher==0.7.2) (41.6.0)
Building wheels for collected packages: synoindexwatcher
  Building wheel for synoindexwatcher (setup.py) ... done
  Created wheel for synoindexwatcher: filename=synoindexwatcher-0.7.2-cp27-none-any.whl size=18232 sha256=349dbf201fe1122db59aeccb6ec4ff05b87b34299cdcbaf701c10021825a0ee4
  Stored in directory: /tmp/pip-ephem-wheel-cache-QlbyoC/wheels/6c/da/c6/36f4dd2b0973e25c12c7de70d941c369f4ad4e9e8f986e191d
Successfully built synoindexwatcher
Installing collected packages: synoindexwatcher
  Found existing installation: synoindexwatcher 0.7.2
    Uninstalling synoindexwatcher-0.7.2:
      Successfully uninstalled synoindexwatcher-0.7.2
Successfully installed synoindexwatcher-0.7.2

then after starting the log shows...


2019-11-24 20:23:31,909 ERROR An uncaught exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 124, in <module>
    start()
  File "/usr/lib/python2.7/site-packages/synoindexwatcher/__main__.py", line 93, in start
    inotify.add_watch_recursive(b"/volume1/music", mask, is_allowed_path)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 111, in add_watch_recursive
    return self.__add_watch_recursive(path, mask, filter, name, -1, False)
  File "/usr/lib/python2.7/site-packages/inotifyrecursive/inotifyrecursive.py", line 74, in __add_watch_recursive
    wd = inotify_simple.INotify.add_watch(self, path, mask | flags.IGNORED | flags.CREATE | flags.MOVED_FROM | flags.MOVED_TO)
  File "/usr/lib/python2.7/site-packages/inotify_simple/inotify_simple.py", line 109, in add_watch
    return _libc_call(_libc.inotify_add_watch, self.fd, path, mask)
  File "/usr/lib/python2.7/site-packages/inotify_simple/inotify_simple.py", line 73, in _libc_call
    raise OSError(errno, os.strerror(errno))
OSError: [Errno 2] No such file or directory
pmcl77 commented 4 years ago

Ok, my bad, forgot to change the path in the add_watch section... EDIT: I guess that config file would come in handy when updating the scripts...
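Until such a config file exists, the "No such file or directory" failure can at least be made easier to read by checking the watch paths before handing them to inotify. This is only a sketch: `existing_watch_paths` is a hypothetical helper, not part of Synoindex Watcher, and the list merely reuses example paths from this thread.

```python
import os


def existing_watch_paths(paths):
    """Split configured paths into (existing directories, missing ones)."""
    present = [p for p in paths if os.path.isdir(p)]
    missing = [p for p in paths if not os.path.isdir(p)]
    return present, missing


# Example paths taken from this thread; adjust them to the volumes in use.
configured = [b"/volume1/music", b"/volume3/photo"]
present, missing = existing_watch_paths(configured)
for path in missing:
    print("skipping missing watch path:", os.fsdecode(path))
```

Only the entries in `present` would then be passed on to `inotify.add_watch_recursive()`.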

I also tried to install everything in Python 3, which at first seemed not to work, but apparently I now have synoindexwatcher running in both Python 3 and Python 2.7, since the log gives me:


2019-11-24 21:13:28,522 INFO Watching for media file changes...
2019-11-24 21:15:10,885 INFO Watching for media file changes...
2019-11-24 21:15:45,374 INFO synoindex -a b'/volume2/homes/phil/photo/_phone_backup/camera_op6t/IMG_20191124_211528.jpg'
2019-11-24 21:15:45,374 INFO synoindex -a /volume2/homes/phil/photo/_phone_backup/camera_op6t/IMG_20191124_211528.jpg
2019-11-24 21:15:45,408 INFO synoindex -a /volume2/homes/phil/photo/_phone_backup/camera_op6t/IMG_20191124_211528.jpg
2019-11-24 21:15:45,408 INFO synoindex -a b'/volume2/homes/phil/photo/_phone_backup/camera_op6t/IMG_20191124_211528.jpg'

It seems python3 gives me a synoindex -a b, not sure why the files are added twice either.

To install in Python 3 I used the commands from the install documentation but replaced python with python3 everywhere. I also did that in /usr/local/etc/rc.d/S99synoindexwatcher.sh

By using /usr/local/etc/rc.d/S99synoindexwatcher.sh stop I can only seem to stop one of the instances running. I am not sure how to stop the other one.

I am sorry for the complete mess, I am not very experienced with python.

EDIT: Restarted my DS and now it is running only under Python 2.7 (as the S99 script is using the python command). So the b in -a b when using python3 seems to be a bug I would guess?

EDIT2: other than that, it seems to work great now. It is not giving the warnings about non valid path anymore :)

letorbi commented 4 years ago

> Ok, my bad, forgot to change the path in the add_watch section... EDIT: I guess that config file would come in handy when updating the scripts...

Yeah definitely. I'm working on this ;)

> So the b in -a b when using python3 seems to be a bug I would guess?

Yeah, that's a bug caused by a major change in string-encoding between Python 2 and 3. I'm on it.
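The effect is easy to reproduce in Python 3. The sketch below shows the general mechanism and one possible remedy (decoding with `os.fsdecode`); it is not necessarily how the project fixed it.

```python
import os

path = b"/volume2/homes/phil/photo/IMG_20191124_211528.jpg"

# In Python 3, formatting bytes with %s inserts their repr, including the b'' wrapper.
raw_message = "synoindex -a %s" % path
print(raw_message)

# Decoding with the filesystem encoding first yields the plain path.
clean_message = "synoindex -a %s" % os.fsdecode(path)
print(clean_message)
```

In Python 2 both lines would print the same text, because `bytes` and `str` are the same type there.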

letorbi commented 4 years ago

The string-encoding bug should be fixed with 270979692bcf0408885bb71a42a9ce0202532931. Just upgrade and Synoindex Watcher should work with Python 3 again.