Open Fallenbagel opened 2 years ago
This sounds really cool, and I'd definitely be interested in incorporating audio fingerprinting into this project.
Initially, when I was researching options for this project, I looked at bw_plex (https://github.com/Hellowlol/bw_plex), which looks up intros on YouTube and uses those for comparison (I think it does both audio and video fingerprinting).
I think both visual and audio fingerprinting are valuable approaches, and if possible using both sounds like the best thing to do. If it's audio fingerprinting that's fastest, then it could be the "first pass" approach, and maybe in the majority of cases it will be enough.
I'm happy to help how I can. Maybe a good place to start would be to make the code more modular by making the fingerprint retrieval and comparison parts modules so they can be swapped out.
Yes using both visuals and audio sounds much better. This could ultimately increase the accuracy as well.
I will have to go over the code and see if I can make it modular. (I am a complete noob in Go, but I could learn, as I would love to help out with this project.)
There is a fork of the main matcher project which uses fingerprints: borovikv/matcher
I noticed that matcher doesn't have a license. I recommend opening an issue on the original repo and requesting that the author add one. You could include this link, https://choosealicense.com/, which has some tips on choosing a license.
In the context of integrating code from matcher into this project, it's necessary that the license is compatible with the license that this project uses (AGPL3).
"AGPLv3 is compatible with the following licenses: GPLv3, GPLv2 (or any later version), LGPLv3, LGPLv2.1 (or any later version), BSD license (2-clause and 3-clause), MIT license and Apache License 2.0."
I'm not sure about the fork though since it was made unlicensed, which puts it in a weird place (unlicensed means they didn't have permission to copy/modify it even though the source is there to read). Assuming the original project adds a compatible license the best thing would probably be to open an issue on the fork and ask them to add the same license that the original project has. That's just my best interpretation of what I've read though, not a lawyer lol.
I have created an issue in the main project for the license. Let's hope the creator replies and adds a compatible license.
Meanwhile, I'll try to look for another similar project. I have found some audio fingerprinting projects with an MIT license, but they work more like Shazam. However, since they are Python scripts utilizing audio fingerprints, I think that part could be incorporated into this project somehow, maybe. There is another one written in C# called soundfingerprinting. There's also a script for this project with documentation on how to get reliable intro timestamps (and mark chapters, which we could omit, and incorporate the fingerprints that are generated into this project). Another script in the same project does the intro detection automatically (the previous script has to be supplied with the intro.wav). That project is MIT-licensed as well!
Also, there is no issue button in the forked one of matcher.
Edit: There is this project as well. Although it does video fingerprinting (and is rather slow, as it resizes the whole videos), the intro detection is far more accurate. For example, the current project fingerprinted the wrong timestamps for the 6th season of the series House, where the skipping started too early; when I ran the same season with this, it produced the exact timestamps of the intro start and end.
In my research on this, it looks like a pretty commonly used tool to calculate the audio fingerprint is fpcalc, and there are wrappers for it in just about every major programming language. I don't think generating the fingerprints is going to be a challenge. The more difficult part is then determining from the fingerprints where the intros are located. Unfortunately, most examples are focused on comparing two identical pieces of audio, or determining if a piece of audio is a subset of another (like Shazam). I have some ideas about how to accomplish this, and I plan to start working on it next week, likely in Python for integration into this project.
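A minimal sketch of the "where is the shared intro" step, assuming each episode's fingerprint is already available as a list of 32-bit integers (for example, raw Chromaprint values). The function names and the `max_bit_errors` threshold are hypothetical, and this brute-force alignment is only one possible approach, not what this project does. Mapping indices back to seconds depends on the fingerprinter's settings and is left out.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two 32-bit fingerprint values."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def longest_shared_run(fp_a, fp_b, max_bit_errors=6):
    """Return (start_a, start_b, length) of the longest aligned run of similar values.

    Tries every relative offset between the two sequences and tracks the
    longest stretch of consecutive positions whose Hamming distance stays
    within max_bit_errors (an assumed, untuned threshold).
    """
    best = (0, 0, 0)
    for offset in range(-(len(fp_b) - 1), len(fp_a)):
        run_len = 0
        # i indexes fp_a; j = i - offset indexes fp_b
        start_i = max(0, offset)
        end_i = min(len(fp_a), len(fp_b) + offset)
        for i in range(start_i, end_i):
            j = i - offset
            if hamming(fp_a[i], fp_b[j]) <= max_bit_errors:
                run_len += 1
                if run_len > best[2]:
                    best = (i - run_len + 1, j - run_len + 1, run_len)
            else:
                run_len = 0
    return best


# Two toy "episodes" that share the values 100, 200, 300, 400.
episode_1 = [1, 2, 3, 100, 200, 300, 400, 9]
episode_2 = [7, 100, 200, 300, 400, 8, 5]
print(longest_shared_run(episode_1, episode_2, max_bit_errors=0))
```

The cost grows with the product of the two lengths, so in practice it would make sense to fingerprint only the first several minutes of each episode.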
Longer term, a Jellyfin plugin, or native integration, would require a rewrite in C# and probably moving away from a monitor service to creating a jellyfin-intro chapter in the video that could trigger a skip-intro button in the player.
For the short term, I would just be thrilled to have an accurate, quick-running solution today.
Awesome. Yeah I'm basically open to taking this project in any direction that's a sensible balance of accuracy and speed.
Regarding a jellyfin plugin, I agree that it is the ideal way to integrate the intro data into jellyfin. The auto skipping script is a stop-gap and I look forward to being able to move past it. I think the first step related to that is implementing server-side support in the database and server API.
We have discussed implementation options in the jellyfin matrix chat room and floated a few ideas. One popular option is to add support for parsing EDLs (https://kodi.wiki/view/Edit_decision_list) and have a structure that can store those events and send them to clients. Then the intro data would use either a type defined by the Kodi EDL spec or a new type.
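As a rough illustration of the EDL idea, here is a sketch of a parser for the simplest form of a Kodi EDL file: one event per line, written as `start end action` with times in seconds. The `EdlEvent` class and `parse_edl` function are hypothetical names, not part of Jellyfin or Kodi, and the other time notations the Kodi wiki describes are not handled. The four action numbers are the ones the Kodi EDL page lists; an intro type would be an addition.

```python
from dataclasses import dataclass

# Action numbers as listed on the Kodi EDL wiki page.
ACTIONS = {0: "cut", 1: "mute", 2: "scene_marker", 3: "commercial_break"}


@dataclass
class EdlEvent:
    start: float  # seconds
    end: float    # seconds
    action: str


def parse_edl(text):
    """Parse 'start end action' lines; blank or malformed lines are skipped."""
    events = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        start, end, action = float(parts[0]), float(parts[1]), int(parts[2])
        events.append(EdlEvent(start, end, ACTIONS.get(action, "unknown")))
    return events


sample = "0.0 65.0 3\n336.0 401.5 0\n"
for event in parse_edl(sample):
    print(event)
```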
But in the short term, I don't want to get too far ahead of where we are now. My current focus is on producing reliable data, and there's plenty to be desired there. But as I said, I invite any suggestions, discussions, and code contributions. 🙂
@Fallenbagel can you test season six of house again with the latest version? You're talking about House MD, right? I only have season 1 and it also wasn't processed correctly. With the changes from #8 that I just merged it finds the correct timestamps for season 1
@mueslimak3r I tried to test it, but since the new update I haven't been able to run the script. I keep getting an error: FileNotFoundError: [Errno 2] No such file or directory: '/mnt/storage/introskip/tv-intro-detection/config/data/fingerprints/rclonemediaSeriesMoonKnight2022imdbidtt10234724Season01S01E01TheGoldfishProblemWEBDL1080pmkv/fingerprint.txt' and then the script terminates itself.
And if I remove Moon Knight, it tries another series and gets the same error. It only seems to happen when I run jellyfin.py. Then I thought it might be a path map problem, but the path seems to be correct as well. Then I ran decode.py and that seems to find the fingerprints. It only happens with jellyfin.py.
EDIT: I just downloaded the fork from Cookie-Monster-Coder, which has a version that is 3-4 days old, and that seems to be working correctly. However, when I run this script I get that error.
EDIT 2: I just tested with a fresh clone again (and one more time on another machine). It seems that whenever I run the script, it deletes the fingerprints folder. The jellyfin_cache folder remains, but it is not populated either.
Did you update path_map.py to use :: instead of : to delimit the two halves of each mapping? I changed it to allow for Windows path compatibility.
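To show why a double colon helps with Windows paths, here is a small sketch. The entry format (`server_path::local_path` as a single string) and the `parse_mapping` helper are assumptions made for illustration; the actual layout of path_map.py may differ.

```python
def parse_mapping(entry, delimiter="::"):
    """Split one 'server_path::local_path' entry into its two halves."""
    source, sep, target = entry.partition(delimiter)
    if not sep or not source or not target:
        raise ValueError(f"expected 'source{delimiter}target', got {entry!r}")
    return source.strip(), target.strip()


# A Windows drive letter already contains ':', so a single-colon split is ambiguous.
print("C:\\media\\tv:/mnt/media/tv".split(":"))   # three pieces, not two
print(parse_mapping("C:\\media\\tv::/mnt/media/tv"))
```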
Yes.
EDIT: Okay, I might have mapped it wrong with the rclone files, as I did not map it to the absolute path. It's working now. However, does the script no longer skip paths that are not mapped? Previously I ran the script excluding any cloud media, since it would be slow and I wanted to avoid the API limit. Now it seems to refuse to skip over media that is not mapped, so I can't test House MD, because it seems to start with cloud files first. (I have House MD locally on my hard drive, and I used to restrict the run to it by mapping only the local path.)
I'll look into that. If files can't be found then it should skip them, and if that's the case for an entire season then it should skip it altogether and never call decode.py. I'll push some changes in like an hour.
It's very possible that when I migrated from os.path to pathlib Path I messed up an if not path.exists() check.
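The intended skipping behaviour could look roughly like this with pathlib. This is only a sketch: `existing_episodes` and `seasons_to_process` are hypothetical helpers, and the dict-of-seasons input is an assumption, not the script's real data structure.

```python
from pathlib import Path


def existing_episodes(paths):
    """Keep only the episode files that are actually reachable on disk."""
    return [Path(p) for p in paths if Path(p).exists()]


def seasons_to_process(seasons):
    """Drop missing files, and drop a season entirely if none of its files exist.

    seasons maps a season name to a list of file paths (assumed structure).
    """
    result = {}
    for name, paths in seasons.items():
        found = existing_episodes(paths)
        if found:  # an empty list means the whole season is skipped
            result[name] = found
    return result
```

With this shape, unmapped cloud media would simply resolve to paths that do not exist and drop out before any decoding is attempted.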
Ah. I did notice that.
Upon testing what I could for now, which is Batman: The Animated Series, it's now detecting the intro to start at 0:05:36 and end at 0:06:41 for some episodes, whereas the actual intro starts at 0:00:00 and ends around 0:01:05. At first I thought it was just one mistake, but it seems like every other episode is being misidentified 🤔 (some episodes seem to be identified correctly). I checked out the timestamps and there doesn't seem to be any identical footage there. Let me run the script again to confirm whether the same thing happens.
EDIT: Nope, the same thing is happening. I wonder why 🤔 It's literally alternating: the 1st episode has the wrong timestamps, the 2nd has the correct ones, the 3rd wrong, the 4th correct, the 5th wrong, the 6th correct, etc.
Season 1 Season 2
with S01E01 & E02 as examples, is it reprocessing either of those?
I've seen this before too where it alternates between two almost identical times, but it was when testing some code that didn't end up working.
edit: I copied our discussion into a new issue so this one won't go off topic
second edit: I hid the 6 comments where we discussed the incorrect timestamps thing that was moved to its own issue
@mueslimak3r I found someone working on a tv-skipper plugin with audio fingerprinting under a GPL license. Just thought you might find it interesting, though his code is in C#, not Python.
If you are worried about eventually turning your Python code into a plugin, I might suggest you take a look at IronPython as a possible solution.
Is your feature request related to a problem? Please describe. I was wondering whether it would be better to use audio fingerprinting instead of the hash method, since it would be faster.
Describe the solution you'd like I have tested an audio fingerprinting method and it seems to identify the intros in less than a minute and seems to be fairly accurate. In addition, it also solves the issue of the shows with different intro visuals but same audio.
There is an audio fingerprinting script written in Go called matcher, and I used it after extracting the first 10 minutes of each video file to find the timestamps. What took 8 minutes with hash matching was reduced to less than a minute for a whole show. I have looked into incorporating the matcher script into the current Python script and have considered two methods: either reverse-engineering it (which sounds more complicated) or the easier method of calling the Go script from the current Python script in place of the image hashing.
Just a suggestion to go over, no pressure c: