ConfusedPolarBear / intro-skipper

Fingerprint audio to automatically detect and skip intro sequences in Jellyfin
GNU General Public License v3.0

[Feature] Analyze chapter data to detect intros and outros #58

Closed themegaphoenix closed 1 year ago

themegaphoenix commented 2 years ago

Describe the feature you'd like added

Some videos already have chapter metadata. Using the chapters, scanning could be sped up either by looking for a certain pattern in the chapter names (e.g. Intro, OP, ED) or by cutting down the analysis length by filtering the chapters. For example, we know that the plugin currently looks for an intro located within the first 25% of an episode or the first 10 minutes, whichever is smaller. Let's imagine a file with 3 chapters that begin within the first 25% of an episode:

  1. The first chapter would cover 10% of the episode (0-10%)
  2. The second chapter would cover the next 10% (10-20%)
  3. The third chapter would cover the next 30% (20-50%)

Now, the intro would most likely be in the second chapter, after the prologue. By scanning that portion first, we have reduced the scanning from 25% of the episode to 10%. If no intro is detected in the second chapter, the plugin should then check the remaining chapters. At best, we reduce the scanning and fingerprinting time. At worst, we gain no performance and add a small delay for reading the chapter information, which would be almost negligible compared to fingerprinting audio.
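
The ordering described above could be sketched roughly as follows. This is only an illustration, not the plugin's code: `analysis_window` and `scan_order` are hypothetical names, chapters are assumed to be (start, end) pairs in seconds, and the 25% / 10 minute limit is taken from the description in this issue.

```python
from typing import List, Tuple

Chapter = Tuple[float, float]  # (start, end) in seconds; an assumed representation


def analysis_window(duration: float) -> float:
    """End of the searched region: 25% of the episode or 10 minutes, whichever is smaller."""
    return min(duration * 0.25, 600.0)


def scan_order(chapters: List[Chapter], duration: float) -> List[Chapter]:
    """Return the chapter ranges to fingerprint, second chapter first."""
    limit = analysis_window(duration)
    # Keep only chapters that start inside the window and clip them to it.
    clipped = [(start, min(end, limit)) for start, end in chapters if start < limit]
    if len(clipped) < 2:
        return clipped
    # Try the second chapter first, then fall back to the others in order.
    return [clipped[1], clipped[0]] + clipped[2:]


# The example from this issue, scaled to a 1000 second episode.
order = scan_order([(0, 100), (100, 200), (200, 500), (500, 1000)], 1000)
print(order)
```

If the intro is found in the first returned range, only 100 of the 250 seconds in the window need to be fingerprinted.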

Additional context

ffprobe command to get the chapters: ffprobe -i fname -print_format json -show_chapters -loglevel error
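
As a rough sketch of how that output might be consumed, the snippet below runs the command from this issue and reads the chapter list. The helper names are hypothetical, and the JSON field names used here (`chapters`, `start_time`, `end_time`, `tags.title`) are assumptions about ffprobe's JSON output that should be checked against a real file.

```python
import json
import subprocess
from typing import List, NamedTuple


class Chapter(NamedTuple):
    title: str
    start: float  # seconds
    end: float    # seconds


def run_ffprobe(path: str) -> str:
    """Run the ffprobe command quoted in this issue and return its JSON output."""
    cmd = ["ffprobe", "-i", path, "-print_format", "json",
           "-show_chapters", "-loglevel", "error"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_chapters(ffprobe_json: str) -> List[Chapter]:
    """Turn ffprobe's JSON text into a list of Chapter tuples."""
    data = json.loads(ffprobe_json)
    chapters = []
    for entry in data.get("chapters", []):
        # Untitled chapters are assumed to have no "title" tag.
        title = entry.get("tags", {}).get("title", "")
        chapters.append(Chapter(title, float(entry["start_time"]), float(entry["end_time"])))
    return chapters
```

Usage would look like `parse_chapters(run_ffprobe("episode.mkv"))`, where `episode.mkv` is a placeholder path.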

Weevild commented 2 years ago

I think that this is a good concept, though I'd like to poke at some things. Firstly, not all first chapters will be a prologue and the second chapter might span a length of several minutes. The ideal solution in my honest opinion would be to check for the shortest chapter in the first half of the episode. The procedure would be something like this:

  1. Acquire the total runtime and divide by two.
  2. Check all chapters within the first half's time span for the one with the shortest runtime.
  3. Repeat this for all episodes and start analyzing those chapters' audio content.
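
The three steps above could be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes chapters are (start, end) pairs in seconds and treats "within the first half" as "ends at or before the midpoint".

```python
from typing import List, Optional, Tuple

Chapter = Tuple[float, float]  # (start, end) in seconds; an assumed representation


def shortest_in_first_half(chapters: List[Chapter], duration: float) -> Optional[Chapter]:
    """Return the shortest chapter that ends within the first half of the episode."""
    half = duration / 2  # step 1: total runtime divided by two
    candidates = [(start, end) for start, end in chapters if end <= half]
    if not candidates:
        return None
    # Step 2: pick the candidate with the shortest runtime.
    return min(candidates, key=lambda ch: ch[1] - ch[0])


# Step 3: repeat for every episode, then fingerprint the selected ranges.
season = {
    "S01E01": ([(0, 120), (120, 210), (210, 700), (700, 1400)], 1400),
    "S01E02": ([(0, 95), (95, 185), (185, 690), (690, 1380)], 1380),
}
to_analyze = {name: shortest_in_first_half(chs, dur) for name, (chs, dur) in season.items()}
print(to_analyze)
```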

All of this assumes that the media file has a chapter dedicated to the opening (the same approach could potentially be used for an outro as well). If the chapters are just Chapter A, Chapter B, Chapter C every 5 minutes, or if one of the early chapters simply contains the intro without marking it separately, then this whole idea falls apart.

Where I see this being especially useful is with anime. You'd have to really look to find an anime release without a chapter titled "Intro", "OP", "Opening", etc., since the opening sequence is essential to an anime; it's part of the product.

If the media file doesn't have a (short or titled) chapter that matches the opening audio fingerprint of other episodes, then just scan the rest of the first 25% or 10 min (whichever is less), as stated.

themegaphoenix commented 2 years ago
  • Check all chapters within the first half's time span for the one with the shortest runtime.

That would have a flaw because some releases have the Netflix/Prime intro logo. My solution is that the second chapter is statistically most likely to be the intro (especially for anime). But if it is not, then it would just search the first, the third, etc.

I guess there is no right or wrong answer; it will all depend on the method.

A smarter method would be a hybrid method which would combine both methods, with the default as the backup.

Anything that helps cut down the first 25% or 10 min as stated would result in a faster scan, which is an improvement.

Weevild commented 2 years ago

A smarter method would be a hybrid method which would combine both methods, with the default as the backup.

You mean like a profile? I don't know if there are any more distinguishable types of media other than anime and western media (perhaps there are definitive subgenres; I'm not well versed in this area). Perhaps K-dramas have a "structural type", etc. Anyhow, I could see an "anime profile" with its own specific, stand-alone parameters being a potential solution.

That would have a flaw because some releases have the Netflix/Prime intro logo

Not at all actually. As stated by the plugin's author: it must be "At least 15 seconds long". All of those Disney, Hulu, Netflix and others are definitely under 15 seconds.

My solution is that the second chapter is statistically most likely to be the intro (especially for anime)

You're probably right about this one. I skimmed through 5 or so shows in my library and the opening chapter was the second one. So the logic system could be:

  1. Check shortest chapter runtime in the first half
  2. Check if the second chapter is the shortest (these two steps could be done in any order)
  3. If they match tell the program that there's a very high probability of this chapter containing the intro sequence

Then again, in the case of anime the intro chapter is, without exaggeration, named as such in 98% of cases. So the first step could be to check chapter names.
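
A name check like the one suggested above could look roughly like this. The pattern and function name are hypothetical; the title list (intro, OP, opening) comes from this thread, and the match is deliberately narrow, so titles such as "OP1" would need the pattern extended.

```python
import re
from typing import List, Optional, Tuple

NamedChapter = Tuple[str, float, float]  # (title, start, end); an assumed representation

# Whole-word, case-insensitive match on the titles mentioned in this thread.
INTRO_TITLE = re.compile(r"\b(intro|op|opening)\b", re.IGNORECASE)


def find_named_intro(chapters: List[NamedChapter]) -> Optional[NamedChapter]:
    """Return the first chapter whose title looks like an intro, if any."""
    for chapter in chapters:
        if INTRO_TITLE.search(chapter[0]):
            return chapter
    return None


chapters = [("Prologue", 0.0, 60.0), ("OP", 60.0, 150.0), ("Part A", 150.0, 700.0)]
print(find_named_intro(chapters))
```

Matching whole words keeps a title such as "Operation Start" from being mistaken for "OP".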

themegaphoenix commented 2 years ago

That would have a flaw because some releases have the Netflix/Prime intro logo

Not at all actually. As stated by the plugin's author: it must be "At least 15 seconds long". All of those Disney, Hulu, Netflix and others are definitely under 15 seconds.

Oops, it seems I missed that.

My solution is that the second chapter is statistically most likely to be the intro (especially for anime)

You're probably right about this one. I skimmed through 5 or so shows in my library and the opening chapter was the second one. So the logic system could be:

  1. Check shortest chapter runtime in the first half
  2. Check if the second chapter is the shortest (these two steps could be done in any order)
  3. If they match tell the program that there's a very high probability of this chapter containing the intro sequence

Then again, in the case of anime the intro chapter is, without exaggeration, named as such in 98% of cases. So the first step could be to check chapter names.

In conclusion the best order is probably:

  1. Check the chapter names for intro, OP, etc
  2. Check shortest chapter runtime in the first half
  3. Check if the second chapter is the shortest (these two steps could be done in any order)
  4. If they match tell the program that there's a very high probability of this chapter containing the intro sequence
  5. Check the other chapters
  6. Default to the first 25% or 10 min, whichever is smaller
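
Put together, the order above might be sketched like this. It is only an illustration under several assumptions: the names are hypothetical, "other chapters" is limited to those ending in the first half, and chapters shorter than the 15 second minimum quoted earlier in this thread are skipped.

```python
import re
from typing import List, NamedTuple, Tuple


class Chapter(NamedTuple):
    title: str
    start: float  # seconds
    end: float    # seconds


INTRO_TITLE = re.compile(r"\b(intro|op|opening)\b", re.IGNORECASE)
MIN_INTRO_SECONDS = 15.0  # minimum intro length quoted in this thread


def candidate_ranges(chapters: List[Chapter], duration: float) -> List[Tuple[float, float]]:
    """Return (start, end) ranges to fingerprint, most promising first."""
    half = duration / 2
    early = [c for c in chapters
             if c.end <= half and c.end - c.start >= MIN_INTRO_SECONDS]
    # Step 1: chapters whose names look like an intro.
    ordered = [c for c in early if INTRO_TITLE.search(c.title)]
    if early:
        # Steps 2-4: the shortest early chapter, then the second chapter.
        # If they are the same chapter, it is simply ranked once.
        ordered.append(min(early, key=lambda c: c.end - c.start))
        if len(chapters) > 1 and chapters[1] in early:
            ordered.append(chapters[1])
    # Step 5: the remaining early chapters.
    ordered += early
    # Drop repeats while keeping the first (highest) position of each range.
    ranges = list(dict.fromkeys((c.start, c.end) for c in ordered))
    # Step 6: fall back to the default window.
    ranges.append((0.0, min(duration * 0.25, 600.0)))
    return ranges
```

A caller would fingerprint each range in turn and stop at the first one that matches the other episodes.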

Weevild commented 2 years ago

In conclusion the best order is probably:

  1. Check the chapter names for intro, OP, etc
  2. Check shortest chapter runtime in the first half
  3. Check if the second chapter is the shortest (these two steps could be done in any order)
  4. If they match tell the program that there's a very high probability of this chapter containing the intro sequence
  5. Check the other chapters
  6. Default to the first 25% or 10 min, whichever is smaller

Yes, this is how I would do it, but I'm not developing this plugin, so it would be nice if @ConfusedPolarBear could share some input as well.

Yankees4life commented 2 years ago

Some anime files do have chapters with the opening and the ending marked, so I share the same sentiment: use those to skip intros with this plugin instead of wasting an instance of ffmpeg analyzing the file.