Deduplication algorithm works incorrectly on smooth transitions

Athari commented 3 years ago

Deduplication works fine when non-duplicated frames are very distinct; however, it completely removes all smooth transitions (fades from frame to frame).

Let's say we have frames with "values": 1, 20, 40, 50, 100, 101, 102, 103, 104, ..., 129, 130, 200, 230, 300, and we set the frame difference threshold to 5.

Expected deduplicated sequence would be: 1, 20, 40, 50, 100, 105, 110, 115, 120, 125, 130, 200, 230, 300.

The sequence Flowframes produces is: 1, 20, 40, 50, 100, 200, 230, 300.

With the way the cycle in Dedupe.RemoveDupeFrames is implemented, every frame can remove all subsequent frames within the threshold, even if the frame itself has previously been removed by one of the preceding frames. This effectively removes smooth transitions entirely, whereas at worst they should only become choppy after deduplication.
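To make the effect concrete, here is a minimal Python sketch of the behavior as I describe it above. It is a simplified model, not the actual C# code from Dedupe.RemoveDupeFrames: frames are reduced to plain integers, the difference is just the absolute difference of those integers, and the function name is made up for this illustration.

```python
def dedupe_chained(values, threshold):
    """Simplified model of the described behavior (hypothetical, not the real code)."""
    removed = [False] * len(values)
    for i in range(len(values)):
        # Frame i is used as a reference even if an earlier frame already removed it.
        j = i + 1
        while j < len(values) and abs(values[j] - values[i]) < threshold:
            removed[j] = True
            j += 1
    return [v for v, r in zip(values, removed) if not r]


# The example sequence: distinct frames, a fade from 100 to 130, distinct frames.
frames = [1, 20, 40, 50] + list(range(100, 131)) + [200, 230, 300]
print(dedupe_chained(frames, 5))  # [1, 20, 40, 50, 100, 200, 230, 300]
```

Frame 100 removes 101 to 104, the already removed frame 101 then removes 105, and so on, so the whole fade disappears.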

If I understand correctly, the if (diff < threshold) path should increment both i and compareWithIndex (maybe the increment in the for loop should be removed to account for that). I haven't tested it, though.
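One way to read this suggestion is that each frame should be compared against the most recently kept frame rather than against frames that were already removed. A rough Python sketch under that assumption follows. Again, this is not Flowframes code: the integer "values", the absolute-difference metric, and the function name are all placeholders for illustration.

```python
def dedupe_against_last_kept(values, threshold):
    """Keep a frame only if it differs enough from the last kept frame (hypothetical)."""
    kept = []
    for v in values:
        # The first frame is always kept; later frames need diff >= threshold.
        if not kept or abs(v - kept[-1]) >= threshold:
            kept.append(v)
    return kept


# The example sequence: distinct frames, a fade from 100 to 130, distinct frames.
frames = [1, 20, 40, 50] + list(range(100, 131)) + [200, 230, 300]
print(dedupe_against_last_kept(frames, 5))
# [1, 20, 40, 50, 100, 105, 110, 115, 120, 125, 130, 200, 230, 300]
```

Because a removed frame never becomes a reference, the fade is thinned out to every fifth frame instead of being deleted entirely.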

n00mkrad commented 3 years ago

> With the way the cycle in Dedupe.RemoveDupeFrames is implemented, every frame can remove all subsequent frames within the threshold, even if the frame itself has previously been removed by one of the preceding frames.

This is intended; otherwise it would never delete multiple dupes in a row.

Also, this code only runs when "Accurate" deduplication is used; by default, deduplication happens during extraction.

Athari commented 3 years ago

My changes to the algorithm don't restrict removal of long sequences of actual duplicates. An actual sequence with duplicates would look like this:

1, 20, 40, 50, 100, 101, 98, 102, 100, 101, 101, 99, 200, 230, 300

This would translate to this:

1, 20, 40, 50, 100, 200, 230, 300

This is because all duplicates are within the threshold of the first frame of the run (100).

At the same time, it would preserve smooth transitions (well, it'll still make them choppy, but RIFE should be able to fix this for the most part).

There's a risk of running into a sequence like this, of course:

1, 20, 40, 50, 97, 101, 98, 102, 103, 101, 101, 99, 200, 230, 300

Which would leave both 97 and 103 in the sequence:

1, 20, 40, 50, 97, 103, 200, 230, 300

But even in this worst-case scenario, the problem can be fixed by a slight change to the threshold (5 to 6 in this case).

However, the current algorithm cannot preserve smooth transitions and detect duplicates at the same time.

n00mkrad / flowframes

Deduplication algorithm works incorrectly on smooth transitions #59