mjclarke94 opened this issue 2 years ago
This guy did some work on syncing:
https://www.reddit.com/r/F1TV/comments/tggpe8/i_wrote_an_autosynchronizer_for_f1viewer/
I've used the English audio track to synchronize the streams by detecting the silence when the track switches. It's not perfect, but it seems close enough for 2021 and 2022.
I've been looking at other ways, and there seems to be a lack of metadata to work with, although I have found that the web viewer (which uses v1 APIs in Firefox) seems to do a decent job from what I can tell. The media does have a timing track of type id3v2_priv.com.apple.streaming.transportStreamTimestamp, but I cannot get the values to line up across tracks, and I suspect different streams use different clocks.
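For reference, Apple's HLS timed-metadata spec describes that PRIV payload as a 33-bit MPEG-2 PTS stored big-endian in 8 octets on a 90 kHz clock, so decoding a single value is straightforward even if the clocks differ per stream. A minimal sketch, assuming the 8-byte frame body has already been extracted from the ID3 tag:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// ptsToDuration converts the PRIV payload to a duration on that stream's clock.
func ptsToDuration(payload []byte) time.Duration {
	pts := binary.BigEndian.Uint64(payload) & 0x1FFFFFFFF // keep the 33 significant bits
	return time.Duration(pts) * time.Second / 90000       // 90 kHz timebase
}

func main() {
	payload := []byte{0, 0, 0, 0, 0x00, 0x0D, 0xBB, 0xA0} // 900000 ticks
	fmt.Println(ptsToDuration(payload))                   // 10s
}
```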
Another idea I've seen elsewhere is putting the sync information in a JSON file and using that. I've written my code in a way that should be able to check for a JSON file and pull from there before using the silence detection, if we decide this is a decent approach.
Do you have an example of a really poorly synced video I can use? I think I finally have a method for matching them up with metadata.
Before the 2020 season there is no timing data to use; 2020 and beyond have a timing track that can be used to calculate the offset. For the earlier seasons, I think the best we can hope for is the audio technique I have in place now. Season 2018 does not work as the code is written because there are 3 audio tracks marked ENG: 1 = fx, 2 = team radio, 3 = the announcements. Season 2019 has almost the same issue with 2 ENG audio tracks: 1 = fx, 2 = the announcements. Selecting FRE (falling back to FRA) would fix these issues, and it appears to be the most common language besides English.
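If the language tags survive ffmpeg's HLS demuxing (an assumption worth verifying per season), the metadata stream specifier could select the track by language instead of by position, along the lines of:

```
ffmpeg -i <url> -map 0:a:m:language:fra -f null -
```

with fra swapped for fre where needed, since sources vary between the two ISO 639-2 codes for French.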
The other issue with the detection now is that it fetches the stream 10 seconds at a time. This should be rewritten to run for a maximum of 1 minute and to kill the command once the first audio silence ends.
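A rough sketch of that rewrite, assuming ffmpeg's silencedetect filter drives the detection (the noise/duration thresholds below are placeholders, and the track selection is left as a TODO):

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
	"time"
)

// firstSilenceEnd returns the position (in seconds) where the first silence ends.
func firstSilenceEnd(url string) (float64, error) {
	// hard ceiling of one minute on the whole fetch
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()

	cmd := exec.CommandContext(ctx, "ffmpeg",
		"-i", url,
		"-map", "0:a:0", // TODO: select the announcements track, not just the first
		"-af", "silencedetect=noise=-50dB:d=1",
		"-f", "null", "-")
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return 0, err
	}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	defer cmd.Process.Kill() // stop fetching as soon as we have what we need

	scanner := bufio.NewScanner(stderr)
	for scanner.Scan() {
		// silencedetect logs lines like: "... silence_end: 12.34 | silence_duration: 3.2"
		if i := strings.Index(scanner.Text(), "silence_end: "); i >= 0 {
			fields := strings.Fields(scanner.Text()[i+len("silence_end: "):])
			if len(fields) > 0 {
				return strconv.ParseFloat(fields[0], 64)
			}
		}
	}
	return 0, fmt.Errorf("no silence detected within the time limit")
}

func main() {
	end, err := firstSilenceEnd("https://example.com/stream.m3u8") // placeholder URL
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("first silence ends at %.3fs\n", end)
}
```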
As for newer seasons, the metadata from the timed_id3 track can be used. This method should be significantly faster, since far less data has to be fetched, and it should also be more accurate. The timestamps come at consistent intervals, so reading the first one and calculating the delta gives the offset needed. I plan to use a 2D array of unique identifiers (perhaps the URLs) and timestamps, keep track of the newest timestamp, and calculate each offset from that time.
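A sketch of that bookkeeping, with a map keyed by URL standing in for the 2D array (the URLs below are placeholders):

```go
package main

import (
	"fmt"
	"time"
)

// offsets computes each stream's lag behind the newest first-timestamp seen.
func offsets(firstTimestamps map[string]time.Duration) map[string]time.Duration {
	var newest time.Duration
	for _, ts := range firstTimestamps {
		if ts > newest {
			newest = ts
		}
	}
	out := make(map[string]time.Duration, len(firstTimestamps))
	for url, ts := range firstTimestamps {
		out[url] = newest - ts // how far this stream trails the newest one
	}
	return out
}

func main() {
	fmt.Println(offsets(map[string]time.Duration{
		"https://example.com/f1live.m3u8": 55*time.Minute + 9*time.Second,
		"https://example.com/sainz.m3u8":  55*time.Minute + 5440*time.Millisecond,
	}))
}
```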
Extracting the information from the data stream works with this ffmpeg command:
```
ffmpeg -i <url> -map 0:d:1 -dcodec copy -fs 64 -f data pipe:1 -loglevel error
```
Ideally, I'd like to find a command that takes the -map 0:d:1 out and instead selects the first timed_id3 data track, or fails. Upon failure, fall back to the silence detection used today; the above should work for 2020 through 2022.
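A sketch of that selection logic, probing the data streams with ffprobe first (timed_id3 is the codec name ffmpeg reports for these tracks; the URL is a placeholder):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// probeOutput matches the shape of ffprobe's -of json output.
type probeOutput struct {
	Streams []struct {
		Index     int    `json:"index"`
		CodecName string `json:"codec_name"`
	} `json:"streams"`
}

// timedID3Index returns the absolute index of the first timed_id3 stream.
func timedID3Index(url string) (int, error) {
	out, err := exec.Command("ffprobe",
		"-v", "error",
		"-select_streams", "d",
		"-show_entries", "stream=index,codec_name",
		"-of", "json", url).Output()
	if err != nil {
		return 0, err
	}
	var probe probeOutput
	if err := json.Unmarshal(out, &probe); err != nil {
		return 0, err
	}
	for _, s := range probe.Streams {
		if s.CodecName == "timed_id3" {
			return s.Index, nil
		}
	}
	return 0, fmt.Errorf("no timed_id3 stream found")
}

func main() {
	idx, err := timedID3Index("https://example.com/stream.m3u8")
	if err != nil {
		// fall back to the audio-based silence detection here
		fmt.Println(err)
		return
	}
	fmt.Printf("use: ffmpeg -i <url> -map 0:%d -dcodec copy -fs 64 -f data pipe:1\n", idx)
}
```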
I've been working on this more, and it seems better to delay the launching based on the time delta; this way it will also work for live streams. I'm going to use goroutines to run all the retrievals in parallel so that the timestamps are captured as close together as possible.
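A sketch of the parallel retrieval, where fetchFirstTimestamp is a hypothetical stand-in for whichever per-stream retrieval (id3 or silence) ends up being used:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// collectTimestamps launches one goroutine per stream so the first timestamps
// are read as close together as possible, then gathers the results.
func collectTimestamps(urls []string,
	fetchFirstTimestamp func(string) (time.Duration, error)) map[string]time.Duration {

	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string]time.Duration, len(urls))
	)
	for _, url := range urls {
		wg.Add(1)
		go func(url string) {
			defer wg.Done()
			ts, err := fetchFirstTimestamp(url)
			if err != nil {
				return // no usable timing data for this stream
			}
			mu.Lock()
			out[url] = ts
			mu.Unlock()
		}(url)
	}
	wg.Wait()
	return out
}

func main() {
	fake := func(url string) (time.Duration, error) { return time.Second, nil }
	fmt.Println(collectTimestamps([]string{"a", "b"}, fake))
}
```

The per-stream launch could then sleep for its computed delta before starting each player.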
I'm also working on improving the audio detection as discussed in the previous comment. I need to monitor the command's output for the end of silence, and also select the English announcements track, as it is the most reliable indicator of start time.
There's already a PR https://github.com/SoMuchForSubtlety/f1viewer/pull/215 trying to achieve this 🚀
Yes, I wrote it.
I'm not happy with using just audio to synchronize as some streams actually start after the audio, so it won't work for those.
The other issue is that video players sometimes don't seem to honor sub-second accuracy, so sleeping prior to launch seems to work better in my testing.
There is also the issue of seeking to time 0 in a live stream - that seems to cause errors.
I'm not sure perfect synchronization is possible without an overall controlling application that monitors the streams' timing data. Launch times of the streams vary: F1 Live vs. driver-view start times differ by up to a second. And since sub-second seeking doesn't really work well in video players, timing the launch seems the way to go, but the variation in launch time makes that solution challenging without players that accept external control. It could perhaps be accomplished on Linux using mpv's socket control, or by using signals to pause/start execution.
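For the mpv route, a minimal sketch of the socket side, assuming each player is started with --input-ipc-server=&lt;path&gt; (mpv's built-in JSON IPC). Starting every player paused and unpausing each on its own schedule would sidestep both the launch-time variation and the sub-second seeking problem:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// mpvCommand is the envelope mpv's JSON IPC expects, one JSON object per line.
type mpvCommand struct {
	Command []interface{} `json:"command"`
}

// setPause flips the pause property of the mpv instance behind socketPath.
func setPause(socketPath string, paused bool) error {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return err
	}
	defer conn.Close()
	msg, _ := json.Marshal(mpvCommand{Command: []interface{}{"set_property", "pause", paused}})
	_, err = conn.Write(append(msg, '\n'))
	return err
}

func main() {
	// assumes the player was started with: mpv --pause --input-ipc-server=/tmp/mpv1 <url>
	if err := setPause("/tmp/mpv1", false); err != nil {
		fmt.Println(err)
	}
}
```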
As an example of the variation, here is the timestamp data I have collected for 2022 Bahrain:
| Times | F1 Live view | Sainz view |
| --- | --- | --- |
| m3u8 start | 55:02.080 | 55:02.629 |
| Timestamp 1 | 55:09.000 | 55:05.440 |
| Timestamp 2 | 55:18.960 | 55:15.460 |
| TS1 - (TS2 - TS1) | 54:59.040 | 54:55.420 |
Timestamps seem to be inserted at regular intervals, so it stands to reason that the missing beginning timestamp is not needed: extrapolating back one interval gives the start time (which ffprobe indicates is 2 seconds, not 0). I would expect the times of these two streams to be off by 3.62 seconds, but in actuality they seem to be within a second of each other.
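To make the arithmetic explicit, here is the extrapolation from the table worked in Go, using the Bahrain numbers:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	live1 := 55*time.Minute + 9*time.Second          // F1 Live, Timestamp 1
	live2 := 55*time.Minute + 18960*time.Millisecond // F1 Live, Timestamp 2
	sainz1 := 55*time.Minute + 5440*time.Millisecond
	sainz2 := 55*time.Minute + 15460*time.Millisecond

	// with a fixed interval, the start-of-stream timestamp is TS1 - (TS2 - TS1)
	liveStart := live1 - (live2 - live1)     // 54:59.040
	sainzStart := sainz1 - (sainz2 - sainz1) // 54:55.420

	fmt.Println(liveStart - sainzStart) // 3.62s: the expected offset between the views
}
```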
It's difficult to get multiple streams properly synchronized when watching them together. I'm not sure what the best approach here would be, but I'm opening the issue for discussion's sake.