Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Alternate audio tracks #939

Open drequivalent opened 6 years ago

drequivalent commented 6 years ago

What if PeerTube videos supported alternate audio tracks, the way we have subtitles?

Or even better, audio track blending!

And also, the suggestion mechanism for audio tracks.

It would be very useful for dub translations into other languages. This feature is badly missing from YouTube, which results in a lot of unnecessary duplication of the same video at widely varying video quality.

If we have this feature, we always get the exact same video, but we can choose an audio track we prefer, or even have it selected automatically according to browser locale. Great for watching documentaries and open movies.

ealgase commented 5 years ago

Please add this, it would be amazing!

elevenpassin commented 5 years ago

Interesting concept indeed :O

StalinEXE commented 5 years ago

+1 I really want to create some videos for English- and Russian-speaking audiences (I am ready to dub each video in both languages). It is a really cool feature that would avoid creating two different videos in two languages.

Just open the video and click on English or Russian in the interface. We would also need the ability to provide the description, title and thumbnails in different languages. I like this idea and hope to see it in PeerTube.

rkingett commented 4 years ago

I hope this gets added! YouTube does not have this feature in any way. It would help so many create dubs and even audio description tracks, rather than uploading multiple videos.

bkil commented 3 years ago

It would be interesting to consider what you would set the language metadata field to. Maybe convert it to a list? That could confuse video search engines, though.

zsolt-beringer commented 3 years ago

@bkil A list or set would be a good solution. Any search query with a language filter would then match a video if the filter matches any of the audio languages listed for it.

bkil commented 3 years ago

@zsolt-beringer This is what the field currently looks like:

  "data": [
    {
      "privacy": {
        "id": 1,
        "label": "Public"
      },
      "language": {
        "id": "hu",
        "label": "Hungarian"
      },
      ...

I think if we didn't want to break downstream users, it would be wise to keep this as is to convey the default language and add a new key (like languages or audioTracks) that contains a list. I wouldn't necessarily limit ourselves to a JSON map keyed by language: I could imagine wanting two different audio tracks in the same language, for example the original noisy recording made at a conference and a second one rerecorded in a studio environment that is easier on the ears.
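A minimal sketch of the backward-compatible shape described above, assuming a hypothetical audioTracks key next to the existing language field. The track fields (language, label) and the function name are illustrative only and are not part of the PeerTube API. A language filter matches when the default language or any listed track uses that language, and the list form allows two tracks in the same language.

```python
from typing import Any

# Hypothetical response fragment: `language` keeps its current meaning
# (the default language); `audioTracks` is the assumed new list.
video: dict[str, Any] = {
    "privacy": {"id": 1, "label": "Public"},
    "language": {"id": "hu", "label": "Hungarian"},
    "audioTracks": [
        {"language": "hu", "label": "Conference recording"},
        {"language": "hu", "label": "Studio rerecording"},
        {"language": "en", "label": "English dub"},
    ],
}


def matches_language(video: dict[str, Any], wanted: str) -> bool:
    """Return True if the default language or any audio track matches."""
    languages = {track["language"] for track in video.get("audioTracks", [])}
    default = video.get("language")
    if default:
        languages.add(default["id"])
    return wanted in languages
```

Older clients that only read language keep working, since a video without audioTracks is still matched through its default language.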

bkil commented 3 years ago

Surely a proper solution would be a bit costly, although I think it is worth it. This could come either by storing the tracks separately and combining them in the player (#1448), or implementing a much more complicated bitstream parser/demultiplexer as well. I haven't looked at the code, but I have a feeling this would take a while.

I could imagine a somewhat hacky but cheap minimum viable solution until then. For example, we could transcode and store the entire video file multiple times (once per audio track), and the player would choose one file based on the requested audio track. In the vast majority of use cases, people upload content in at most two languages, so the overhead isn't that excessive.
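To put a rough number on that overhead, here is a back-of-the-envelope estimate. The bitrates and duration are assumed figures chosen for illustration, not measurements from a PeerTube instance, and the function name is hypothetical.

```python
def storage_megabytes(video_kbps: float, audio_kbps: float,
                      duration_s: float, tracks: int,
                      duplicate_video: bool) -> float:
    """Estimate storage for one resolution, in megabytes (10**6 bytes)."""
    # Either one video copy per audio track, or a single shared video stream.
    video_copies = tracks if duplicate_video else 1
    total_kbits = (video_copies * video_kbps + tracks * audio_kbps) * duration_s
    return total_kbits * 1000 / 8 / 1_000_000


# Assumed: 10-minute video, 2500 kbit/s video, 128 kbit/s audio, 2 languages.
duplicated = storage_megabytes(2500, 128, 600, tracks=2, duplicate_video=True)
shared = storage_megabytes(2500, 128, 600, tracks=2, duplicate_video=False)
print(f"one file per track: {duplicated:.1f} MB, shared video: {shared:.1f} MB")
```

With these assumed numbers the duplicated layout needs about 394 MB against about 207 MB for a shared video stream, so two languages roughly double the storage per resolution.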

Even if an optimized backend were implemented at a later time, most of the user-facing features would still work afterwards. The video's link would be stable, the description and comments would be combined, and content recommendation would stay unified. Though the comment section made me wonder: is it worthwhile at all to cram comments in different languages together below a given video?

Or would the end goal be motivated more by wanting to share video streams to save storage costs (and to improve seeding)? If the end goal is more about seeding, it would improve the swarm a lot if all languages were kept in the same file (and/or all of them were downloaded together).

As a less redundant, but much more hacky workaround, the video file could be stored once with the default audio track, and the alternate audio tracks could be stored in separate files. Now comes the kludge: if a user selects a different audio track, the player would mute the main file and start playing the alternate audio track in the background, keeping the playback of the muted video and the separate audio in sync. Sounds brittle, but it might just work.
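The "keep them in sync" part of this workaround can be sketched as a small decision function. This is only an illustration of the idea: the thresholds and rate values are assumptions, and in a browser the two times would come from the currentTime of the muted video element and of the separate audio element.

```python
def sync_action(video_time: float, audio_time: float,
                tolerance: float = 0.05,
                hard_limit: float = 0.5) -> tuple[str, float]:
    """Decide how to bring the separate audio back in line with the video.

    Returns ("ok", 1.0) when the drift is within tolerance,
    ("rate", r) to nudge the audio playback rate, or
    ("seek", t) to jump the audio straight to the video position.
    """
    drift = audio_time - video_time  # positive: audio is ahead
    if abs(drift) <= tolerance:
        return ("ok", 1.0)
    if abs(drift) >= hard_limit:
        return ("seek", video_time)
    # Small drift: slow the audio down if it is ahead, speed it up if behind.
    return ("rate", 0.97 if drift > 0 else 1.03)
```

Nudging the rate for small drifts avoids audible jumps, while a hard seek handles the large gaps that appear after the user seeks or the network stalls.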

mmuman commented 3 years ago

For the record, most common A/V container formats (MP4, AVI…) handle multiple audio tracks with language IDs just fine. But supporting only this way means all the tracks we want must already be in the initial file.

kontrollanten commented 3 years ago

Another upside of separating audio from video would be that playing audio in the background on mobile would work. Since HLS supports alternate audio tracks, there's no need to care about syncing the tracks, and it's still possible to download the videos since most video players support playing m3u8 files.
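For reference, HLS declares alternate audio through EXT-X-MEDIA entries in the master playlist, which the video variants then reference by group ID. The sketch below generates such a playlist; the file names, bandwidth and dictionary keys are made up for illustration and do not reflect how PeerTube lays out its HLS files.

```python
def build_master_playlist(audio_tracks: list[dict], variants: list[dict],
                          group_id: str = "audio") -> str:
    """Build an HLS master playlist with one audio rendition per language."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:4"]
    for i, track in enumerate(audio_tracks):
        default = "YES" if i == 0 else "NO"  # first track is the default
        lines.append(
            f'#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="{group_id}",'
            f'NAME="{track["name"]}",LANGUAGE="{track["language"]}",'
            f'DEFAULT={default},AUTOSELECT=YES,URI="{track["uri"]}"'
        )
    for variant in variants:
        # Each video variant points at the shared audio group.
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={variant["bandwidth"]},'
            f'RESOLUTION={variant["resolution"]},AUDIO="{group_id}"'
        )
        lines.append(variant["uri"])
    return "\n".join(lines) + "\n"


playlist = build_master_playlist(
    [
        {"name": "English", "language": "en", "uri": "audio/en.m3u8"},
        {"name": "Russian", "language": "ru", "uri": "audio/ru.m3u8"},
    ],
    [{"bandwidth": 2628000, "resolution": "1280x720", "uri": "video/720p.m3u8"}],
)
print(playlist)
```

Adding a language then means adding one audio playlist and one EXT-X-MEDIA line, without touching the video segments.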

rigelk commented 3 years ago

In Video.js, we only allow one track to be enabled at a time; so, if you enable more than one, the last one to be enabled will end up being the only one. While the spec allows for more than one track to be enabled, Safari and most implementations only allow one audio track to be enabled at a time.

source: https://docs.videojs.com/tutorial-audio-tracks.html

That means audio track blending won't be a thing for us any time soon, unless the change happens upstream at videojs.

kontrollanten commented 3 years ago

In Video.js, we only allow one track to be enabled at a time; so, if you enable more than one, the last one to be enabled will end up being the only one. While the spec allows for more than one track to be enabled, Safari and most implementations only allow one audio track to be enabled at a time.

source: https://docs.videojs.com/tutorial-audio-tracks.html

That means audio track blending won't be a thing for us any time soon, unless the change happens upstream at videojs.

I understand that as they're talking about audio tracks, not tracks in general:

Audio tracks are a feature of HTML5 video for providing alternate audio track selections to the user, so that a track other than the main track can be played.

According to this demo it seems to work with audio and video tracks enabled at the same time https://codepen.io/team/rcrooks1969/pen/LoWNQy

rigelk commented 3 years ago

That means audio track blending won't be a thing for us any time soon, unless the change happens upstream at videojs.

I understand that as they're talking about audio tracks, not tracks in general:

I'm talking about audio tracks too; more specifically, audio track blending (meaning trying to have more than one audio track enabled at once).

bkil commented 3 years ago

I don't think that this issue involves audio track blending. What we would need is for the user to be able to choose an audio track in their own language, preferably chosen automatically by browser language or PeerTube account preferences after first load.
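A minimal sketch of that automatic choice, assuming each track carries a language tag and that the preference list is built from the account setting first and the browser languages (for example navigator.languages) second. The function name and the matching rule (primary subtag only) are assumptions made for illustration.

```python
def pick_audio_track(track_languages: list[str], preferred: list[str],
                     default_index: int = 0) -> int:
    """Return the index of the first track matching the preferred languages."""
    # Compare primary subtags only, so "en-GB" matches a track tagged "en".
    primaries = [lang.split("-")[0].lower() for lang in track_languages]
    for wanted in preferred:
        primary = wanted.split("-")[0].lower()
        if primary in primaries:
            return primaries.index(primary)
    # No preference matched: fall back to the default track.
    return default_index
```

The order of the preference list decides ties, so an explicit account preference placed first wins over the browser locale.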

rigelk commented 3 years ago

@bkil I'm merely answering the second line of the issue description.

jadsongmatos commented 3 years ago

An alternative that avoids loading two audio streams at the same time: during upload, the audio would be stripped from the video and stored as a separate audio file, and the player would be responsible for synchronizing the two at playback time.

If the video has no audio, the WebP, GIF or JPEG formats could be used.

JPEG would be the most interesting: since each frame is a separate file, it would be easier to choose which frame to send, and progressive JPEG removes the need for separate resolution sizes. With JPEG it would also be possible to assign a hash to each frame to avoid duplicate files, although that would require an extra file listing the hash sequence.

[Image: GIF example]
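The per-frame hashing idea can be sketched as follows, assuming frames are available as raw bytes. The function name and the choice of SHA-256 are illustrative; nothing like this exists in PeerTube.

```python
import hashlib


def deduplicate_frames(frames: list[bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Store each distinct frame once and record the playback order as hashes."""
    store: dict[str, bytes] = {}
    sequence: list[str] = []
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        store.setdefault(digest, frame)  # keep only the first copy
        sequence.append(digest)          # the "hash sequence" file
    return store, sequence
```

Playback would walk the sequence and look each hash up in the store, so repeated frames cost one hash entry instead of one file.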

NoMoreCRAPTion commented 2 years ago

I hope this gets added! YouTube does not have this feature in any way. It would help so many create dubs and even audio description tracks, rather than uploading multiple videos.

YouTube now has this feature. :)

"How YouTube's new audio feature is helping Ubisoft make its trailers more accessible"

https://www.techradar.com/news/how-youtubes-new-audio-feature-is-helping-ubisoft-make-its-trailers-more-accessible

It's really weird that Ubisoft had to get involved, because this site has existed for a long time: https://youdescribe.org/, and James Rath asked YouTube to add this feature a long time ago: https://welleyenever.com/2016/09/03/audio-description-on-youtube/

"YouTube Testing Language Dubbing Tool, Enabling Viewers To Toggle Between Multiple ‘Audio Tracks’"

https://www.tubefilter.com/2021/04/12/youtube-testing-language-dubbing-tool-audio-tracks/

MrBeast added Spanish dubbing to his video.

I do know that for a long time in the fansubbing/fandubbing communities, it wasn't unusual to let viewers download Opus files for their VLC or mpv player.

rkingett commented 2 years ago

It's great that YouTube added it. Even so, this would allow creators to do many different things with their channels.

whyronimus commented 2 years ago

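I can only support the idea given here. It's no problem to include multiple audio tracks in video files.

Two more suggestions:

  • It should be possible to specify the labels of alternate audio tracks, e.g. "Stereo", "Headphone Stereo", "5.1" when uploading or updating a video
  • As indicated, formats other than stereo should be possible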

jadsongmatos commented 2 years ago

I can only support the idea given here. It's no problem to include multiple audio tracks in video files.

Two more suggestions:

  • It should be possible to specify the labels of alternate audio tracks, e.g. "Stereo", "Headphone Stereo", "5.1" when uploading or updating a video
  • As indicated, formats other than stereo should be possible

I believe the labels depend on the video file type.

whyronimus commented 2 years ago

I believe the labels depend on the video file type.

The publishing form could read out how many audio streams are in the file and save one label per audio stream, which could then be chosen in the player. I am thinking of the label being saved in the database, not in the video file (I don't even know if the latter is possible).

whyronimus commented 2 years ago

@Chocobozzz could you give any information if this feature has been discussed or added to the roadmap?

It should not be difficult to implement; as far as I understand, the videojs player is used. Tracks could be read from the video file with ffprobe and added to the player as described here or here. Ideally, ffprobe should read out metadata such as the title from the audio track and propose it in the publishing form, but it should be editable.
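A rough sketch of the ffprobe step, assuming the publishing form only needs the stream index, language and a proposed label per audio stream. The function names and the shape of the proposal dictionaries are hypothetical; the ffprobe options are standard ones, and the fallback label logic is an assumption.

```python
import json


def ffprobe_command(path: str) -> list[str]:
    """Build an ffprobe call that lists only audio streams as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",
        "-show_entries",
        "stream=index,codec_name,channels,channel_layout:stream_tags=language,title",
        "-of", "json",
        path,
    ]


def proposed_labels(ffprobe_json: str) -> list[dict]:
    """Turn ffprobe's JSON output into editable label proposals."""
    proposals = []
    streams = json.loads(ffprobe_json).get("streams", [])
    for position, stream in enumerate(streams):
        tags = stream.get("tags", {})
        # Prefer the embedded title, then the channel layout, then a generic name.
        fallback = stream.get("channel_layout") or f"Track {position + 1}"
        proposals.append({
            "stream_index": stream["index"],
            "language": tags.get("language", "und"),
            "label": tags.get("title") or fallback,
        })
    return proposals
```

The JSON would typically come from something like subprocess.run(ffprobe_command(path), capture_output=True, text=True, check=True).stdout, and the uploader could then overwrite any proposed label in the form.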

I haven't found any solution for Audio Track blending or simultaneous playback, but this should not be necessary IMHO.

jadsongmatos commented 2 years ago

Would it be possible to use a tar file, with the video and the separated audio files stored inside the same tar?

ghost commented 2 years ago

Would like to see this being implemented, with the ability to add multiple audio tracks to an existing video at any time. YouTube allows background tracks to be added to a published video. A similar implementation would be ideal.

Videos here have multiple audio tracks as translations. A Video.js based player? .. not sure.

whyronimus commented 2 years ago

@njohng good idea, this would be possible with the -map option of ffmpeg. If anyone here is willing to give me some basic guidance, I could try to write a plugin for it. I've never done that before, but I'm willing.
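A sketch of what such a -map invocation could look like, built as an argument list. It assumes the new file contains exactly one audio stream in a codec the output container accepts, so streams can be copied without transcoding. The function name and file names are hypothetical.

```python
def add_audio_track_command(video_path: str, audio_path: str, output_path: str,
                            language: str, existing_audio_tracks: int) -> list[str]:
    """Build an ffmpeg call that muxes one more audio track without re-encoding."""
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the published file
        "-i", audio_path,   # input 1: the new dub
        "-map", "0",        # keep every stream of the original
        "-map", "1:a",      # append the audio from the new file
        "-c", "copy",       # copy streams, no transcoding
        # Tag the appended track; audio stream indices are zero-based.
        f"-metadata:s:a:{existing_audio_tracks}", f"language={language}",
        output_path,
    ]


print(" ".join(add_audio_track_command(
    "video.mp4", "dub_ru.m4a", "out.mp4", "rus", existing_audio_tracks=1)))
```

Because nothing is re-encoded, adding a track this way is mostly I/O bound, although it still rewrites the whole file.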

Short Annex: The player is capable of surround formats. Just tested successfully with a 5.0 file.

Agorise commented 12 months ago

This is BADLY needed. I should be able to switch the audio language as easily as the subtitle language. I swear I will dump my DVD collection once PeerTube has this feature :)

stevespaw commented 11 months ago

This is related to this issue and to Roku compatibility. If it were possible to adjust the FFmpeg output options the way the Encoding plugin can for input/transcode properties, we may be able to get there. If we can get over this multi-audio hurdle, PeerTube could really dominate some projects! https://github.com/Chocobozzz/PeerTube/issues/2625#issuecomment-1548080453

uis246 commented 7 months ago

Or would the end goal be motivated more by wanting to share video streams to save storage costs (and to improve seeding)? If the end goal is more about seeding, it would improve the swarm a lot if all languages were kept in the same file (and/or all of them were downloaded together).

I hope the decoder can selectively fetch only the needed audio stream. Also, adding new languages becomes problematic, since you need to pack a new file (though at least you don't need to transcode).

Now comes the kludge: if a user selects a different audio track, the player would mute the main file and start playing the alternate audio track in the background, keeping the playback of the muted video and the separate audio in sync. Sounds brittle, but it might just work.

Sounds like an average webdev hack. Just get the currently decoded timestamp and start playing the other language from that timestamp. Assuming the audio file has a known sample rate and an integer offset from zero in samples, it is as simple as stopping decoding of the initial audio stream and starting to decode the new audio stream from that offset (keep in mind that seeking can sometimes be expensive). If the audio streams have an integer sample rate and second-aligned offsets, then wait until the next second starts and swap the tracks. Everything else is tricky, but also doable, because pauses and seeking don't desync video and audio.
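The timestamp-to-sample mapping described here is a one-liner. The sketch below assumes a constant sample rate and a track whose first sample sits at a known integer offset on the shared timeline; the function and parameter names are made up for illustration.

```python
def sample_offset(timestamp_s: float, sample_rate_hz: int,
                  track_start_sample: int = 0) -> int:
    """Map a playback timestamp to the sample index to resume decoding from."""
    # Position on the shared timeline, minus where this track begins.
    offset = round(timestamp_s * sample_rate_hz) - track_start_sample
    # Before the track starts there is nothing to skip.
    return max(offset, 0)
```

For example, switching tracks at 2.5 s in a 48 kHz stream that starts at zero means resuming from sample 120000.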

dannekrose commented 6 months ago

I’d like to add my support and bump this issue. Is there anything the average person can do to help?

stevespaw commented 6 months ago

This is the related Roku problem: fragmented MP4, CMAF (muxing audio and video is not supported for CMAF). https://developer.roku.com/docs/specs/media/streaming-specifications.md

FediVideos commented 2 weeks ago

Does the latest 6.3.0 update which separates video and audio allow multiple audio tracks on a single video? Asking as there's a group which does audio description for video who would like to try doing this on PeerTube, and multiple audio would allow this to happen.

(For those who don't know, audio description is where there's an alternate audio track where the on-screen action is described for people with visual impairments but the dialogue is left intact.)

uis246 commented 2 weeks ago

(For those who don't know, audio description is where there's an alternate audio track where the on-screen action is described for people with visual impairments but the dialogue is left intact.)

This is even more important than localization. I didn't know it existed.

rkingett commented 2 weeks ago

Here is a sample of audio description

candidexmedia commented 2 weeks ago

Here's an example of how the AblePlayer video player handles video descriptions: https://ableplayer.github.io/ableplayer/demos/desc1.html

Chocobozzz commented 1 week ago

Does the latest 6.3.0 update which separates video and audio allow multiple audio tracks on a single video?

Not yet, but it was a necessary step if we are to implement alternate audio tracks in the future.