Open drequivalent opened 6 years ago
Please add this, it would be amazing!
Interesting concept indeed :O
+1 I really want to create some videos for English- and Russian-speaking audiences (I am ready to dub each video twice, once in each language). It is a really cool feature that would avoid having to create two different videos in two languages.
Just open the video and click on English or Russian in the interface. But we also need the ability to provide the description, title, and thumbnails in different languages for each audience. I like this idea and hope to see it in PeerTube.
I hope this gets added! YouTube does not have this feature in any way. It would help so many create dubs and even audio description tracks, rather than uploading multiple videos.
It would be interesting to consider what you would set the language metadata field to. Maybe convert it to a list? That could confuse video search engines, though.
@bkil A list or set would be a good solution. Any search query that filters on the language field would match a video if the filter matches any of the audio languages listed for that video.
@zsolt-beringer This is what the field currently looks like:
"data": [
  {
    "privacy": {
      "id": 1,
      "label": "Public"
    },
    "language": {
      "id": "hu",
      "label": "Hungarian"
    },
    ...
I think if we don't want to break downstream users, it would be wise to keep this as is to convey the default language and add a new key (like languages or audioTracks) that contains a list. I wouldn't necessarily limit ourselves to a JSON map keyed by language. I could imagine wanting to create two different audio tracks in the same language, for example one containing the original noisy recording made at a conference and a second one re-recorded in a studio environment that is easier on the ears.
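To make the idea above concrete, here is a minimal TypeScript sketch of such a response shape together with the "match any listed language" filter suggested earlier. The `audioTracks` key, the per-track `label` field, and all type and function names are hypothetical; nothing here is an existing PeerTube API.

```typescript
// Hypothetical shape: `audioTracks` is an assumed key, not a current PeerTube field.
interface LanguageLabel {
  id: string;    // e.g. "hu"
  label: string; // e.g. "Hungarian"
}

interface AudioTrackInfo {
  language: LanguageLabel;
  label: string; // free text, so two tracks may share the same language
}

interface VideoSummary {
  language: LanguageLabel;       // kept as is: conveys the default language
  audioTracks: AudioTrackInfo[]; // new: a list rather than a map keyed by language
}

// A language filter matches when the default language or any listed track uses it.
function matchesLanguage(video: VideoSummary, languageId: string): boolean {
  if (video.language.id === languageId) return true;
  return video.audioTracks.some((track) => track.language.id === languageId);
}

const hungarian: LanguageLabel = { id: "hu", label: "Hungarian" };

const conferenceTalk: VideoSummary = {
  language: hungarian,
  audioTracks: [
    { language: hungarian, label: "Conference recording" },
    { language: hungarian, label: "Studio re-recording" },
    { language: { id: "en", label: "English" }, label: "English dub" },
  ],
};
```

Because the list is not keyed by language, the two Hungarian tracks can coexist, and existing clients that only read `language` keep working.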
Surely a proper solution would be a bit costly, although I think it is worth it. This could come either by storing the tracks separately and combining them in the player (#1448), or implementing a much more complicated bitstream parser/demultiplexer as well. I haven't looked at the code, but I have a feeling this would take a while.
I could imagine a somewhat hacky but cheap minimum viable solution until then. For example, we could just transcode and store the entire video file multiple times (once per audio track), and the player would choose one file based on the requested audio track. In the vast majority of use cases, people upload content in at most two languages, so the overhead isn't that excessive.
Even if an optimized backend were implemented at a later time, most of the user-facing features would still work afterwards. The video's link would be stable, description and comments would be combined, and content recommendation would stay unified. Though the comment section made me wonder: is it worthwhile at all to cram comments in different languages together below a given video?
Or would the end goal be motivated more by wanting to share video streams to save storage costs (and to improve seeding)? If the end goal is more about seeding, it would improve the swarm a lot if all languages were kept in the same file (and/or all of them were downloaded together).
As a less redundant, but much more hacky workaround, the video file could be stored once with the default audio track, and the alternate audio tracks could be stored in separate files. Now comes the kludge: if a user selects a different audio track, the player would mute the main file and start playing the alternate audio track in the background, keeping the playback of the muted video and the separate audio in sync. Sounds brittle, but it might just work.
For the record, most common A/V container formats (MP4, AVI…) handle multiple audio tracks with language IDs just fine. But supporting only this approach means all the tracks we want must already be in the initial file.
Another upside of separating audio from video is that playing audio in the background on mobile would work. Since HLS supports alternate audio tracks, there's no need to care about syncing the tracks, and it's still possible to download the videos, since most video players support playing m3u8 files.
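For reference, this is roughly what HLS alternate audio looks like at the playlist level. The sketch below builds a master playlist in which each audio rendition is declared with an `EXT-X-MEDIA` tag and the video variant points at the audio group. The function name, file names, and bandwidth value are made up for illustration, and this is not how PeerTube currently generates its playlists.

```typescript
interface HlsAudioRendition {
  language: string;    // language tag, e.g. "en"
  displayName: string; // shown in the player's track menu
  uri: string;         // media playlist that contains only this audio track
  isDefault: boolean;
}

// Builds a minimal master playlist with one video variant and N audio renditions.
function buildMasterPlaylist(
  videoUri: string,
  bandwidth: number,
  audio: HlsAudioRendition[]
): string {
  const lines: string[] = ["#EXTM3U"];
  for (const rendition of audio) {
    lines.push(
      `#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="${rendition.displayName}",` +
        `LANGUAGE="${rendition.language}",DEFAULT=${rendition.isDefault ? "YES" : "NO"},` +
        `AUTOSELECT=YES,URI="${rendition.uri}"`
    );
  }
  // The AUDIO attribute ties the video variant to the audio group declared above.
  lines.push(`#EXT-X-STREAM-INF:BANDWIDTH=${bandwidth},AUDIO="audio"`);
  lines.push(videoUri);
  return lines.join("\n") + "\n";
}

const examplePlaylist = buildMasterPlaylist("video-720.m3u8", 2500000, [
  { language: "en", displayName: "English", uri: "audio-en.m3u8", isDefault: true },
  { language: "ru", displayName: "Russian", uri: "audio-ru.m3u8", isDefault: false },
]);
```

The player only fetches the audio playlist it needs, so adding a language later means adding one more `EXT-X-MEDIA` line and one more audio playlist, without touching the video segments.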
In Video.js, we only allow one track to be enabled at a time; so, if you enable more than one, the last one to be enabled will end up being the only one. While the spec allows for more than one track to be enabled, Safari and most implementations only allow one audio track to be enabled at a time.
source: https://docs.videojs.com/tutorial-audio-tracks.html
That means audio track blending won't be a thing for us any time soon, unless the change happens upstream at videojs.
I understand that as they're talking about audio tracks, not tracks in general:
Audio tracks are a feature of HTML5 video for providing alternate audio track selections to the user, so that a track other than the main track can be played.
According to this demo it seems to work with audio and video tracks enabled at the same time https://codepen.io/team/rcrooks1969/pen/LoWNQy
I'm talking about audio tracks too; more specifically, audio track blending (meaning trying to have more than one audio track enabled at once).
I don't think that this issue involves audio track blending. What we would need is for the user to be able to choose an audio track in their own language, preferably chosen automatically by browser language or PeerTube account preferences after first load.
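The automatic choice could be a small pure function. Below is a sketch in TypeScript that walks an ordered list of preferred languages (in a browser this could come from `navigator.languages`, or from an account setting) and returns the first track whose primary language subtag matches. The type and function names are hypothetical.

```typescript
interface SelectableTrack {
  id: string;
  language: string; // language tag, e.g. "en" or "ru"
}

// Returns the id of the first track matching the user's preferences,
// comparing only the primary subtag so that "ru-RU" matches a "ru" track.
function pickAudioTrack(
  tracks: SelectableTrack[],
  preferredLanguages: string[],
  fallbackId: string
): string {
  for (const wanted of preferredLanguages) {
    const primary = wanted.toLowerCase().split("-")[0];
    for (const track of tracks) {
      if (track.language.toLowerCase().split("-")[0] === primary) {
        return track.id;
      }
    }
  }
  // No preference matched: keep the uploader's default track.
  return fallbackId;
}

const availableTracks: SelectableTrack[] = [
  { id: "track-en", language: "en" },
  { id: "track-ru", language: "ru" },
];
```

Since only one track is enabled at a time, the result is simply the single track to enable after first load; the user can still override it from the player menu.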
@bkil I'm merely answering the second line of the issue description.
An alternative that avoids loading two audio streams at the same time: during the video upload, the audio would be stripped from the video and stored as a separate audio file, and the player would be responsible for synchronizing the two at playback time.
If the video has no audio, the WebP, GIF, or JPG formats could be used.
JPG would be the most interesting: because the frames are separate files, it would be easier to choose which frame to send, and progressive JPEG removes the need for separate resolution sizes. In the JPG case it would also be possible to assign a hash to each frame to avoid duplicate files, but that would require an extra file listing the hash sequence.
GIF example
I hope this gets added! YouTube does not have this feature in any way. It would help so many create dubs and even audio description tracks, rather than uploading multiple videos.
YouTube now has this feature. :)
"How YouTube's new audio feature is helping Ubisoft make its trailers more accessible"
It's really weird that Ubisoft had to get involved, because this site has existed for a long time: https://youdescribe.org/, and James Rath asked YouTube to add this feature a long time ago: https://welleyenever.com/2016/09/03/audio-description-on-youtube/
"YouTube Testing Language Dubbing Tool, Enabling Viewers To Toggle Between Multiple ‘Audio Tracks’"
https://www.tubefilter.com/2021/04/12/youtube-testing-language-dubbing-tool-audio-tracks/
MrBeast added Spanish dubbing to his video.
I do know that for a long time in the fansubbing/fandubbing communities, it wasn't unusual for people to let viewers download Opus files for their VLC or mpv player.
It's great that YouTube added it. Even so, this could allow creators to do many different things with their channels.
I can only support the idea given here. It's no problem to include multiple audio tracks in video files.
Two more suggestions:
- It should be possible to specify the labels of alternate audio tracks, e.g. "Stereo", "Headphone Stereo", "5.1", when uploading or updating a video
- As indicated, formats other than stereo should be possible
I believe the labels depend on the video file type.
The publishing form could read out how many audio streams are in the file and save one label per audio stream, which the viewer could then choose in the player. I am thinking of the label as being saved in the database, not in the video file (I don't even know if the latter is possible).
@Chocobozzz could you give any information if this feature has been discussed or added to the roadmap?
It should not be difficult to implement; as far as I understand, the videojs player is used. Tracks could be read from the video file with ffprobe, and added to the player as described here or here. Ideally, ffprobe should read out metadata such as title from the audio track and propose it in the publishing form, but it should be editable.
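As a rough sketch of the ffprobe step: running `ffprobe -v error -select_streams a -show_streams -of json input.mkv` prints a JSON document with a `streams` array, and each stream may carry `tags.language` and `tags.title`. The TypeScript below turns that JSON into a list of proposed labels for the publishing form. The function and type names are hypothetical, and the fallback label is my own choice, not anything PeerTube does today.

```typescript
// Subset of the fields ffprobe emits per stream with -show_streams -of json.
interface RawProbeStream {
  index: number;
  codec_type: string;
  channel_layout?: string;
  tags?: { language?: string; title?: string };
}

interface ProbedAudioStream {
  index: number;         // stream index inside the container
  language: string;      // "und" (undetermined) when the file carries no language tag
  proposedLabel: string; // pre-filled in the form, meant to stay editable
}

function listAudioStreams(ffprobeJson: string): ProbedAudioStream[] {
  const parsed = JSON.parse(ffprobeJson) as { streams?: RawProbeStream[] };
  const streams = parsed.streams || [];
  return streams
    .filter((stream) => stream.codec_type === "audio")
    .map((stream) => {
      const tags = stream.tags || {};
      return {
        index: stream.index,
        language: tags.language || "und",
        // Prefer the embedded title, then the channel layout, then a generic name.
        proposedLabel: tags.title || stream.channel_layout || `Audio ${stream.index}`,
      };
    });
}

// Example input, shaped like ffprobe output for a file with two audio streams.
const sampleProbeOutput = JSON.stringify({
  streams: [
    { index: 0, codec_type: "video" },
    { index: 1, codec_type: "audio", channel_layout: "stereo", tags: { language: "eng", title: "Stereo" } },
    { index: 2, codec_type: "audio", channel_layout: "5.1" },
  ],
});
```

With the sample input, the second audio stream has no title, so the channel layout ("5.1") is proposed as its label.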
I haven't found any solution for Audio Track blending or simultaneous playback, but this should not be necessary IMHO.
Would it be possible to use a tar file, with the video and the separate audio tracks stored inside the same tar?
Would like to see this being implemented, with the ability to add multiple audio tracks to an existing video at any time. YouTube allows background tracks to be added to a published video. A similar implementation would be ideal.
Videos here have multiple audio tracks as translations. A Video.js based player? .. not sure.
@njohng good idea, this would be possible with the -map option of ffmpeg.
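A minimal sketch of what that could look like, written as a TypeScript helper that only builds the ffmpeg argument list (it does not run anything). It keeps every stream of the existing file (`-map 0`), appends the audio from a second input (`-map 1:a`), copies without re-encoding (`-c copy`), and tags the new stream. The helper name and file names are made up, and the chosen output container has to accept the added audio codec.

```typescript
// Builds arguments for: keep all existing streams, append one more audio track.
function buildAddTrackArgs(
  videoPath: string,
  newAudioPath: string,
  outputPath: string,
  existingAudioCount: number, // audio streams already present in videoPath
  language: string,           // e.g. "rus" (ISO 639-2 code)
  title: string
): string[] {
  // The appended track becomes audio stream number `existingAudioCount` (0-based).
  const newAudioSpecifier = `-metadata:s:a:${existingAudioCount}`;
  return [
    "-i", videoPath,
    "-i", newAudioPath,
    "-map", "0",   // every stream of the first input
    "-map", "1:a", // audio stream(s) of the second input
    "-c", "copy",  // remux only, no transcoding
    newAudioSpecifier, `language=${language}`,
    newAudioSpecifier, `title=${title}`,
    outputPath,
  ];
}

const addRussianDubArgs = buildAddTrackArgs(
  "talk.mkv", "talk-ru.opus", "talk-multi.mkv", 1, "rus", "Russian dub"
);
```

The resulting array can be passed to a process spawner; passing arguments as an array rather than a shell string avoids quoting problems with titles that contain spaces.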
If anyone here is willing to give me some basic guidance, I could try to write a plugin for it. Never done before, but willing.
Short Annex: The player is capable of surround formats. Just tested successfully with a 5.0 file.
This is BADLY needed. I should be able to switch to different subtitle languages as easily as switching the audio to a different language. I swear I will dump my DVD collection once PeerTube has this feature :)
This is related to this issue and to Roku compatibility. If it were possible to adjust the FFmpeg output options, the way the Encoding plugin can for input/transcode properties, we might be able to get there. If we can get past this multi-audio hurdle, PeerTube could really dominate some projects! https://github.com/Chocobozzz/PeerTube/issues/2625#issuecomment-1548080453
Or would the end goal be motivated more by wanting to share video streams to save storage costs (and to improve seeding)? If the end goal is more about seeding, it would improve the swarm a lot if all languages were kept in the same file (and/or all of them were downloaded together).
I hope the decoder can selectively fetch only the audio stream it needs. Also, adding new languages becomes problematic, since you need to repackage the file (though at least you don't need to transcode).
Now comes the kludge: if a user selects a different audio track, the player would mute the main file and start playing the alternate audio track in the background, keeping the playback of the muted video and the separate audio in sync. Sounds brittle, but it might just work.
Sounds like an average webdev hack. Just take the currently decoded timestamp and start playing the other language from that timestamp. Assuming the audio file has a known sample rate and an integer offset from zero in samples, it is simply a matter of stopping the decoding of the initial audio stream and starting to decode the new audio stream from that offset (keep in mind that seeking can sometimes be expensive). If the audio streams have an integer sample rate and a seconds-aligned offset, wait until the next second starts and then swap the tracks. Everything else is tricky, but also doable, because pauses and seeking don't desync video and audio.
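The arithmetic behind that is small. Here is a sketch in TypeScript under the same assumptions (known integer sample rate, track start expressed as an integer number of samples from time zero); both function names are hypothetical.

```typescript
// Sample index inside the alternate track that corresponds to a playback timestamp.
// `trackStartSample` is the track's offset from time zero, in samples.
function sampleOffsetAt(
  timestampSeconds: number,
  sampleRate: number,
  trackStartSample: number
): number {
  const absoluteSample = Math.round(timestampSeconds * sampleRate);
  // Before the track starts there is nothing to decode yet, so clamp to 0.
  return Math.max(0, absoluteSample - trackStartSample);
}

// For seconds-aligned tracks: the next whole second strictly after the timestamp,
// which is where the old track is stopped and the new one is started.
function nextSwitchPoint(timestampSeconds: number): number {
  return Math.floor(timestampSeconds) + 1;
}

// Example: the viewer switches language 12.3 s into a 48 kHz track that starts at zero.
const switchAtSeconds = nextSwitchPoint(12.3);
const switchAtSample = sampleOffsetAt(switchAtSeconds, 48000, 0);
```

Switching on a whole-second boundary trades up to one second of delay for a seek target that is cheap to compute and lands on an exact sample.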
I’d like to add my support and bump this issue. Is there anything the average person can do to help?
This is the related Roku problem: fragmented MP4, CMAF (muxing audio and video is not supported for CMAF). https://developer.roku.com/docs/specs/media/streaming-specifications.md
Does the latest 6.3.0 update which separates video and audio allow multiple audio tracks on a single video? Asking as there's a group which does audio description for video who would like to try doing this on PeerTube, and multiple audio would allow this to happen.
(For those who don't know, audio description is where there's an alternate audio track where the on-screen action is described for people with visual impairments but the dialogue is left intact.)
This is even more important than localization. I didn't know it existed.
Here's an example of how the AblePlayer video player handles video descriptions: https://ableplayer.github.io/ableplayer/demos/desc1.html
Does the latest 6.3.0 update which separates video and audio allow multiple audio tracks on a single video?
Not yet, but it was a necessary step toward implementing alternate audio tracks in the future.
What if PeerTube videos supported alternate audio tracks? Like, the way we have subtitles?
Or even better, audio track blending!
And also, the suggestion mechanism for audio tracks.
It would be very useful for dubbed translations into other languages. This feature is badly missing from YouTube, which results in a lot of unnecessary duplication of the same video, with widely varying video quality.
If we have this feature, we always get the exact same video, but we can choose an audio track we prefer, or even have it selected automatically according to browser locale. Great for watching documentaries and open movies.