Makememo / MemoAI

MemoAI Video to translated text, subtitles and notes made easy.
https://memo.ac
Do an audio mixdown before extracting audio #287

Open liechtjc opened 3 months ago

liechtjc commented 3 months ago

We have H.264 files in an .mp4 wrapper with multi-track audio. It would be really nice to be able to use Memo on these files directly.

Apparently Memo uses only the first two tracks, or the first pair of audio tracks.

It would be nice to have a checkbox in the settings to perform a mixdown of all audio tracks for S2T. A more complex setup would allow selecting which tracks are used for S2T (but this depends on the media).
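To make the mixdown request concrete, here is a minimal sketch of how such a step could be expressed with ffmpeg's `amix` filter. It only builds the command line and does not reflect how Memo extracts audio today. The function name `build_mixdown_command`, the file names, and the 16 kHz mono output are assumptions chosen for illustration.

```python
from typing import List


def build_mixdown_command(src: str, dst: str, n_tracks: int,
                          sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list that mixes the first n_tracks
    audio streams of src into a single mono file (hypothetical helper)."""
    if n_tracks < 1:
        raise ValueError("n_tracks must be at least 1")
    # Label each audio stream of input 0: [0:a:0][0:a:1]...
    pads = "".join(f"[0:a:{i}]" for i in range(n_tracks))
    # amix mixes all labelled inputs into one stream named [mix].
    graph = f"{pads}amix=inputs={n_tracks}:duration=longest[mix]"
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", graph,
        "-map", "[mix]",           # keep only the mixed audio
        "-ac", "1",                # mono output (assumed S2T input format)
        "-ar", str(sample_rate),   # assumed sample rate
        dst,
    ]


if __name__ == "__main__":
    # Seven mono tracks, as in the file described in this issue.
    print(" ".join(build_mixdown_command("input.mp4", "mixdown.wav", 7)))
```

The resulting list can be passed to `subprocess.run` on a machine where ffmpeg is installed.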

An ideal workflow would be to listen to individual audio tracks in Memo and select the track we would like to use for S2T before triggering the process.
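The track-selection workflow could be sketched roughly as follows: list the audio streams with ffprobe, let the user pick one, then extract only that stream. This is an illustration under assumptions, not Memo's code. The helper names, the sample JSON, and the 16 kHz mono output are hypothetical.

```python
import json
from typing import Dict, List


def ffprobe_command(src: str) -> List[str]:
    """Argument list asking ffprobe for the audio streams of src as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",
        "-show_entries", "stream=index,codec_name,channels",
        "-of", "json", src,
    ]


def parse_audio_streams(ffprobe_json: str) -> List[Dict]:
    """Turn ffprobe JSON into a list a UI could show as selectable tracks."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return [
        {
            "track": pos,                    # position among audio streams
            "stream_index": s.get("index"),  # index within the container
            "codec": s.get("codec_name"),
            "channels": s.get("channels"),
        }
        for pos, s in enumerate(streams)
    ]


def build_extract_command(src: str, dst: str, track: int) -> List[str]:
    """Argument list extracting only the chosen audio track as mono audio."""
    if track < 0:
        raise ValueError("track must be non-negative")
    return [
        "ffmpeg", "-i", src,
        "-map", f"0:a:{track}",  # select one audio stream by position
        "-ac", "1", "-ar", "16000",  # assumed S2T input format
        dst,
    ]


if __name__ == "__main__":
    # Made-up ffprobe output for a file with two mono AAC tracks.
    sample = ('{"streams": [{"index": 1, "codec_name": "aac", "channels": 1},'
              ' {"index": 2, "codec_name": "aac", "channels": 1}]}')
    for info in parse_audio_streams(sample):
        print(info)
    print(" ".join(build_extract_command("input.mp4", "track2.wav", 1)))
```

Keeping the listing and the extraction separate would let a UI preview each track before the S2T step is triggered.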

Describe alternatives you've considered: doing a mixdown outside of Memo.

Additional context. Type of media uploaded:

`General
Complete name : /Volumes/storage3/60J-SHOOTING/TRANSLATE_EXPORT/230818_day37/C013C276_200315WV_CANON.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/iso2/avc1/mp41)
File size : 33.3 MiB
Duration : 1 min 27 s
Overall bit rate : 3 175 kb/s
Frame rate : 25.000 FPS
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC
Writing application : Blackmagic Design DaVinci Resolve Studio

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L5.1
Format settings : CABAC / 2 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 1 min 27 s
Bit rate : 1 370 kb/s
Width : 3 840 pixels
Height : 2 160 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 25.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.007
Stream size : 14.4 MiB (43%)
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Codec configuration box : avcC

Audio #1
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 1
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #2
ID : 3
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 2
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #3
ID : 4
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 3
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #4
ID : 5
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 4
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #5
ID : 6
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 5
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #6
ID : 7
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 6
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Audio #7
ID : 8
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 1 min 27 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Channel layout : M
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 2.68 MiB (8%)
Default : Yes
Alternate group : 7
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC

Other
ID : 9
Type : Time code
Format : QuickTime TC
Duration : 1 min 27 s
Frame rate : 25.000 FPS
Time code of first frame : 11:49:54:00
Time code of last frame : 11:51:21:20
Time code, stripped : Yes
Language : English
Encoded date : 2024-06-04 07:57:32 UTC
Tagged date : 2024-06-04 07:57:32 UTC
`