cogentcore / core

A free and open source framework for building powerful, fast, elegant 2D and 3D apps that run on macOS, Windows, Linux, iOS, Android, and the web with a single Go codebase, allowing you to Code Once, Run Everywhere.
http://cogentcore.org/core
BSD 3-Clause "New" or "Revised" License

Add multimedia support (audio and video playback and capture) #516

Open kkoreilly opened 1 year ago

kkoreilly commented 1 year ago

There are various libraries that support some of this functionality on some platforms (like faiface/beep), but a unified wrapper that works on all platforms for the aforementioned functionalities would be extremely helpful and enable the creation of more advanced apps, as mentioned in #464.

kkoreilly commented 1 year ago

For audio, hajimehoshi/oto probably makes the most sense, as it has good cross-platform support and is maintained. For video, using a wrapper on ffmpeg (like u2takey/ffmpeg-go) might be the best option. Also, there is a Vulkan Video API, but it isn't supported on macOS and iOS. Finally, we can also support GIFs easily through the builtin image/gif package.
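As a rough illustration of the GIF point, here is a minimal sketch that uses only the standard library's image/gif package to decode an animation and work out how long each frame should stay on screen. The helper names (frameDurations, sampleGIF) are made up for this example and are not part of Cogent Core; the tiny GIF is generated in memory only so the sketch needs no file.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/gif"
	"time"
)

// frameDurations decodes an animated GIF and returns the display time of
// each frame. GIF delays are stored in hundredths of a second.
func frameDurations(data []byte) ([]time.Duration, error) {
	g, err := gif.DecodeAll(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	durs := make([]time.Duration, len(g.Delay))
	for i, d := range g.Delay {
		durs[i] = time.Duration(d) * 10 * time.Millisecond
	}
	return durs, nil
}

// sampleGIF builds a two-frame 4x4 animation in memory (hypothetical test data).
func sampleGIF() ([]byte, error) {
	pal := color.Palette{color.Black, color.White}
	g := &gif.GIF{}
	for i := 0; i < 2; i++ {
		frame := image.NewPaletted(image.Rect(0, 0, 4, 4), pal)
		frame.SetColorIndex(i, i, 1) // one white pixel per frame
		g.Image = append(g.Image, frame)
		g.Delay = append(g.Delay, 5*(i+1)) // 5 and 10 hundredths of a second
	}
	var buf bytes.Buffer
	if err := gif.EncodeAll(&buf, g); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := sampleGIF()
	if err != nil {
		panic(err)
	}
	durs, err := frameDurations(data)
	if err != nil {
		panic(err)
	}
	for i, d := range durs {
		fmt.Printf("frame %d shown for %v\n", i, d)
	}
}
```

A GUI widget would presumably draw each decoded frame (g.Image[i]) and wait for the matching duration before advancing.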

c1ngular commented 1 year ago

How about gstreamer? glimagesink maybe?

rohrlich commented 1 year ago

I used hajimehoshi/oto in gospeech and found the author very responsive in resolving issues. I expect the audio playback code in gospeech is out of date by now, but I think the library is a good choice. I used ffmpeg for a lot of audio processing, and it is what is used in Audacity. I found it to work well.

kkoreilly commented 11 months ago

The development of this has moved to https://github.com/goki/video, but I will leave this issue here as a v2 milestone item.

rcoreilly commented 8 months ago

Direct rendering of video is working well now. In terms of basic playback, we just need a version of vgpu.Drawer Scale that also does simple rotations, and we need to detect rotations in videos (or at least have a manual setting).

It seems very fast and doesn't register much on my CPU meter on the mac -- should check on other platforms.

rcoreilly commented 8 months ago

We also need some way of stopping the audio playback if we stop the video.

rcoreilly commented 4 months ago

For those who might know about these different options: the key thing we need is the ability to decode the video directly into something we can integrate with the rest of the GUI elements. Running an external player is not going to work. It looks like several of the available options are of that sort. In particular, I looked at https://github.com/adrg/libvlc-go for libvlc, and it seems to just be pulling up a player.

Several other options we looked at previously were of this form: calling executables via command line and scraping video frames from the output of the command is not a viable solution.

The current lib we're using at least has the direct api in Go wrapped around libc to get the raw video frames and audio so we can use them.

Vulkan these days has a direct hardware decoding API which would be great but isn't supported in our existing Go vulkan wrapper. We are also considering WebGPU: might it have a hardware video decoding component?

c1ngular commented 4 months ago

@kkoreilly @rcoreilly

As far as I know:

libmpv: it has APIs that can be integrated.

libvlc: it seems to offer APIs that can be integrated in 4.0, but that release has been delayed for a long time; 3.0 is what we have so far.

ffmpeg: go-astiav is the most complete and best-maintained golang binding we have so far. I believe this is the place to start if we are going the ffmpeg/libav way.

P.S. There are many compatibility problems to consider across different platforms if we want to start from scratch, but great challenges come with great achievement and flexibility, I guess.

EDIT

here's my two cents:

If we are going the fine-grained, controllable ffmpeg/libav way, I would suggest making composable blocks in addition to a compact video widget:

  1. mediaInput package
  2. decoder widget/package
  3. encoder widget/package
  4. render/speaker widget/package (using filter/scale)
  5. mediaOutput package
  6. filter package
  7. scale package

Using these configurable blocks as a composable pipeline for multimedia support would bring maximum compatibility, flexibility, and performance not only to the video widget, but also to other use cases such as the one you mentioned in yesterday's blog: a video editor, for instance.

pipeline1: NewMediaInput(path, options) => NewMediaDecoder(options, callbacks) => NewMediaRender(options, events) => ...

pipeline2: images/pcm data => NewFramebuffer(options) => NewMediaEncoder(options, callbacks) => NewMediaOutput(path, options)
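To make the shape of pipeline1 a bit more concrete, here is a minimal sketch of how such composable blocks could be chained in Go with channels. Everything in it is hypothetical: the Frame and Stage types, the Chain helper, and the simplified signatures of NewMediaInput, NewScale, and NewMediaRender are invented for illustration and do not correspond to go-astiav or to any existing Cogent Core API. The "decoding" and "scaling" are placeholders that only shuffle bytes.

```go
package main

import "fmt"

// Frame is a hypothetical unit of decoded media (one video frame here).
type Frame struct {
	Index  int
	Pixels []byte
}

// Stage is any block that consumes one frame stream and produces another.
type Stage func(in <-chan Frame) <-chan Frame

// NewMediaInput stands in for a demuxer/decoder: it emits n one-byte
// placeholder frames and then closes the stream.
func NewMediaInput(n int) <-chan Frame {
	out := make(chan Frame)
	go func() {
		defer close(out)
		for i := 0; i < n; i++ {
			out <- Frame{Index: i, Pixels: []byte{byte(i)}}
		}
	}()
	return out
}

// NewScale returns a stage that repeats every pixel byte factor times,
// a placeholder for a real scaling filter.
func NewScale(factor int) Stage {
	return func(in <-chan Frame) <-chan Frame {
		out := make(chan Frame)
		go func() {
			defer close(out)
			for f := range in {
				scaled := make([]byte, 0, len(f.Pixels)*factor)
				for _, p := range f.Pixels {
					for j := 0; j < factor; j++ {
						scaled = append(scaled, p)
					}
				}
				out <- Frame{Index: f.Index, Pixels: scaled}
			}
		}()
		return out
	}
}

// Chain connects a source to a list of stages, in order.
func Chain(src <-chan Frame, stages ...Stage) <-chan Frame {
	for _, s := range stages {
		src = s(src)
	}
	return src
}

// NewMediaRender drains the stream, hands each frame to onFrame,
// and returns the number of frames it saw.
func NewMediaRender(in <-chan Frame, onFrame func(Frame)) int {
	n := 0
	for f := range in {
		onFrame(f)
		n++
	}
	return n
}

func main() {
	frames := Chain(NewMediaInput(3), NewScale(2))
	n := NewMediaRender(frames, func(f Frame) {
		fmt.Printf("frame %d: %d bytes\n", f.Index, len(f.Pixels))
	})
	fmt.Println("rendered", n, "frames")
}
```

Because every block shares the same Stage signature, a filter or encoder could in principle be slotted in without touching the input or render ends, which is the flexibility the proposal is after.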

Based on go-astiav and what you have already done in this project, it won't be too much hassle to make a proof-of-concept experiment.

gedw99 commented 4 months ago

https://github.com/zergon321/reisen

It might help. Gio video works well with it.

I don't know why it’s discontinued, but it still works well as a player and frame extractor.

rcoreilly commented 4 months ago

reisen is the one we're currently using. go-astiav looks like a good option. @c1ngular that proposal looks reasonable to me at first glance.

c1ngular commented 4 months ago

I have spent quite some time studying and experimenting with this rabbit hole (golang GUI and multimedia). There is no easy way to make it happen, or to make it good, especially for a cross-platform GUI lib.

I have tried the gstreamer, vlc, libmpv, and libav golang bindings. IMHO, none of the existing ones is seriously GUI-oriented, general-purpose, or cross-platform; most are just for CLI apps, like driving the ffmpeg executable in golang fashion.

c1ngular commented 3 weeks ago

After the migration to the wgpu backend, is there still a performant way (i.e. OpenGL rendering) to integrate a third-party video player into cogentcore? Or is software rendering the only way? (I was looking into libmpv.)

kkoreilly commented 3 weeks ago

@c1ngular Yes, there is a WebGPU video texture feature that we should be able to use to implement hardware accelerated video rendering at some point relatively soon.

gedw99 commented 3 weeks ago

Wow, that would be an amazing addition.

It opens up the ability to do convolution networks and 3D multi-sensor fusion.

I suspect a transcode will be needed server side. What encoding are you shooting for?

c1ngular commented 3 weeks ago

@kkoreilly great, thanks

oderwat commented 1 week ago

I would need audio (just .wav actually) playback for Darwin and Windows. Is that already possible? Could we get an example of how that works? I am using beep already, but I have no idea whether and how that will work with Cogentcore on different platforms and in the browser.

kkoreilly commented 1 week ago

@oderwat The underlying package behind beep says that it works on all of the platforms we support, so you should be able to play audio files using beep in the same way you are already. If you need a GUI player for audio with controls, we have not implemented that yet but can soon; please let us know and we can accelerate development of that. If you just need to play audio programmatically, there should be no issues just playing it with beep (just make sure that you embed the audio files using //go:embed and open them through that to ensure they work on mobile and web).
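For the embedding advice, a minimal sketch of the pattern might look like the following. It is written against the standard fs.FS interface so it can run without any real audio file: an in-memory fstest.MapFS stands in for the embed.FS you would get from a //go:embed directive. The loadAudio and sampleFS helpers and the sounds/beep.wav path are made up for this example, and the bytes are a placeholder rather than a valid WAV file.

```go
package main

import (
	"bytes"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// In a real app the files would be embedded at build time, for example:
//
//	//go:embed sounds/*.wav
//	var sounds embed.FS
//
// embed.FS implements fs.FS, so it can be passed to loadAudio unchanged.

// loadAudio reads a whole audio file from any fs.FS and returns an
// in-memory reader that a decoder can consume on desktop, mobile, or web.
func loadAudio(fsys fs.FS, name string) (*bytes.Reader, error) {
	data, err := fs.ReadFile(fsys, name)
	if err != nil {
		return nil, fmt.Errorf("load %s: %w", name, err)
	}
	return bytes.NewReader(data), nil
}

// sampleFS is a stand-in for the embedded file system (assumption: the
// 12 placeholder bytes are not real audio data).
func sampleFS() fs.FS {
	return fstest.MapFS{
		"sounds/beep.wav": &fstest.MapFile{Data: []byte("RIFF....WAVE")},
	}
}

func main() {
	r, err := loadAudio(sampleFS(), "sounds/beep.wav")
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes ready for decoding:", r.Len())
}
```

The returned reader would then be handed to whichever decoder you already use. Going through the embedded file system rather than os.Open is what avoids depending on a local disk path that does not exist on mobile or in the browser.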

oderwat commented 1 week ago

@kkoreilly I am going to test that.

I would need it for Windows and Mac executables with native compiled Go. For web and mobile we use other frameworks very successfully (see below).

The audio in this case comes from an HTTP API (TTS model), and the "player" control is a mixture of automatic and user control. I currently have a Fyne application, but I can't get it to run on Windows at the moment, which is most likely a stupid error on my part, though.

In other projects I/we use "oto" for audio in WASM (Go-App) in the browser and in CLIs on Windows and Darwin.

WASM gets big, and with packages like "http" (client) it gets even larger. This is why we use quite a few special implementations with our WASM targets instead. There are additional problems like CORS with HTTP endpoints in the browser. This is why we use a single NATS connection and build all server/client communication on it where possible (https://github.com/oderwat/go-nats-app). We are not planning to use CogentCore for the browser, but this may change with time.

gedw99 commented 1 week ago

That’s interesting in relation to golang and WASM running in the browser, because you can use the browser's own HTTP engine instead to avoid bloating the WASM size.

That’s what Golang WASM on Cloudflare does too, in my case. See: https://github.com/syumai/workers

Needs compiler tags then?

oderwat commented 1 week ago

@gedw99 You need compiler tags for any WASM

gedw99 commented 1 week ago

> @gedw99 You need compiler tags for any WASM

I’m talking about conditional compilation, which in golang is done with build tags.

The code for HTTP needs to be under a build tag, same as for the fs API and basically anything else that uses the browser instead of the OS.
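For readers unfamiliar with build tags, here is a minimal sketch of the native half of such a split. The file names (fetch_native.go, fetch_js.go) and the newGet helper are hypothetical. The idea is that this file, which imports net/http, is only compiled when the target is not js/wasm, while a sibling file with the opposite constraint would provide the same function on top of the browser.

```go
//go:build !(js && wasm)

// Native variant (hypothetical file name: fetch_native.go).
// A sibling file such as fetch_js.go would start with
//
//	//go:build js && wasm
//
// and define the same newGet-style helper on top of syscall/js, calling
// the browser's own fetch API so that net/http is never linked into the
// WASM binary.
package main

import (
	"fmt"
	"net/http"
)

// newGet builds a GET request using the OS-level net/http stack.
func newGet(url string) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, url, nil)
}

func main() {
	req, err := newGet("https://example.com/audio.wav")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Host)
}
```

The constraint line has to come before the package clause and be followed by a blank line. With GOOS=js GOARCH=wasm this file is skipped entirely, so only the browser-backed variant ends up in the binary.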

oderwat commented 1 week ago

@gedw99 I don't know what to answer. Thank you for trying to help.