godotengine / godot-proposals

Godot Improvement Proposals (GIPs)

Expose AudioStreamPreview to GDScript #6466

Open · KvaGram opened this issue 1 year ago

KvaGram commented 1 year ago

Describe the project you are working on

I am working on a plugin that adds voiceovers to a text-based dialog system. One audio file may contain the audio for multiple dialog lines, so the plugin includes an indexed list of start and stop times marking when the audio should play for each line of dialog.
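
As a minimal sketch of what such an index could look like (illustrative only; the plugin's actual format may differ):

# Illustrative only: one entry per dialog line, times in seconds.
var line_times: Array[Dictionary] = [
    {"start": 0.0, "stop": 2.4}, # dialog line 0
    {"start": 2.4, "stop": 5.1}, # dialog line 1
    {"start": 5.1, "stop": 9.8}, # dialog line 2
]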

It also includes an in-engine editor to define those start and stop times. Herein lies the problem.

Describe the problem or limitation you are having in your project

I lack a good way to visualize where the start and stop times are. For this, I am working on a timeline. However, since I am unable to access the engine's audio waveform preview feature, I either have to make my own (which I have so far failed to do) or do without. Here is a crude drawing of the editor. The empty space is for further features I am still planning out.

[Mock-up image: voicedataUI]

Up to now, the predecessor/prototype of this editor was dependent on either manual input with a lot of guesswork, or on importing data from an external tool like Audacity. With the timeline implemented, edits would be much easier to visualize, making the plugin free of any dependency on external tools.

Do note that I am aware proposal #127 exists and seems to ask for the same thing. I was advised to make my own proposal, given how old and imprecise that one is.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Expose/export/map (imagine I used the correct terminology) the AudioStreamPreview class so it can be used from GDScript, and add a draw function to it, based on the existing draw functions in the editor.

The AudioStreamPreview class is used and drawn in, among other places:

AudioStreamImportSettings::_draw_preview() in https://github.com/godotengine/godot/blob/master/editor/import/audio_stream_import_settings.cpp
EditorAudioStreamPicker::_preview_draw() in https://github.com/godotengine/godot/blob/master/editor/editor_resource_picker.cpp

The fix would be to implement a function that draws a waveform to a Texture2D, in a way similar to the implementations above.
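
As a rough sketch (not a committed API), such a function could be built on the peak queries the C++ class already has, get_min(), get_max() and get_length(), assuming those were exposed unchanged:

# Sketch only: assumes AudioStreamPreview.get_length()/get_min()/get_max()
# are exposed from the existing C++ class; the rest is illustrative.
func draw_waveform(preview: AudioStreamPreview, width: int, height: int) -> Texture2D:
    var img := Image.create(width, height, false, Image.FORMAT_L8)
    img.fill(Color.BLACK)
    var length := preview.get_length()
    for x in width:
        # The slice of time covered by this pixel column.
        var t0 := x * length / width
        var t1 := (x + 1) * length / width
        # get_min()/get_max() return peaks in roughly [-1, 1]; map them to pixel rows.
        var lo := int((preview.get_min(t0, t1) * 0.5 + 0.5) * height)
        var hi := int((preview.get_max(t0, t1) * 0.5 + 0.5) * height)
        img.fill_rect(Rect2i(x, height - hi, 1, max(1, hi - lo)), Color.WHITE)
    return ImageTexture.create_from_image(img)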

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

# Given this example:
var stream: AudioStream = load("res://audio/example.mp3")
var timeline_texture_node: TextureRect = %timeline_texture

# - In the primary use case, the user may wish to draw a simple waveform texture like the engine does, for use as a quick and easy preview.

# The preview data is first fetched.
var audio_preview: AudioStreamPreview = stream.get_preview()
# Then the user may generate a preview as a Texture2D of a given size (width, height).
var waveform_texture: Texture2D = audio_preview.draw_waveform(600, 200)
# The texture may then be used with any compatible node, like a TextureRect or a Sprite2D.
timeline_texture_node.texture = waveform_texture

# - In a secondary use case, the user may only want the preview data...
var raw_preview: AudioStreamPreview = stream.get_preview()
# ...either to make their own waveform graphic using their own implementation:
var custom_texture: Texture2D = MySingletonUtility.draw_waveform(raw_preview, other_arg, other_arg2, etc_arg)
# ...or to do something crazy, like using the preview's data as the base for a terrain heightmap (just a random idea as to one of the possibilities).

If this enhancement will not be used often, can it be worked around with a few lines of script?

No. For more on that, see below.

Is there a reason why this should be core and not an add-on in the asset library?

While I have learned it is possible to implement this myself as an editor module, the nature of the module system makes it unusable for a plugin: I cannot ask every user of the plugin to compile the engine themselves just to be compatible with it. Beyond that, the generation of the audio preview waveform data is too complex to re-implement in GDScript.

AThousandShips commented 1 year ago

See #6385

KvaGram commented 1 year ago

Well, that was quick, @AThousandShips. Still, do you believe this proposal should be treated the same?

AThousandShips commented 1 year ago

I'm not sure, I just wanted to point out the other case of this being discussed, and especially:

> The editor's audio visualizer has a very specific goal – it's purely functional, with no artistic intent.

AThousandShips commented 1 year ago

For just the editor you can use EditorResourcePreview. It looks relatively straightforward from a quick glance; try it out and see how it works for you.
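
A minimal sketch of the shape of that API, assuming you have an EditorPlugin at hand (untested; the callback name is up to you):

@tool
extends EditorPlugin

# Queue an editor preview for an audio stream.
func request_preview(stream: AudioStream) -> void:
    var previewer := get_editor_interface().get_resource_previewer()
    previewer.queue_edited_resource_preview(stream, self, "_on_preview_done", null)

# Called by the previewer; preview may be null if none could be generated.
func _on_preview_done(path: String, preview: Texture2D, thumbnail: Texture2D, userdata: Variant) -> void:
    if preview:
        pass # e.g. assign it to a TextureRect.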

KvaGram commented 1 year ago

> For just the editor you can use EditorResourcePreview. It looks relatively straightforward from a quick glance; try it out and see how it works for you.

Good suggestion, and I have attempted this.

However, my plugin builds on another plugin and does not have direct access to an instance of an EditorPlugin, which is required to access EditorResourcePreview, and the reference in said parent plugin's EditorPlugin seems to be glitched beyond usefulness. This renders me unable to use that feature. Good suggestion, though; I only wish I could test it.

I have, however, finally found one workaround. It is clumsy and bulky, and it plays the sound aloud unless messy tricks with the audio buses are also applied.

Here is my current test code, where %player is an AudioStreamPlayer with a preloaded AudioStream, and %timeline_texture is a TextureRect.

extends MarginContainer

var testbus: int
var drawing: bool = false
var image: Image
var _last_t: float # last processed playback position, in seconds
var _pixels_per_second: float # horizontal scale of the image

# Called when the node enters the scene tree for the first time.
func _ready():
    draw_placeholder(%player.stream.get_length(), 1000)
    #drawing = true
    play(0)

# Called every frame. 'delta' is the elapsed time since the previous frame.
func _process(delta):
    if %player.playing:
        # Peak volume of the left and right channels, scaled to pixels.
        var l: int = int(100 * db_to_linear(AudioServer.get_bus_peak_volume_left_db(0, 0)))
        var r: int = int(100 * db_to_linear(AudioServer.get_bus_peak_volume_right_db(0, 0)))
        var p: float = %player.get_playback_position()
        #print("%s - %s" % [p, p * _pixels_per_second])
        # Width in pixels covered since the last processed position.
        var length: int = int(max(1.0, (p - _last_t) * _pixels_per_second))
        var xpos: int = int(_last_t * _pixels_per_second)
        # Clear the placeholder, then draw the left and right channel bars.
        image.fill_rect(Rect2i(xpos, 0, length, 200), Color.BLACK)
        image.fill_rect(Rect2i(xpos, 100 - l, length, l), Color.WHITE)
        image.fill_rect(Rect2i(xpos, 100, length, r), Color.WHITE)
        _last_t = p
        %timeline_texture.texture = ImageTexture.create_from_image(image)

func play(time: float):
    _last_t = time
    %player.play(time)

# Draws a placeholder graphic for time_length seconds of audio: a dot at every
# second/minute/hour step, whichever keeps the dots at least 10 pixels apart.
func draw_placeholder(time_length: float, pixel_width: int):
    _pixels_per_second = pixel_width / time_length
    print(_pixels_per_second)
    var stepsize: int = 1
    while stepsize * _pixels_per_second < 10:
        stepsize = stepsize * 60
    print("creating image sized %s" % [pixel_width])
    # Create the image with an 8-bit grayscale format.
    image = Image.create(pixel_width, 200, false, Image.FORMAT_L8)
    image.fill(Color.BLACK)
    var step: int = int(round(stepsize * _pixels_per_second))
    var p: int = step
    while p < pixel_width:
        image.fill_rect(Rect2i(p - 5, 95, 10, 10), Color.WHITE)
        p += step
        #print("pixel = %s" % [p])
    %timeline_texture.texture = ImageTexture.create_from_image(image)

func _on_audio_stop():
    drawing = false
    %timeline_texture.texture = ImageTexture.create_from_image(image)

An image is created in draw_placeholder, and a placeholder texture is then drawn on it. In _process, while the audio is playing, we fetch the dB volume of the left and right channels and calculate the size and location of three rectangles to draw: the background (clearing the placeholder), the left channel, and the right channel. Setting the texture in _process seemed to have little to no impact on performance, but it is also set in _on_audio_stop (connected to the finished signal).

This allows rendering a preview waveform at runtime, and presumably also in-editor (given an added @tool annotation). This configuration is however inconvenient, as it requires the audio to actually play audibly for the user.

Some tricks with the audio buses could mute that, but that would lock the user into a possibly undesired setup.
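
For example, something along these lines (untested; whether the peak-volume metering still reports useful values on a muted bus would need verifying):

# Sketch: route the player to a dedicated, muted bus so the preview pass stays silent.
# The bus name is illustrative.
var bus_idx := AudioServer.bus_count
AudioServer.add_bus(bus_idx)
AudioServer.set_bus_name(bus_idx, "WaveformPreview")
AudioServer.set_bus_mute(bus_idx, true)
%player.bus = "WaveformPreview"
# _process would then read peaks from bus_idx instead of bus 0:
# AudioServer.get_bus_peak_volume_left_db(bus_idx, 0)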

AThousandShips commented 1 year ago

Using AudioStreamPlayback to generate a waveform is incredibly straightforward; I managed to do so in a couple of lines of code. However, the required functions are not exposed to GDScript, and I'm not sure if there's a specific reason for that. If not, exposing them would be the way to go here: essentially using the same method of extracting the waveform as AudioStreamPreview, but available in non-editor builds and fully customizable as well.
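
Purely to illustrate the idea (these playback calls are not currently available from GDScript; the names mirror the C++ side and the return type is a guess):

var playback := stream.instantiate_playback() # this part is exposed today
# Hypothetical: start the playback and mix 1024 stereo frames at normal rate.
playback.start(0.0)
var frames := playback.mix(1.0, 1024) # hypothetically a PackedVector2Array
var peak := 0.0
for f in frames:
    peak = max(peak, abs(f.x), abs(f.y))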

I will look into exposing them, and whether it's something that's supported.