[Proposal] Generate mapping file of slices and original audio files

Nukepayload2 commented 2 months ago

Summary

As a video editor, I want to slice videos based on their audio track so that I can remove the silent parts from my live playbacks. If this tool could produce a file that maps each audio slice to its time span in the original audio, I could use that mapping file to cut the silent parts out of my videos automatically with video processing tools such as ffmpeg. I could also run speech-to-text tools on the audio slices to filter them, then keep only the matching spans of my videos based on the mapping file.

File format

The following file maps time spans of two original audio files to five audio slices. The output path of the mapping file must be specified in the GUI before slicing.

{
  "outputFolder": "C:\\Users\\UserName\\Videos\\PlaybackSlices",
  "tasks": [
    {
      "originalFile": "C:\\Users\\UserName\\Videos\\Playback202405030234.wav",
      "slices": [
        {
          "start": 0,
          "end": 1780,
          "file": "Playback202405030234_0.wav"
        },
        {
          "start": 1780,
          "end": 2460,
          "file": "Playback202405030234_1.wav"
        }
      ]
    },
    {
      "originalFile": "C:\\Users\\UserName\\Videos\\Playback202405040330.wav",
      "slices": [
        {
          "start": 0,
          "end": 2790,
          "file": "Playback202405040330_0.wav"
        },
        {
          "start": 3150,
          "end": 9460,
          "file": "Playback202405040330_1.wav"
        },
        {
          "start": 12800,
          "end": 14690,
          "file": "Playback202405040330_2.wav"
        }
      ]
    }
  ]
}
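To illustrate how such a mapping file might be consumed, here is a minimal Python sketch that turns it into ffmpeg command lines. It rests on several assumptions that are not part of the proposal: `start` and `end` are in milliseconds, each video sits next to its original audio file with the same base name and an `.mp4` extension, and the function name `build_ffmpeg_commands` is hypothetical. It only builds the argument lists and does not run ffmpeg.

```python
import json
from pathlib import PureWindowsPath


def build_ffmpeg_commands(mapping, video_ext=".mp4"):
    """Build one ffmpeg argument list per slice in the mapping.

    Assumptions (not from the proposal): times are in milliseconds and the
    video shares the original audio's path, differing only in extension.
    """
    commands = []
    output_folder = PureWindowsPath(mapping["outputFolder"])
    for task in mapping["tasks"]:
        video = PureWindowsPath(task["originalFile"]).with_suffix(video_ext)
        for piece in task["slices"]:
            start_s = piece["start"] / 1000
            duration_s = (piece["end"] - piece["start"]) / 1000
            out_name = PureWindowsPath(piece["file"]).with_suffix(video_ext)
            commands.append([
                "ffmpeg",
                "-ss", f"{start_s:.3f}",    # seek to the slice start (seconds)
                "-t", f"{duration_s:.3f}",  # read only the slice duration
                "-i", str(video),
                "-c", "copy",               # stream copy, no re-encoding
                str(output_folder / out_name),
            ])
    return commands


def load_mapping(path):
    """Read a mapping file in the JSON layout shown above."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Each returned list can be passed to `subprocess.run`. Note that with `-c copy` ffmpeg cuts on keyframes, so slice boundaries may be slightly off; re-encoding instead of stream copying is slower but gives more exact cuts.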

Note

I'm not sure whether I can implement this feature by myself, because I'm new to Python. If I manage to implement it, I'll open a pull request.

Nukepayload2 commented 2 months ago

I've updated the UI and finished the JSON export on my fork. However, the time spans are not in milliseconds yet. Their unit is actually seconds multiplied by the sample rate, that is, a sample index. Once the unit conversion is done, I'll create a pull request.
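The unit conversion described here could look roughly like the following sketch. The function name `samples_to_ms` is hypothetical and not taken from the fork; it assumes the span value is a sample index and that the file's sample rate in Hz is known.

```python
def samples_to_ms(sample_index, sample_rate):
    """Convert a sample index to milliseconds, rounded to the nearest integer.

    Assumption: sample_index equals seconds * sample_rate, as described above.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return round(sample_index * 1000 / sample_rate)
```

For example, at a sample rate of 44100 Hz, a sample index of 78498 corresponds to 1780 ms.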

flutydeer commented 2 months ago

If you want to obtain time stamps, see https://github.com/flutydeer/audio-slicer/discussions/18#discussioncomment-7452464

Nukepayload2 commented 2 months ago

@flutydeer Thanks for letting me know about OpenVPI's dataset-tools. However, that tool uses audio frames instead of milliseconds as its time unit, which is inconvenient for ffmpeg. I've finished the time unit conversion on my fork, and the output JSON now uses milliseconds. If you're interested in the time stamps feature, I can create a pull request.
