jsk-ros-pkg / jsk_common

common programs for jsk-ros-pkg

Add sample audio_recorder #1791

Closed Kanazawanaoaki closed 1 year ago

Kanazawanaoaki commented 1 year ago

If all you need is an audio file, capture_to_file.launch in audio_capture works well. However, capture_to_file.launch does not publish the audio topic, so publishing the topic and saving the audio to a file in real time at the same time requires a separate program that subscribes to the audio topic and writes it to a file.

The people at https://stackoverflow.com/questions/63145510/how-to-write-ros-audiodata-message-into-wav-file are also looking for it, and there doesn't seem to be one, so I wrote a simple sample.
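
To illustrate the idea (this is not the code in this PR), here is a minimal sketch of such a recorder. It assumes the topic carries raw 16-bit PCM, i.e. audio_capture was launched with the wave format rather than mp3, and that the sample rate and channel count below match the capture settings. The class name `WavTopicWriter`, the node name, the output path, and the topic name `audio` are hypothetical.

```python
import wave


class WavTopicWriter:
    """Append raw PCM chunks (e.g. AudioData.data payloads) to a WAV file."""

    def __init__(self, path, sample_rate=16000, channels=1, sample_width=2):
        # sample_rate, channels and sample_width are assumptions; they must
        # match the parameters the audio topic was captured with.
        self._wav = wave.open(path, "wb")
        self._wav.setnchannels(channels)
        self._wav.setsampwidth(sample_width)
        self._wav.setframerate(sample_rate)

    def on_audio(self, msg):
        # msg is expected to expose the raw audio bytes as msg.data.
        self._wav.writeframes(bytes(msg.data))

    def close(self):
        # Closing the file lets the wave module finalize the header sizes.
        self._wav.close()


def main():
    # Hypothetical node wiring; needs a ROS 1 environment with rospy and
    # audio_common_msgs installed. Not executed on import.
    import rospy
    from audio_common_msgs.msg import AudioData

    rospy.init_node("audio_recorder_sketch")
    writer = WavTopicWriter("/tmp/output.wav")
    rospy.Subscriber("audio", AudioData, writer.on_audio)
    rospy.on_shutdown(writer.close)
    rospy.spin()
```

Calling `main()` inside a ROS environment would record until the node is shut down. If the topic carries mp3 data instead, the payload is not raw PCM and this sketch does not apply.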

If there is demand and we want to do this properly, maybe we should learn GStreamer and split audio_video_recorder into audio-only and video-only recorders (few programs seem able to turn a ROS topic into a video with correct timing), or build it by pulling in part of jsk_rosbag_tools as a function.

knorth55 commented 1 year ago

audio_capture supports filesink, so it can save the audio data directly. Is this what you need? What do you want to do? Can you describe it in a diagram using mermaid?

https://github.com/ros-drivers/audio_common/blob/836aa62522764ee7b8e4925b87ec7d84cfdd552c/audio_capture/src/audio_capture.cpp#L61-L74

mermaid: https://dev.classmethod.jp/articles/github-mermaid-markdown-cntrol/

Kanazawanaoaki commented 1 year ago

Thank you very much! As it turns out, what I needed was to use audio_play's filesink.

https://github.com/ros-drivers/audio_common/blob/836aa62522764ee7b8e4925b87ec7d84cfdd552c/audio_play/src/audio_play.cpp#L74-L79

roslaunch audio_play play.launch dst:=/tmp/output_audio_play.mp3 format:=wave

There are two situations where I think this feature is needed: one is converting a robot's already-published audio topic to a wav file and feeding it into an audio recognition model that loads wav files (like https://stackoverflow.com/questions/63145510/how-to-write-ros-audiodata-message-into-wav-file using respeaker ROS),

sequenceDiagram
    participant R as Robot
    participant A as Audio Recorder
    participant M as Audio Recognition Model

    R->>A: audio ros topic
    A->>M: audio wave file

and the other is saving the topic from audio_capture to a rosbag as a log while also saving it to a wav file and feeding that file into the model.

sequenceDiagram
    participant C as audio capture
    participant A as Audio Recorder
    participant M as Audio Recognition Model
    participant R as rosbag

    C->>A: audio ros topic
    A->>M: audio wave file
    C->>R: audio ros topic
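
For the last step in both diagrams, a model that loads the wav file would typically read it back as normalized samples. A minimal sketch of that loading step, assuming a 16-bit PCM wav file (the function name `load_wav_as_floats` is hypothetical, and the model itself is out of scope here):

```python
import array
import sys
import wave


def load_wav_as_floats(path):
    """Return (sample_rate, samples) with samples scaled to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Signed 16-bit integers; multi-channel audio stays interleaved.
    samples = array.array("h", raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return sample_rate, [s / 32768.0 for s in samples]
```
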

knorth55 commented 1 year ago

You mean like these? If so, it seems audio_play will solve the problem, but the feature is a bit hard to find. You can add the node directly to audio_common if you want.

flowchart TD
  Robot --> |audio ros topic| audio_recorder
  audio_recorder -->|audio wave file| audio_recognition_model("Audio recognition model")

flowchart TD
  audio_capture --> |audio ros topic| audio_recorder
  audio_recorder -->|audio wave file| audio_recognition_model("Audio recognition model")
  audio_capture --> |audio ros topic| rosbag

Kanazawanaoaki commented 1 year ago

That's right. I don't know if this is necessary or the right solution, but I sent a PR to add record_to_file.launch to audio_common anyway. https://github.com/ros-drivers/audio_common/pull/238

The problem is solved, so I'm closing this PR.