ababo closed this issue 5 months ago
The first three packets of a Vorbis stream are the header packets. That extra data is probably in those. lewton doesn't expose all of this data in its public interface (and I don't think it should).
Doesn't work that way, unfortunately:
use crate::server::{middleware::Auth, Server};
use axum::{
    extract::{
        ws::{Message, WebSocket},
        State, WebSocketUpgrade,
    },
    http::{header::CONTENT_TYPE, HeaderMap, HeaderValue, StatusCode},
    response::{IntoResponse, Response},
};
use futures::{StreamExt, TryStreamExt};
use log::debug;
use ogg::reading::async_api::PacketReader;
use std::{
    io::{Error as IoError, ErrorKind as IoErrorKind},
    sync::Arc,
};
use symphonia::{
    core::{
        codecs::{CodecParameters, Decoder, DecoderOptions, CODEC_TYPE_VORBIS},
        formats::Packet as SymphoniaPacket,
    },
    default::codecs::VorbisDecoder,
};

const VORBIS_CONTENT_TYPE: &str = "audio/ogg; codecs=vorbis";

/// Handle transcribe requests.
pub async fn handle_transcribe(
    State(server): State<Arc<Server>>,
    _auth: Auth,
    headers: HeaderMap,
    ws: WebSocketUpgrade,
) -> Response {
    debug!("received transcribe request");
    if headers.get(CONTENT_TYPE) != Some(&HeaderValue::from_static(VORBIS_CONTENT_TYPE)) {
        debug!("rejected to transcribe due to unsupported content type");
        return (StatusCode::BAD_REQUEST, "unsupported content type").into_response();
    }
    ws.on_upgrade(move |s| ws_callback(server, s))
}

async fn ws_callback(_server: Arc<Server>, socket: WebSocket) {
    let (mut _sender, receiver) = socket.split();
    let data_reader = Box::pin(receiver.into_stream().filter_map(|msg| async {
        match msg {
            Ok(Message::Binary(data)) => Some(Ok(data)),
            Ok(_) => None,
            Err(err) => Some(Err(IoError::new(IoErrorKind::Other, err))),
        }
    }))
    .into_async_read();
    let mut packet_reader = PacketReader::new_compat(data_reader);
    let mut extra_data = Vec::new();
    let mut decoder = None;
    let mut packet_index = 0;
    while let Some(Ok(mut packet)) = packet_reader.next().await {
        match packet_index {
            0..=1 => extra_data.append(&mut packet.data),
            2 => {
                extra_data.append(&mut packet.data);
                let mut codec_params = CodecParameters::new();
                codec_params.for_codec(CODEC_TYPE_VORBIS);
                codec_params.with_extra_data(extra_data.clone().into_boxed_slice());
                let decoder_opts = DecoderOptions::default();
                decoder = match VorbisDecoder::try_new(&codec_params, &decoder_opts) {
                    Ok(decoder) => Some(decoder),
                    Err(err) => {
                        debug!("failed to create vorbis decoder: {err:#}");
                        return;
                    }
                };
            }
            _ => {
                let packet =
                    SymphoniaPacket::new_from_boxed_slice(0, 0, 0, packet.data.into_boxed_slice());
                let buf = match decoder.as_mut().unwrap().decode(&packet) {
                    Ok(buf) => buf,
                    Err(err) => {
                        debug!("failed to decode packet: {err:#}");
                        return;
                    }
                };
                debug!("num decoded samples {}", buf.frames());
            }
        }
        packet_index += 1;
    }
}
I get:
failed to create vorbis decoder: malformed stream: vorbis: invalid packet type for setup header
After comparing this with the extra data of a decoder that symphonia created itself,
I see that symphonia's extra data is smaller than the first three packets concatenated (the diff shows more than 100 bytes removed).
If I replace the extra data with the reduced version, it decodes subsequent packets properly. So it looks like I only need to extract the proper portions of the first three packets.
I think you need to decode the first and third headers and leave the second header out (comment header). Ultimately, it's a documentation question of the symphonia crate though so I'm closing this.
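A minimal sketch of that suggestion, using only the standard library. It relies on the Vorbis header layout (each header packet starts with a type byte, 1 for identification, 3 for comment, 5 for setup, followed by the ASCII bytes "vorbis"). The function names are hypothetical. It also assumes, based only on the observation in this thread, that symphonia's VorbisDecoder wants the identification and setup headers concatenated with the comment header dropped; that should be checked against symphonia's source before relying on it.

```rust
/// Returns true if `packet` starts with the given Vorbis header type byte
/// followed by the ASCII magic "vorbis".
fn is_vorbis_header(packet: &[u8], header_type: u8) -> bool {
    packet.len() >= 7 && packet[0] == header_type && &packet[1..7] == b"vorbis"
}

/// Hypothetical helper: builds decoder extra data from the three Vorbis
/// header packets by keeping the identification (first) and setup (third)
/// headers and skipping the comment (second) header.
///
/// Assumption: the target decoder expects exactly `ident ++ setup`.
/// Returns `None` if the packets are not the three expected headers.
pub fn vorbis_extra_data(headers: &[Vec<u8>]) -> Option<Vec<u8>> {
    if headers.len() != 3 {
        return None;
    }
    let (ident, comment, setup) = (&headers[0], &headers[1], &headers[2]);
    if !is_vorbis_header(ident, 1)
        || !is_vorbis_header(comment, 3)
        || !is_vorbis_header(setup, 5)
    {
        return None;
    }
    let mut extra = Vec::with_capacity(ident.len() + setup.len());
    extra.extend_from_slice(ident);
    extra.extend_from_slice(setup);
    Some(extra)
}
```

In the loop above, this would mean collecting the data of packets 0, 1 and 2 into a Vec<Vec<u8>> instead of appending them all to one buffer, then passing the result of vorbis_extra_data to with_extra_data.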
Thanks!
Could you point to an example?
It looks like the vorbis decoder needs extra metadata that PacketReader doesn't provide. For instance, when I try to instantiate VorbisDecoder from symphonia I'm required to set extra data in the corresponding CodecParameters. And currently I don't see any way to reuse lewton to decode packets from PacketReader.