Open ict-ryahata opened 3 years ago
I know this issue is on the older side, but I am seeing the same behavior. I am decoding an Ogg file using NVorbis, and when I re-encode it, some of the data from the beginning of the audio is missing.
Sorry for the late reply (just started a new job).
I help maintain this project, but unfortunately low-level encoding logic isn't my specialty. When I get some free time I'll at least try to reproduce it, and then we'll see from there.
To make things easy, here is a simple wrapper. It doesn't change the sample rate.
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using OggVorbisEncoder; // from the OggVorbisEncoder NuGet package

public class VorbisEncoder
{
    private readonly int _sampleRate;
    private readonly int _bits;
    private readonly int _channels;
    private readonly int _sampleSize;  // bytes per sample
    private readonly int _sampleFrame; // bytes per frame (one sample for every channel)
    private readonly Stream _outputStream;
    private readonly OggStream _oggStream;
    private readonly ProcessingState _processingState;

    public VorbisEncoder(int sampleRate, int bits, int channels, Stream outputStream)
    {
        if (outputStream == null)
            throw new ArgumentNullException(nameof(outputStream));
        if (sampleRate <= 0)
            throw new ArgumentOutOfRangeException(nameof(sampleRate), "Sample rate should be greater than zero");
        if (bits != 8 && bits != 16)
            throw new ArgumentOutOfRangeException(nameof(bits), "Expected bits value is 8 or 16");
        if (channels <= 0)
            throw new ArgumentOutOfRangeException(nameof(channels), "Channels should be 1 or more");

        // Stores all the static vorbis bitstream settings
        var info = VorbisInfo.InitVariableBitRate(channels, sampleRate, 0.5f);

        // set up our packet->stream encoder
        var serial = new Random().Next();
        _oggStream = new OggStream(serial);

        // =========================================================
        // HEADER
        // =========================================================
        // Vorbis streams begin with three headers; the initial header (with
        // most of the codec setup parameters) which is mandated by the Ogg
        // bitstream spec. The second header holds any comment fields. The
        // third header holds the bitstream codebook.
        var comments = new Comments();
        comments.AddTag("ARTIST", "TTS");

        var infoPacket = HeaderPacketBuilder.BuildInfoPacket(info);
        var commentsPacket = HeaderPacketBuilder.BuildCommentsPacket(comments);
        var booksPacket = HeaderPacketBuilder.BuildBooksPacket(info);

        _oggStream.PacketIn(infoPacket);
        _oggStream.PacketIn(commentsPacket);
        _oggStream.PacketIn(booksPacket);

        // =========================================================
        // BODY (Audio Data)
        // =========================================================
        _processingState = ProcessingState.Create(info);

        _sampleRate = sampleRate;
        _bits = bits;
        _channels = channels;
        _sampleSize = _bits / 8;
        _sampleFrame = _sampleSize * _channels;
        _outputStream = outputStream;
    }

    // Converts interleaved PCM bytes to per-channel floats and hands them to the encoder.
    public void Encode(byte[] buffer, int index, int length)
    {
        int samples = length / _sampleFrame;
        float[][] outSamples = new float[_channels][];
        for (int ch = 0; ch < _channels; ch++)
            outSamples[ch] = new float[samples];

        if (_bits == 8)
        {
            for (int sampleNumber = 0; sampleNumber < samples; sampleNumber++)
            {
                int frameStart = index + sampleNumber * _sampleFrame;
                for (int ch = 0; ch < _channels; ch++)
                {
                    // Offset from the frame start; accumulating into one index breaks for 3+ channels.
                    int readIndex = frameStart + ch * _sampleSize;
                    // Assumes unsigned 8-bit PCM as stored in WAV files (128 = silence).
                    outSamples[ch][sampleNumber] = (buffer[readIndex] - 128) / 128f;
                }
            }
        }
        else
        {
            for (int sampleNumber = 0; sampleNumber < samples; sampleNumber++)
            {
                int frameStart = index + sampleNumber * _sampleFrame;
                for (int ch = 0; ch < _channels; ch++)
                {
                    int readIndex = frameStart + ch * _sampleSize;
                    // 16-bit little-endian signed PCM
                    outSamples[ch][sampleNumber] = (short)(buffer[readIndex + 1] << 8 | buffer[readIndex]) / 32768f;
                }
            }
        }

        _processingState.WriteData(outSamples, samples, 0);

        while (!_oggStream.Finished && _processingState.PacketOut(out OggPacket packet))
            _oggStream.PacketIn(packet);
    }

    // Writes any completed Ogg pages to the output stream.
    public void Flush(bool force = true)
    {
        while (_oggStream.PageOut(out OggPage page, force))
        {
            _outputStream.Write(page.Header, 0, page.Header.Length);
            _outputStream.Write(page.Body, 0, page.Body.Length);
        }
    }

    public async Task FlushAsync(bool force = true, CancellationToken cancellationToken = default)
    {
        while (_oggStream.PageOut(out OggPage page, force))
        {
            await _outputStream.WriteAsync(page.Header, 0, page.Header.Length, cancellationToken);
            await _outputStream.WriteAsync(page.Body, 0, page.Body.Length, cancellationToken);
        }
    }
}
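The 16-bit conversion that Encode performs can be exercised on its own, without the encoder, to rule out the PCM-to-float step as the source of lost samples. This is only a minimal sketch: `PcmDeinterleaver` and `ToFloatChannels` are hypothetical names (not part of OggVorbisEncoder), and the input is assumed to be interleaved 16-bit little-endian signed PCM.

```csharp
using System;

public static class PcmDeinterleaver
{
    // Hypothetical helper: splits interleaved 16-bit little-endian PCM into
    // one float array per channel, scaled to the range [-1, 1).
    public static float[][] ToFloatChannels(byte[] buffer, int index, int length, int channels)
    {
        const int bytesPerSample = 2;
        int frameSize = bytesPerSample * channels;
        int frames = length / frameSize; // a trailing partial frame is ignored

        var result = new float[channels][];
        for (int ch = 0; ch < channels; ch++)
            result[ch] = new float[frames];

        for (int frame = 0; frame < frames; frame++)
        {
            int frameStart = index + frame * frameSize;
            for (int ch = 0; ch < channels; ch++)
            {
                // Offset is computed from the start of the frame for every channel,
                // so it stays correct for three or more channels.
                int i = frameStart + ch * bytesPerSample;
                short sample = (short)(buffer[i] | (buffer[i + 1] << 8));
                result[ch][frame] = sample / 32768f;
            }
        }

        return result;
    }
}
```

Every input frame yields exactly one output sample per channel, so if the arrays returned here have the expected length, any shortfall in the encoded file comes from a later stage.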
Usually in those cases you are missing a flush somewhere, probably before the end.
Are you sure those missing bytes are beginning bytes and not ending bytes?
The top shows the source file (FLAC), the middle shows the conversion by libvorbis, and the bottom shows the conversion result by .NET-Ogg-Vorbis-Encoder. Only the .NET-Ogg-Vorbis-Encoder output was shifted forward by 23 milliseconds.
At the end, it finishes 23 milliseconds early. I believe that data at the beginning is being lost.
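One way to check numerically whether the missing audio sits at the start is to decode both files back to PCM and find the shift that best aligns them. The following is a rough brute-force sketch, not a tested diagnostic for this library: `OffsetFinder` and `FindShift` are hypothetical names, and both inputs are assumed to be mono float samples at the same sample rate. A pure sine wave is a poor test signal here because every period aligns equally well, so non-periodic material works better.

```csharp
using System;

public static class OffsetFinder
{
    // Returns the shift (in samples) for which test[n] best matches
    // reference[n + shift]. A positive result means the test signal is
    // missing that many samples at its start; a negative result means
    // the test signal is delayed.
    public static int FindShift(float[] reference, float[] test, int maxShift)
    {
        int bestShift = 0;
        double bestScore = double.NegativeInfinity;

        for (int shift = -maxShift; shift <= maxShift; shift++)
        {
            // Unnormalized cross-correlation over the overlapping region.
            double score = 0;
            for (int n = 0; n < test.Length; n++)
            {
                int r = n + shift;
                if (r < 0 || r >= reference.Length)
                    continue;
                score += reference[r] * (double)test[n];
            }

            if (score > bestScore)
            {
                bestScore = score;
                bestShift = shift;
            }
        }

        return bestShift;
    }
}
```

The cost grows with the signal length times `maxShift`, so comparing only the first second or two of audio is usually enough.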
I'm currently trying to use this library to encode raw PCM audio.
When comparing the encoded Ogg file to the uncompressed WAV file, I noticed that their durations (not sizes) differ: the Ogg file is shorter. Both files have the same sample rate.
At first I was worried that I was not using the library correctly, even though I pretty much mirrored the example script linked on the GitHub main page for this repository. However, I ran the code from that example that generates a sound file with a sine wave of a specified amplitude and noticed that the encoded Ogg files do not match the duration I passed in (supposed to be 3 seconds, but the actual encoded file is approximately 2.972 seconds, missing 1228 samples).
Is this expected behavior?
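The figures in this report are consistent with each other if the sample rate is 44.1 kHz. That rate is an assumption, since the post does not state it:

$$
\frac{1228\ \text{samples}}{44100\ \text{samples/s}} \approx 0.0278\ \text{s},
\qquad
3\ \text{s} - 0.0278\ \text{s} \approx 2.972\ \text{s}
$$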