API is based around audio returning float, but tremor is int32 based

AprilAndFriends / theoraplayer

A multi-threaded C++ library that plays video files supporting multiple codecs across platforms. Easy to use, fast, responsive, abstract interface and minimal dependencies, you'll soon be wondering how you lived without it! ;) Audio and Video interfaces are completely abstracted so the library can be used anywhere, regardless of what you use to display video frames and play audio samples (eg. OpenGL / OpenAL, Direct3D / DirectSound?, SDL / SDL_mixer, X11 / alsa ...) The library can pre-cache video frames and decoded audio samples for maximum efficiency and smooth playback, even on single-cpu systems. Currently, the library supports Theora on Windows, Mac, iOS, Linux, Android, WinRT and Windows Phone. H.264 is supported on Mac and iOS.

BSD 3-Clause "New" or "Revised" License

84 stars 43 forks source link

API is based around audio returning float, but tremor is int32 based #20

Closed blloyd75 closed 7 years ago

blloyd75 commented 7 years ago

The current code for decoding audio appears to need floats returned. If the tremor library is used, an int32 based array is returned instead, and those ints get used as floats, causing all sorts of issues. Is there an alternate implementation of decode sound that is supposed to be used with tremor or is this a bug in the system?

blloyd75 commented 7 years ago

I was considering getting vorbis to compile for iOS as an option. If tremor is used, we would need to do a version of the player audio decoder that uses the int32 based buffer instead of the float based buffer used now. I can do either or both options (and need to for a project I am immediately working on), but want to make sure I am not missing something before making this change. I am thinking a define THEORAPLAYER_USE_TREMOR would be a good way to toggle between the two, as they are not drop in identicial.

borisblizzard commented 7 years ago

How exactly does the problem manifest in your code? We've been using tremor on iOS and Android for years and we never experienced any issues. Tremor is a fixed-point decimal implementation of Vorbis so it can be used on platforms that might not have hardware-accelerated floating point support.

blloyd75 commented 7 years ago

I'm hoping I'm missing something obvious.

In VideoClip_Theora.cpp, it always uses the Vorbis header. float VideoClip_Theora::_decodeAudio() uses vorbis_synthesis_pcmout, which for vorbis uses float, but tremor uses int32. Later that array of int32's are passed as a float array into the addAudioPacket function which multiplies (as a float) each value by the gain (a float). It then queues (as an array of floats) the result of that for eventual pushing into the audio interface.

Basically, I get very very corrupted sound.

borisblizzard commented 7 years ago

Is it possible that the fixed-point ints don't get properly translatd into floating-point ints? Is the audio file you are using internally saved in ints or floats?

borisblizzard commented 7 years ago

Ok, I see now where the problem lies. We're actually using a custom audio interface implementation that doesn't use vorbis_synthesis_pcmout() so we don't have that issue. We'll check it out. We'll likely have to change the implementation.

blloyd75 commented 7 years ago

The file I am using is one that is used in a cross platform project. It decodes properly when using the Vorbis decoder on PC and OSX. Are you saying that the audio track format is stored differently when using Tremor? Vorbis returns floats > -1.0 and < 1.0. I don't know what tremor uses for it's PCM data, but I can't see an int in a float working when the other reference implementation is almost never an integer.

blloyd75 commented 7 years ago

To be clear, due to the header inclusion, How does VideoClip_Theora::_decodeAudio() manage the effective float pcmval = reinterpret_cast<float>(pcmfromtremor[x]); For each of the elements in the pcm array?

borisblizzard commented 7 years ago

Not stored, but it's decoded differently. As I said previously, fixed-point ints are used in tremor instead of floats. You noticed right that vorbis_synthesis_pcmout() uses float in vorbis and int32 in tremor. So technically the output is being used in a wrong way and is getting different data when tremor is used. Which is why you experience the data corruption.

blloyd75 commented 7 years ago

That much I had figured out. You mentioned you had been using it successfully. I assume you meant in theoraplayer. Is the data in a format that can be converted from without effort or is the usage in theora player going to be a problem that needs fixed?

borisblizzard commented 7 years ago

I actually made a mistake, we're using a custom audio interface that doesn't use vorbis_synthesis_pcmout(). :) This is why it works for us. I'm not sure if there is a quick fix for vorbis_synthesis_pcmout() or if the implementation will have to be changed completely. We have to look into this first. If we can make a quick fix, we'll prefer it.

kspes commented 7 years ago

We're using separate audio files instead of embedded audio tracks in a theora OGG file, that's why we never stumbled upon this.

I always thought vorbis and tremor's header files are the same, just implementation is different.

I'd rather do a dynamic check instead of the preprocessor variant, to reduce the amount of targets in project files. Eg to check the version string from vorbis and use that to determine the output data.

then _decodeAudio would have to convert integers to floats to be compatible with the audio interface API, but that shouldn't be too tough to implement.

blloyd75 commented 7 years ago

With theoraplayer as it exists now, it assumes tremor is used for iOS (it can't build vorbis for iOS). Instead of supporting tremor or vorbis, why not either detect iOS and use tremor instead of vorbis everywhere or else add a define (defined in the iOS project) and use tremor based on that?

kspes commented 7 years ago

some developers want to use vorbis. I remember one team had linker issues with tremor, not sure why. best to be as less restrictive as possible.

blloyd75 commented 7 years ago

The desire to want vorbis makes sense. Right now, the project has some defines like _USE_THEORA. (Which btw, is a reserved name for use by the compiler/standards implementation, leading underscore followed by a capital letter). Couldn't we add a THEORAPLAYER_USE_TREMOR and when defined, do the tremor logic? This has the advantage of including the header for the correct version of the library. We can document how to undefine the constant, and when not defined it would fall back to using vorbis like now.

I believe the iOS project is the one that currently only compiles tremor (vorbis for iOS not included in project at all).

For the want to support all. It would make sense to me to add a vorbis based version of iOS Dependencies since you express an interest of wanting to support with or without vorbis.

I'm more than willing to do this update and commit a branch for review.

kspes commented 7 years ago

Which btw, is a reserved name for use by the compiler/standards implementation, leading underscore followed by a capital letter What, really? we use this logic everywere, @borisblizzard wha do you think?

Yeah, maybe doing both versions - automatic detection and #define for forcing. Using vorbis headers but linking tremor is definetly not the best practice, but it works.

So yeah, if you'd be willing to implement this change, that'd be great :)

borisblizzard commented 7 years ago

I've seen both variants used, with and without a leading underscore. I prefer with underscores, because it makes a distinction between normal macros within files and global project preprocessors. If we wanted to change it now, we would have to change it in a lot of places / projects across dozens of libraries.

As far as vorbis_synthesis_pcmout() goes, we could edit our tremor copy to output floats instead of int32. Since we're using it only for performance and compatibility reasons (not because the target platforms don't support floating-point values), this shouldn't be a problem. It's not the most graceful solution, but it might be the most convenient since the output would be guaranteed to be the same in both vorbis and tremor.

kspes commented 7 years ago

true, but some people prefer using their own versions of tremor, theora etc. we shouldn't modify OSS code for such purposes

blloyd75 commented 7 years ago

I'll link to a non-definitive source for the underscore rules. It's hard to link to hardcopy.

http://stackoverflow.com/questions/228783/what-are-the-rules-about-using-an-underscore-in-a-c-identifier

blloyd75 commented 7 years ago

@kspes, I am not sure I understand you comment. -- I now understand. For whatever reason, I did not have Boris's response that you were responding to in my discussion list. The #define would be to control theoraplayer, which I believe is exclusive to this project. This project already modifies vorbis and tremor libraries. Adding the ability to compile the vorbis for iOS would not be any more variation than it already has from the baseline vorbis, and would be a convenience for those wanting to use vorbis just for theoraplayer. Making it easy for theoraplayer to compile to use either tremor or vorbis or even a drop in api identical replacement to one or the other later makes sense by setting. Picking an API to build against via a #define makes sense. There are differences in the public interface between tremor and vorbis. It's really not a good idea to guess which one, no matter how good the guess is.

I suspect people that have actually tried to use tremor with the built-in audio interface ran into the problem I found. Modern iOS actually does have some hardware floating point, so supporting float output can be fast. Tremor is faster than vorbis (due to things other than floating point, tremor is the more performant implementation), and so it's probably worth it to continue using tremor and just doing a conversion to float after so the API stays consistent.

FYI, iOS has a floating point multiply but no hardware divide, go figure.

kspes commented 7 years ago

Hmm, I suppose we can go with the preprocessor option, and have iOS and Android targets default to tremor, if someone wants to change this, he can simply adjust preprocessor definitions and configure the code for their own needs.

let's go with this. @borisblizzard do you agree?

borisblizzard commented 7 years ago

Agreed.

blloyd75 commented 7 years ago

a branch use_tremor is now pushed. I pushed a branch instead of to master because I was unable to test Android. If Boris wants to do so, that would be appreciated.

borisblizzard commented 7 years ago

Merged the branch, refactored code, checked Android. Everything seems fine. :)