Integrate new TIA sound core

sa666666 commented 7 years ago

I have an updated sound core, currently present in the tia-sound branch. This results in much more analog-like sound, fixes stereo mixing on mono systems, and in particular makes "E.T." sound much more like the real thing.

It is scanline-based: it is updated twice per scanline, which is what should happen according to the TIA schematics. That approach was too difficult to add to the old core, which was clock-based.
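As a rough illustration of what "twice per scanline" means for the output rate, here is a small sketch. The NTSC timing constants (a 3.579545 MHz color clock and 228 color clocks per scanline) are assumptions taken from general TIA documentation, not from this thread, and the names are hypothetical. Under those assumptions the result lands within 1 Hz of the 31400 Hz sample rate used for the Audacity import later in this thread.

```cpp
#include <cassert>
#include <cmath>

// Assumed NTSC timing constants (not stated in this thread).
constexpr double kColorClockHz = 3579545.0;   // TIA color clock frequency
constexpr int kColorClocksPerScanline = 228;  // color clocks in one scanline
constexpr int kAudioUpdatesPerScanline = 2;   // "updated twice per scanline"

// Number of audio updates per second for a given number of updates per scanline.
inline double audioUpdateRateHz(int updatesPerScanline) {
  assert(updatesPerScanline > 0);
  return kColorClockHz / kColorClocksPerScanline * updatesPerScanline;
}
```

Calling `audioUpdateRateHz(kAudioUpdatesPerScanline)` gives a value just under 31400.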

sa666666 commented 7 years ago

I don't know when we'll start this, but I want to get the ball rolling. I will describe my findings so far.

First, the AtariAge thread where it all started: http://atariage.com/forums/topic/249865-tia-sounding-off-in-the-digital-domain

Then a PDF from Crispy (the guy that developed the new sound hardware/algorithm): TIA Sounding Off in the Digital Domain.pdf

The attached ROM (tremolo) plays a continuous sound when pressing F1, and a square wave when pressing F2: tremolo.zip phaser06.zip

And finally my first pass at implementing this: TIASnd.zip

This code is basically a standalone TIASnd class implementation that can be run outside Stella. It only 'works' with two ROMs: tremolo.bin and phaser6.bin, attached above. It works by writing data to the emulated sound hardware just as the ROM would; this should be obvious when you look at the code.

In the code, change the call to channels() as follows:

tiasnd.channels(2, false); for mono mixing (both channels mix and somewhat interfere with each other)
tiasnd.channels(2, true); for stereo (F1 tone goes to left channel and F2 to right; no interference between the two)

For Audacity, run the program to generate a 'test.raw' file. Then run Audacity and do as follows:

File -> Import -> Raw Data ...
select 'test.raw'
set Encoding to 'Signed 16-bit PCM' and Sample Rate to '31400'; the remaining entries should be fine as-is
click Import
play the sound
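For anyone who wants to check the import settings independently of TIASnd, the sketch below writes a plain square wave in the same layout (signed 16-bit PCM at 31400 Hz, mono). It is not the TIASnd code; the function names are hypothetical, and little-endian byte order is an assumption that should match the byte order selected in the Audacity import dialog.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

constexpr int kSampleRate = 31400;  // matches the Audacity import setting

// Generate a mono square wave as signed 16-bit samples.
std::vector<int16_t> makeSquareWave(int frequencyHz, int numSamples, int16_t amplitude) {
  assert(frequencyHz > 0 && numSamples >= 0 && amplitude >= 0);
  std::vector<int16_t> samples(numSamples);
  for (int n = 0; n < numSamples; ++n) {
    // Position within the current period, scaled to the range [0, kSampleRate).
    const long long phase = (static_cast<long long>(n) * frequencyHz) % kSampleRate;
    // First half of each period is high, second half is low.
    samples[n] = (2 * phase < kSampleRate) ? amplitude : static_cast<int16_t>(-amplitude);
  }
  return samples;
}

// Write samples as headerless little-endian 16-bit PCM. Returns false on I/O failure.
bool writeRawLE(const std::string& path, const std::vector<int16_t>& samples) {
  std::ofstream out(path, std::ios::binary);
  if (!out) return false;
  for (int16_t s : samples) {
    const uint16_t u = static_cast<uint16_t>(s);
    const char bytes[2] = {static_cast<char>(u & 0xff), static_cast<char>(u >> 8)};
    out.write(bytes, 2);
  }
  return static_cast<bool>(out);
}
```

For example, `writeRawLE("square.raw", makeSquareWave(785, kSampleRate, 8000))` would produce one second of a 785 Hz tone that can be imported with the steps above.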

Finally, the sound files (AIFF format) of phaser ROM from Crispy from his hardware: Phaser06_aiff.zip

There are also other AIFF files and a wealth of information in the AtariAge thread.

thrust26 commented 7 years ago

The lookup table from the PDF document can be shortened if you add the two volumes instead of ORing them. Then the LUT is calculated for 0..30. The simplified code looks like this:

int i;
double r1;
double r2;
double raa;
double ra;
double rb;
double rc;
double rd;
unsigned short lut[31];
r1 = 1000.0;
raa = 1.0 / 1875.0;
ra = 1.0 / 3750.0;
rb = 1.0 / 7500.0;
rc = 1.0 / 15000.0;
rd = 1.0 / 30000.0;
lut[0] = 0;
for (i = 1; i < 31; i++) {
  r2 = 0.0;
  if (i & 0x01)
    r2 += rd;
  if (i & 0x02)
    r2 += rc;
  if (i & 0x04)
    r2 += rb;
  if (i & 0x08)
    r2 += ra;
  if (i & 0x10)
    r2 += raa;
  r2 = 1.0 / r2;
  lut[i] = (unsigned short) (32768.0 * (1.0 - r2 / (r1 + r2)) + 0.5);
}
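A side note on the arithmetic, worked out from the constants in the snippet above rather than taken from the PDF: because the five resistors are binary-weighted (30000, 15000, 7500, 3750, 1875), the summed conductance for index i is simply i/30000, and the divider collapses to a closed form. Here b_k(i) denotes bit k of i, R_1 = 1000 and R_2 is the parallel combination of the selected resistors; the result is then rounded to the nearest integer as in the code.

```latex
\frac{1}{R_2(i)} = \sum_{k=0}^{4} \frac{b_k(i)\,2^k}{30000} = \frac{i}{30000},
\qquad
\mathrm{lut}[i]
  = 32768\left(1 - \frac{R_2}{R_1 + R_2}\right)
  = 32768\,\frac{R_1}{R_1 + R_2}
  = 32768\,\frac{1000}{1000 + 30000/i}
  = 32768\,\frac{i}{i + 30}
```

As a sanity check, i = 30 gives 32768 · 30/60 = 16384, i.e. half of full scale, and i = 2 gives 32768 · 2/32 = 2048.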

sa666666 commented 7 years ago

@thrust26, for suggestions on changing the code, can you reference the TIASnd files directly? Reason being that some of the code has slightly changed from what's in the PDF document. The document is more for describing at a high level what's going on; the code may already have changes/optimizations.

EDIT: The LUT is indexed by a full byte later in the code, so it needs 256 entries. I suppose we could add code that converts from 0-255 to 0-30, but to me the calculation would slow things down and defeat the purpose of the LUT entirely.

thrust26 commented 7 years ago

I didn't find such byte addressing in the TIAsnd code. Where should I look?

Attached my changes. TIASnd_Add.zip

sa666666 commented 7 years ago

In the various LUT to sound data generation code. One example is:

      uInt32 idx1 = 0;
      if(aud0_c1) idx1 |= myAudV0[0];
      if(aud1_c1) idx1 |= (myAudV1[0] << 4);
      uInt16 vol1 = myVolLUT[idx1][myHWVol];

Note that while myAudV0[0] can only ever hold values up to 0x0f, the shift on myAudV1[0] means idx1 can reach 0xff, and this is the index used in the first dimension of the LUT; i.e., we need 256 entries.
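To make the two indexing schemes under discussion concrete, here is a minimal sketch. The function names are hypothetical and are not taken from TIASnd; it only shows how packing the two 4-bit volumes with OR and shift yields an index in 0..255, while adding them yields an index in 0..30.

```cpp
#include <cassert>
#include <cstdint>

// Index formed by packing both 4-bit volumes into one byte (range 0..255).
inline uint32_t packedIndex(uint8_t vol0, uint8_t vol1) {
  assert(vol0 <= 0x0f && vol1 <= 0x0f);
  return static_cast<uint32_t>(vol0) | (static_cast<uint32_t>(vol1) << 4);
}

// Index formed by adding both 4-bit volumes (range 0..30).
inline uint32_t summedIndex(uint8_t vol0, uint8_t vol1) {
  assert(vol0 <= 0x0f && vol1 <= 0x0f);
  return static_cast<uint32_t>(vol0) + static_cast<uint32_t>(vol1);
}

// Convert a packed index back to the summed index: low nibble plus high nibble.
inline uint32_t packedToSummed(uint32_t packed) {
  assert(packed <= 0xff);
  return (packed & 0x0f) + (packed >> 4);
}
```

The conversion is one mask, one shift and one add, which is the extra per-lookup cost being weighed against the smaller table.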

thrust26 commented 7 years ago

I changed that too in my attached files.

sa666666 commented 7 years ago

OK, I will look at these changes later; I'm currently at work doing something else.

thrust26 commented 7 years ago

Understood.

BTW: I think for Hardware2Stereo, you have to use the lookup table too.

sa666666 commented 7 years ago

The lookup table is being used in all cases, otherwise we wouldn't get any sound :smiley:

The difference between Hardware2Mono and Hardware2Stereo is that in the former case, the channels are mixed together, and in the latter they're separated.

sa666666 commented 7 years ago

@thrust26, I just took a quick look at your files (not really for understanding, just for content). It seems you've reverted to the files in the repo. The files to use are in the post above (TIASnd.zip). Note that these files contain things that aren't in the repo; ie, they're the most recent version that we should be working from. Could you rebase your changes against those files?

thrust26 commented 7 years ago

Sure. TIASnd_Add2.zip

I somehow missed the lookup.

No clue why the same values are pushed twice for Hardware2Mono. Or why sometimes 4 values are pushed and sometimes only 2. But I don't need to know. 😄

sa666666 commented 7 years ago

It's 16-bit sound, so 2 bytes per sample. And for stereo there are two samples each time (one per channel), so 4 bytes total.
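The byte counts can be illustrated with a small sketch. This is not the TIASnd code: the helper names are hypothetical and little-endian byte order is an assumption. It only shows why one 16-bit sample occupies 2 bytes in the output buffer and one stereo frame occupies 4.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append one signed 16-bit sample as two bytes, low byte first (assumed order).
inline void pushSample(std::vector<uint8_t>& out, int16_t sample) {
  const uint16_t u = static_cast<uint16_t>(sample);
  out.push_back(static_cast<uint8_t>(u & 0xff));
  out.push_back(static_cast<uint8_t>(u >> 8));
}

// Mono frame: one already-mixed sample, 2 bytes.
inline void pushMonoFrame(std::vector<uint8_t>& out, int16_t mixed) {
  pushSample(out, mixed);
}

// Stereo frame: one sample per channel, 4 bytes.
inline void pushStereoFrame(std::vector<uint8_t>& out, int16_t left, int16_t right) {
  pushSample(out, left);
  pushSample(out, right);
}
```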

sa666666 commented 7 years ago

@thrust26, confirmed that your new code does the same thing as the old, but by using a LUT less than half the size. So from this point on, we should base any further changes on TIASnd_Add2.zip.

EDIT: I can't do math, it seems. Your code shrinks the LUT to ~12% of its original size, much better than the 50% I said above.
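For reference, the ratio follows directly from the entry counts mentioned earlier in the thread, 31 entries for the summed index against 256 for the byte index:

```latex
\frac{31}{256} \approx 0.121 \approx 12\%
```

The same ratio holds if the table has a second dimension of fixed size, since that factor cancels.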

DirtyHairy commented 7 years ago

I am interested in looking at integrating the new audio code after I am done with refactoring the FrameManager --- I'd like to model a new audio implementation in 6502.ts after it later. I have yet to sit down with paper and pencil and do the math, but the PDF is an interesting read :smirk:

As for the lookup table, you also have to weigh the time savings of using a LUT against the potential for cache misses. If using the LUT causes a cache miss and forces an access to RAM, then doing the calculation on the fly may be much faster. If you consider this, then reducing LUT size to 30 bytes may be well worth the additional calculation.
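To give a feel for what "calculation on the fly" could look like, here is a sketch. It assumes the resistor values from the LUT snippet earlier in this thread (a 1000 ohm resistor against binary-weighted 30000 to 1875 ohm resistors), for which the divider reduces to 32768 · i / (i + 30) for a summed volume i in 0..30. The function names are hypothetical, the hardware volume dimension of the real table is left out, and whether this beats a table lookup would need to be measured.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Volume for a summed index i (0..30), computed directly with integer math.
// Adding half the divisor before dividing rounds to the nearest integer.
inline uint16_t volumeOnTheFly(uint32_t i) {
  assert(i <= 30);
  const uint32_t divisor = 30u + i;
  return static_cast<uint16_t>((32768u * i + divisor / 2u) / divisor);
}

// The table-based alternative: 31 precomputed entries.
inline std::array<uint16_t, 31> buildVolumeTable() {
  std::array<uint16_t, 31> lut{};
  for (uint32_t i = 0; i < lut.size(); ++i) {
    lut[i] = volumeOnTheFly(i);
  }
  return lut;
}
```

The direct version costs one multiply and one divide per sample; the table version costs one memory access, which is cheap only while the table stays in cache.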

thrust26 commented 7 years ago

> I have yet to sit down with paper and pencil and do the math...

I did it with a spreadsheet. 😃

DirtyHairy commented 7 years ago

> I did it with a spreadsheet. :smiley:

As a former theoretical physicist, I rejoice at the prospect of putting pencil and paper to work again :smile:

sa666666 commented 6 years ago

Closing this, since the new sound core is now integrated. All that's left is fine-tuning, and we have a new issue open for that: https://github.com/stella-emu/stella/issues/311.

stella-emu / stella

Integrate new TIA sound core #80