skjelten / emusc

A software synthesizer emulating the Sound Canvas SC-55 lineup
GNU General Public License v3.0
205 stars 15 forks source link

RIAA filter #39

Closed giulioz closed 5 months ago

giulioz commented 7 months ago

Is it confirmed that the SC55 uses two passes of a RIAA filter to process the samples? By looking at the PCM rom I don't think it's the case. The PCM data is likely to be some form of DPCM, which if interpreted wrongly could produce a low-freq cutting effect.

giulioz commented 7 months ago

On top it's how the rom is interpreted as DPCM, while on the bottom as normal PCM (as currently done in the code)

image

PCM: https://github.com/skjelten/emusc/assets/1353142/99236c76-ad67-43ee-bb42-5cf0599fa6a2

DPCM: https://github.com/skjelten/emusc/assets/1353142/da5424eb-5739-4e26-a798-16abcd731448

Here's how I did it:

int32_t currentValue = 0;
for (size_t i = 0; i < samplesCount; i++) {
  size_t address = 0x8000 + i;
  int8_t data_byte = outbuf[address];
  uint8_t shift_byte = outbuf[((address & 0xFFFFF) >> 5) | (address & 0xF00000)];
  uint8_t shift_nibble = (address & 0x10) ? (shift_byte >> 4) : (shift_byte & 0x0F);
  int32_t final = ((data_byte << shift_nibble) << 14); // Shift nibbles thus far never exceed 10, thus 18 bit samples

  currentValue += final;
  audioOutBuf[i] = currentValue;

  // old version
  // audioOutBuf[i] = final;
}
Kitrinx commented 7 months ago

The certainty on the RIAA filter being correct is low, it was just the closest thing we happened to find back when we were making usable sounds out of some of the early reverse engineering.

giulioz commented 7 months ago

Got it thanks! Do you think it makes sense for me to submit a PR with this change?

skjelten commented 7 months ago

Questions around the uncertainty of the 2x RIAA filter solution is something that comes up from time to time. See for instance mattw's comment and Tiido's response here.

That aside, the only reason why not more effort has been put into analyzing this issue so far is that there has been so much else to look into. If you have suggestions for improved decoding of the samples that would be really great news! Please send a PR and we'll have a look.

skjelten commented 6 months ago

I finally had some time to look at your code snippet and I think you are absolutely right about the samples being in DPCM format. It is a bit of a mystery to me how adding samples cumulative in a partial has almost the exact same net result as running 2x RIAA filters on the same sample set, but that is how it is.

In summary, it seems like using DPCM decoding instead of RIAA filtering makes the code simpler, uses less CPU, and gives slightly better audio quality.

If you are working on a PR I will wait for you to complete, if not I will just finish and commit what I have.

giulioz commented 6 months ago

Hi! Unfortunately had no time this week to make the PR, so please don't wait for me and feel free to commit the fix! My theory on why it sounds similar is because by not accumulating the samples you end up doing something very similar to what an high-pass filter would do (subtracting the previous samples with a gain), which you kinda revert by boosting the bass freqs using the RIAA

Grieferus commented 6 months ago

My only issue with this is that some samples seem to be out of center, so to say.

giulioz commented 6 months ago

@Grieferus you mean DC offset, something like this? image

Grieferus commented 6 months ago

Precisely.

skjelten commented 6 months ago

I am not familiar with any DC offset in the current code (but I have not looked for it either). The two reasons for samples being perceived as a bit of center that I can think of are:

@giulioz Is that waveform in your screenshot taken from emusc?

I can also be mentioned that there currently is no highpass filter used in emusc, but this has been on my todo list for a long time to try out. It might be that adding a static highpass filter for all samples could reduce any DC offset being present.

giulioz commented 6 months ago

@skjelten no, just a random screenshot from the internet.

I know that the DC offset CAN happen, since the samples are DPCM, but honestly no idea if that's the case with emusc, my macos doesn't like the WAV output option so I can't properly check :(

Being DPCM the samples are stored as derivatives instead of absolute values, and integrating them over time can lead to errors that eventually create DC offset. IIRC this should not happen with Roland samples, so maybe it's some error in the code? (no idea where though, maybe an off by one)

skjelten commented 6 months ago

@giulioz could you check my latest fix pushed to git for properly setting the WAV file path in the Preferences Dialog and see if it works on your machine now? I know the source code for the WAV writer is a bit crude, but I was hoping it would work for the latest versions of macOS. The recording starts when you power on the emulator, and stops when you power it off (press space).

When moving over to DPCM I did stumble upon one issue that gave me some trouble: how to add up sample values when you have looping samples? So there are basically 3 kinds of sample playback (loop modes): 0: Play forward, jump back X bytes and repeat. 1: Play forward, at the end start playing backwards X bytes and repeat 2: No loop, play once from start to end

The easiest is loop mode 2 (no loop) where you just add the samples from start to end. In loop mode 0 I expect the right way to do it is basically the same as for loop mode 2, but every time you jump back in the sample list for looping you have to start summation from sample 0 again (or store the correct summation state in the sample your return to). The tricky one is loop mode 1. If you keep summation while going forward and backwards many times the volume will increase to "infinity" for a number of instruments. The way it is implemented now is to do summations while stepping forward, then keep on while going backwards, and then reset the summation going forward again. This creates annoying "ticks" in the sound for certain instruments every time they loop (sometimes the tick comes both when it starts going backwards and when it starts going forward again). Interestingly enough, even the real synth has some audible ticks for certain instruments, but much less than emusc. I suspect the linear resampling might be part of the reasons for the very audible ticks as they change characteristics for different note numbers (pitch). The best example I can find of this issue is instrument 55 SynVox.

Any suggestions or ideas on how to avoid or reduce the ticks for instruments using loop mode 1 would be greatly appreciated.

We did by the way have the exact same problem with the RIAA filter, so the problem is not new.

mmontag commented 6 months ago

Looking at @giulioz first screenshot in the thread, it seems that the ROM cannot be interpreted as DPCM all at once. There might be some padding between samples that leads to big drift in DC center value, from start to finish... This needs to be reset to 0 for each sample, no?

@skjelten if I understand correctly, loop mode 1 is "ping pong". What if you flip the DPCM sign when you go backwards?

for the following example DPCM stream:

 -  -  +  +  +  -  -  -
0 -1 -2 -1  0  1  0 -1 -2

time-reversed would be:

 -  -  -  +  +  +  -  -

and polarity flipped:

 +  +  +  -  -  -  +  +

finally, concatenated to the original gets us back to 0 with no drift:

 -  -  +  +  +  -  -  -  +  +  +  -  -  -  +  +
0 -1 -2 -1  0  1  0 -1 -2 -1  0  1  0 -1 -2 -1  0

Obviously, this can still produce discontinuities/pops at the junction, depending on how the loop boundaries are chosen. And you might still need other mitigations.

skjelten commented 6 months ago

Looking at @giulioz first screenshot in the thread, it seems that the ROM cannot be interpreted as DPCM all at once.

Yes, you are right. I noticed this the first time, but forgot about it later. Would be interesting to check which instruments that has this DC offset, and if there is any correlation with loop mode.

@skjelten if I understand correctly, loop mode 1 is "ping pong". What if you flip the DPCM sign when you go backwards?

Yes, "ping pong" seems like a good name for loop mode 1. I am not sure if I fully understand your suggestion, but what I did try to do when I worked on it was to add each sample while moving forward, and then subtract samples moving backwards, but that generally gave significantly louder pops both times the instruments samples changed direction.

So, if an instrument with loop mode 1 have a sample set with total sample length = 10000 and loop length = 9000 we end up with 3 segments:

  1. The first 1000 samples are played once from 0 to 1000. Each sample is just added to the previous sample.
  2. The next 9000 samples, from 1000 to 10000, are also played from start to end like the first segment. Each sample is also here added to the previous one.
  3. Now we have hit end of the sample list and start to read samples backwards. This is where I tried to subtract instead of adding samples all the way back to sample number 1000, where we jump to step 2 again. This did not give any good results, so I therefore landed on adding samples also in this step and then just set sample = 0 when jumping back to step 2 and start adding samples again.

If I understand you correctly you propose to add and subtract samples in fixed intervals? I tried just now to see what happens if I add X samples and then subtract the next X samples independently of loop mode (with a few different sizes of X), but it does not sound correct (volume tends to spike for certain instruments and there are some other audible distortions). Since I have never worked with DPCM coded samples before I do not really know how this is supposed to work and is open for all suggestions.

PS: It is very easy to play around with this DPCM decoding in emusc, just look at the method bool Partial::_next_sample_from_rom(float pitchAdj) in partial.cc for how to add or subtract the samples in different loop modes.

mmontag commented 6 months ago

Sorry my example wasn't clear, those are meant to represent a stream of consecutive audio frames. (The signal either increases + or decreases - from one sample to the next.) But anyway, it sounds like you already tried that (subtracting the delta when going backwards).

mmontag commented 6 months ago

I have a fix for this. But FYI my suggestion was wrong; the ping-pong loop points are at zero crossing, so in order to make a nice continuous signal, the backwards traveling loop should have inverted polarity.

skjelten commented 6 months ago

I have a fix for this. But FYI my suggestion was wrong; the ping-pong loop points are at zero crossing, so in order to make a nice continuous signal, the backwards traveling loop should have inverted polarity.

The thing that I fail to understand is how to get zero crossing when the loop point is X samples from the start. If we where looping the entire sample list, then it would make perfect sense to add going forward and subtract going backwards. This way you would have a complete cycle for each loop. On the SC-55 you have the loop point somewhere inside the sample lists. I guess you could consider the loop point the 0-point, but my tests for this actually resulted in louder pops than just adding going both ways and resetting when completing a cycle.

An extra piece of information that I can add (even though it might not be relevant for this issue) is that there is one field left in the sample definition that we do not know the meaning of. This is a 16 bit number that seems to indicate an alternative sample length. In Kitrinx's spreadsheet it is called Unkown (column I and J in the samples tab) and in emusc Control ROM information dialog it is called Attack End (but should be renamed to unknown). Notice that the value of this field is always less than sample length and always 0 when loop mode = 2 (no looping).

If you have any fix for the DPCM issue, or have an idea about the purpose of the last unknown field in the sample definition, I am very interested in hearing it!

mmontag commented 6 months ago

N.B.: from now on, I will refer to a point in the audio data as a "frame." The conversation is confusing when "sample" means two different things.

Yep, I am still working on a patch. But I found more edge cases to investigate. Everything is yet to be confirmed.

I dumped all the PCM ROM to WAV files, focusing on the ping-pong samples (loop mode=1). After messing around in a wave-editor software for a while, I could make flawless ping-pong loops by deleting the first frame (often a zero) and duplicating the last frame. Without this fix, audible pops.

Duplicating the last frame is not a real fix, but a clue.

Theory 1: There is an off-by-one error somewhere in the sample rom address calculation.

Theory 2: It should not be necessary to reset the signal to 0 at the end of a loop. When applying the DPCM accumulation, the sum should arrive back at 0 by itself.

Theory 3: The "attack end" value is not especially relevant to the ping-pong loops.

skjelten commented 5 months ago

N.B.: from now on, I will refer to a point in the audio data as a "frame." The conversation is confusing when "sample" means two different things.

Agree, that is a common term everyone understands. Currently forcing myself to work on documentation on the wiki and will keep this in mind when writing.

Yep, I am still working on a patch. But I found more edge cases to investigate. Everything is yet to be confirmed.

I dumped all the PCM ROM to WAV files, focusing on the ping-pong samples (loop mode=1). After messing around in a wave-editor software for a while, I could make flawless ping-pong loops by deleting the first frame (often a zero) and duplicating the last frame. Without this fix, audible pops.

Duplicating the last frame is not a real fix, but a clue.

Sounds like you are on the right track, looking forward to future updates when you have had more time!

mmontag commented 5 months ago

BTW, theory 2 makes sense if you remember that the DPCM is derived from a looped PCM stream. The very last value in the DPCM stream should be the difference between the last and first values of the PCM looped region. So any drift from DC that has to be corrected is a sign that something is wrong.

skjelten commented 5 months ago

BTW, theory 2 makes sense if you remember that the DPCM is derived from a looped PCM stream. The very last value in the DPCM stream should be the difference between the last and first values of the PCM looped region. So any drift from DC that has to be corrected is a sign that something is wrong.

Yes, that does indeed sound very reasonable.

Another tip that I should have given you a week ago was to look at Kitrinx's sound font. If you read the source code you can see how she handled looping samples. I believe the sound font have no popping sound issue on looping samples the way emusc have, albeit with 2xRIAA filtering. You might have looked at that already, if not, it could be useful.

mmontag commented 5 months ago

Alright, I posted a PR here: https://github.com/skjelten/emusc/pull/53

This PR does a couple things, but hopefully the net result is the forward-loop instruments sounds the same as they used to, and the ping-pong instruments are now without pops.

Also came across a bunch of other things I'd like to update so you might see a few more PRs from me :) Yeah, I found Kitrinx's source and this thread on vogons https://www.vogons.org/viewtopic.php?f=62&t=76613&start=360; very useful.

skjelten commented 5 months ago

Alright, I posted a PR here: #53

This PR does a couple things, but hopefully the net result is the forward-loop instruments sounds the same as they used to, and the ping-pong instruments are now without pops.

Thanks for the PR!

I just posted a response to it as I had some issues with certain instruments. I will try to investigate it myself also when time permits.

Also came across a bunch of other things I'd like to update so you might see a few more PRs from me :)

Excellent! :slightly_smiling_face:

skjelten commented 5 months ago

Closing this issue as PR #53 has been merged.

mmontag commented 5 months ago

Just an aside, came across this mention of frame vs. sample terminology over at Nuked SC 55: https://github.com/nukeykt/Nuked-SC55/pull/68#discussion_r1589880882

johnnovak commented 4 months ago

Just an aside, came across this mention of frame vs. sample terminology over at Nuked SC 55: nukeykt/Nuked-SC55#68 (comment)

Yeah, using frames and samples in variable and function names interchangeably confuses the hell out of me, plus it's quite error-prone 😅