ADPCM: fixed end/limit address checking

hyano commented 2 years ago

It is a fixed patch I have sent in PR #31 first. In the original behavior, ADPCM sounds doesn't play with loop if stop address and limit address are same. I reffered the behavior and implementation of fmgen.

https://github.com/aaronsgiles/ymfm/pull/31#issue-1190632498

This patch fixes checking of end and limit addresses in ADPCM.

In the original implementation, loop playback of ADPCM sound was sometimes incorrect because the limit address checking was done first and it changed the data address to 0.

hyano commented 2 years ago

I took an old PC88 out of storage to check OPNA behavior with experimental codes.

My observations were:

STOP ADDRESS checking is done before LIMIT ADDRESS checking.
- I set same address to STOP ADDRESS and LIMIT ADDRESS and playback with REPEAT bit.
- Then, playback position went back to START ADDRESS, not 0 and sound had been played repeatly.
STOP/LIMIT ADDRESS checkings is done with "equal", not "greater".
- Even if STOP ADDRESS is smaller than START ADDRESS, sound was played till LIMIT ADDRESS and goes back to 0 and sound was played till STOP ADDRESS. (START -> LIMIT -> 0 -> STOP)
- Even if LIMIT ADDRESS is smaller than START/STOP ADDRESS, sound was played normally.

I would like to check more carefully or experiment with more complex cases, but unfortunately PC88 was broken during experiment...

aaronsgiles commented 2 years ago

Thanks for the hardware confirmation. This is a great verification of the start/stop/limit behavior.

The only thing I'm not 100% convinced of yet is the address shifting change. I had originally implemented this by shifting the current address down and comparing against the limit/stop values:

if ((m_curaddress >> address_shift()) >= m_regs.limit())

You changed that to

if (m_curaddress == ((m_regs.limit() + 1) << address_shift())

Based on your tests, the change from >= to == makes sense, but in your version you are playing (1 << address_shift()) more samples before looping. Is this confirmed? Perhaps there are examples that require the extra samples to be played to prevent artifacts?

Also, the +1 doesn't really work cleanly in hardware, and there are edge cases to be concerned about, if m_regs.limit() is maximum value. We'd want to confirm that.

Is it possible it could be something like this?

if (m_curaddress == ((m_regs.limit() << address_shift()) | ((1 << address_shift()) - 1)

That would stop/loop on the last sample of the block rather than the one after it, and I could see this logic mapping cleanly to hardware.

hyano commented 2 years ago

I understood your points.

Differnece is when to check, after imcrement m_address or before increment.
What kind of implementation is close to HW's.

but in your version you are playing (1 << address_shift()) more samples before looping.

No. My version checks address AFTER increment m_address in previous clock() and BEFORE fetching next sample as same as yours. So, no extra block of samples would be played. It is same approach with fmgen.

I think this implementation is simple from a SW perspective. But as you wrote, it is probably different from HW implementation.

To confirm edge cases, I want to check the behavior of "LIMIT/STOP = START - 1" case, but I can't due to HW failure... I expect sound will be played till the end of memory.

Is it possible it could be something like this?

if (m_curaddress == ((m_regs.limit() << address_shift()) | ((1 << address_shift()) - 1)

That would stop/loop on the last sample of the block rather than the one after it, and I could see this logic mapping cleanly to hardware.

I agree, '+1' is not clear for HW implementation and your idea would match with actual chip implementation, too. And It closes to the explanation of start/stop addressing on the YAMAHA application manual which says：

BANK / CAS(A8-A4)	CAS(A3-A0) - RAS(A8-A5)	RAS(A4-A0)
START ADDRSSS (H)	START ADDRESS (L)	0 0 0 0 0
STOP ADDRSSS (H)	STOP ADDRESS (L)	1 1 1 1 1

pseudo code based on your idea:

at_limit()
{
  return (m_curaddress == ((m_regs.limit() << address_shift()) | ((1 << address_shift()) - 1);
}

external_read(m_curaddress) and process sample data
// need to take care of m_curnibble, too
if (at_end()) {
  stop or loop
  set EOS status bit
} else if (at_limit()) {
  m_curaddress = 0
} else {
  m_curaddress++
}

Does it match with your idea? I think the care of m_curnibble is also needed. (Now, I noticed it same for my current version, too:) ) Currently, I'm not confident when EOS/PCM BSY status bits are updated, on fetching data from memory or after processed.

In case of fm.c(fmdeltat.c), there is unclear point. Stop/limit calculation is almost same as your idea. On the other hand, address checking is done BEFORE fetching sample data. In this case, the last 1 byte seems to be never used. I think it doesn't match with HW designer's original intention.

P.S. While I investigated memory access behavior including dummy read, I felt OPNA's ADPCM state management is very sensitive. For example, if read/write count doesn't match with the access unit (32B in x8 mode), chip state becomes unstable&unrecovable easily and need chip reset. It is not easy to find out internal address registers/unit and state...

aaronsgiles commented 2 years ago

I see what you mean. I am glad you shared the documentation that shows the hardware checks for 11111 in lower bits; that confirms my suspicion.

I think your pseudo-code is good. We should also fix the at_end() to work the same way.

bool at_limit() const { return (m_curaddress == ((m_regs.limit() << address_shift()) | ((1 << address_shift()) - 1))); } bool at_end() const { return (m_curaddress == ((m_regs.end() << address_shift()) | ((1 << address_shift()) - 1))); }

I agree there is a question about the m_curnibble and the timing of EOS, but we would need hardware to confirm it. With your proposed code it would signal one sample early. In practice, I don't think this will matter much.

I don't understand your comment about ADPCM-A, but I think the same logic should be used there as well. Are you saying that the ymfm implementation is unstable, or the real chip is? If it's ymfm, can you give a more specific example of how to get it into the bad state?

hyano commented 2 years ago

I see what you mean. I am glad you shared the documentation that shows the hardware checks for 11111 in lower bits; that confirms my suspicion.

You can refer the specs for START/STOP in "YM2608 OPNA Application Manual" on p.48. https://www.quarter-dev.info/archives/yamaha/YM2608_Applicatin_Manual.pdf It is an official manual from YAMAHA.

I agree there is a question about the m_curnibble and the timing of EOS, but we would need hardware to confirm it. With your proposed code it would signal one sample early. In practice, I don't think this will matter much.

Agree. I think few people uses EOS bit for accurate checking. The timing and condition of EOS are unclear in other cases, too.

I don't understand your comment about ADPCM-A, but I think the same logic should be used there as well. Are you saying that the ymfm implementation is unstable, or the real chip is? If it's ymfm, can you give a more specific example of how to get it into the bad state?

Sorry for confusion. What I wrote is for REAL OPNA's ADPCM (ADPCM-B). The REAL OPNA's behavior for abnormal cases is unpredictable and it confuses me. (I'm trying to verify suspected internal logic with abnormal input sequences.)

And sorry, I've not read ymfm's ADPCM-A code, yet. As you wrote, I would expect the same logic to be used for RHYTHM/ADPCM-A as well.

For double checking, I Google'd YM2610's app manual, but I couldn't find official one. I have YM2610 datasheet only. It doesn't contain detailed information.

hyano commented 2 years ago

@aaronsgiles I updated my branch based on our disucssion.

I think it almost matches with my pseudo-code I showed above. And m_curnibble is handled as well.

ADPCM-A part is not updated. A return value of adpcm_a_channel::clock() seems to be delicate. The current code looks similar to my original ADPCM-B code.

So, could you review the code? Of course, you are welcome to update the branch directly.

And one thing to be considered is when to fetch CPU-driven data. I think the following block should be moved to the same location as calling m_owner.intf().ymfm_external_read() from a symmetry point of view.

        // if CPU-driven, copy the next byte and request more
        else
        {
            m_curbyte = m_regs.cpudata();
            m_status |= STATUS_BRDY;
        }

It is currently in its original location. I don't know a good reference of this feature and feel it is not easy to confirm with real chip. Do you have any opinion?

Although it is out of scope of this PR, I think BDRY bit should be set after processing every "4bit x 2" in clock().

If code wil be OK, I'll update the code for #35, too.

hyano commented 2 years ago

And m_curnibble is handled as well.

Sorry, processing for the last 4 bit is wrong. m_accumulator will be overwritten. Needed to stop processing if at_end() is taken.

But, this approach still miss the last 4 bit. I have another option to chack and increment m_curaddress just before fetching next data.

if (at_end()) {
  stop or loop
  set EOS status bit
  RETURN
} else if (at_limit()) {
  m_curaddress = 0
} else {
  m_curaddress++
}
external_read(m_curaddress) and process sample data

Timing to check m_address is same as the original code. Defferences are at_end(), at_limit() and when to increment m_curaddress. If you agree to this approach, I'll try to update.

And maybe this approach can be used ADPCM-A, too. Sorry for taking your time.

aaronsgiles / ymfm

ADPCM: fixed end/limit address checking #33