stardot / b-em

An opensource BBC Micro emulator for Win32 and Linux
http://stardot.org.uk/forums/viewtopic.php?f=4&t=10823
GNU General Public License v2.0
118 stars 58 forks source link

Change in behaviour with bit-banged sound between releases 8761065 and 608eb15 #219

Open NegativeCharge opened 10 months ago

NegativeCharge commented 10 months ago

I'm attaching an .ssd to demonstrate the issue which occurs when playing software generated bass through interrupts. There appears to be sound corruption introduced between these two releases: https://github.com/stardot/b-em/compare/vdfs%23186...integra210

I see in 6502.c the introduction of sound_poll(c); which may or may not be related.

The corruption can be heard within 10 seconds of running the test.ssd on the integra210 release or newer, but doesn't occur on the vdfs#186 release or earlier. Hope this helps track down the issue.

test.zip

SteveFosdick commented 10 months ago

I think I can reproduce this, though I am not completely sure what I am listening for, and I think the change in behaviour occurred with commit https://github.com/stardot/b-em/commit/8e92d061dce18b4118b99dabea302553389502e3

It is worth reading Sarah's commit comment. As this commit was intended to make the SN76489 emulation more like real hardware I wonder if you have run the test on real hardware too?

NegativeCharge commented 10 months ago

Thanks - I've only tested on an FPGA (don't have real hardware) and that works fine. Also the output from other emulators (b2, jsbeeb, BeebEm) matches the output I hear in the version of B-em prior to that update.

There should be a clear difference in the output you hear from the executable compiled prior to the update and that after the update.

SteveFosdick commented 10 months ago

Ok, for reference, here is an MP3 file of the test playing on a real BBC Model B. b-em-issue219.zip It's a ZIP file because GitHub won't let me upload an MP3 so the MP3 is in the ZIP file.

SteveFosdick commented 10 months ago

Listening to the two versions, the most obvious difference is that the newer version of b-em has some random sounds that are not in the correctly played version.

As an aid to debugging this, I have pushed a new branch to GitHub: https://github.com/stardot/b-em/tree/sf/soundrec which enables the output of the normal sound generation to be written to a WAV file. That branch is based on the current master, and shows the odd behaviour playing the test tune, but my next step is to use another branch to try reverting the changes in the commit I have identified and see what happens.

As far as I can see, there are three parts to the change:

  1. Change how often the sound generation part of the SN76489 emulation us run. In the older version, this is done every 128 2Mhz cycles so 15,625 times/second, whereas in the new version it is updated every 16 2Mhz cycles, so 125,000 times/second. Also, in the older version, the sound generation is only done at a 6502 instruction boundary, on the new version it can happen even during the middle of an instruction.
  2. Move when the internal state of the SN76489 is updated when writing to IC32 via the system VIA. In the old version, the update happens immediately IC32 is written to, in the new version it happens in the 125Khz poll any time the chip select line is held high, i.e. IC32 & 1
  3. Correction to the SID emulation running at the wrong sample rate.

I doubt the last of those is relevant to this problem so one possible approach here is to see whether the first two changes are dependent or can be separated and, if they can be separated, which is causing the issue.

SteveFosdick commented 10 months ago

Thinking that the point at which the emulated SN76489 latches the data value is likely to be at the core of this issue, I though it was worth writing a smaller test case, or at least one which I understand better so here goes:

   10 REM > SWRITE
   20 S%=256
   30 DIM C% S%
   40 E%=C%+S%
   50 XORB=&FE40
   60 XORA=&FE41
   70 DDRB=&FE42
   80 DDRA=&FE43
   90 FOR N%=0 TO 3 STEP 3
  100   P%=C%
  110   [OPT N%
  120   .silence_all
  130   LDX #&00
  140   JSR silence_one
  150   LDX #&20
  160   JSR silence_one
  170   LDX #&40
  180   .silence_one
  190   TXA
  200   OR A #&80
  210   JSR slow_wbyte
  220   TXA
  230   JSR slow_wbyte
  240   TXA
  250   OR A #&9F
  260   .slow_wbyte
  270   PHP
  280   PHA
  290   SEI
  300   LDA #&FF
  310   STA DDRA
  320   PLA
  330   STA XORA
  340   LDA #&00
  350   STA XORB
  360   NOP
  370   NOP
  380   NOP
  390   NOP
  400   NOP
  410   NOP
  420   NOP
  430   NOP
  440   LDA #&08
  450   STA XORB
  460   PLP
  470   RTS
  480   .fast_wbyte
  490   PHP
  500   PHA
  510   SEI
  520   LDA #&FF
  530   STA DDRA
  540   PLA
  550   STA XORA
  560   LDA #&00
  570   STA XORB
  580   LDA #&08
  590   STA XORB
  600   PLP
  610   RTS
  620   .slow_test
  630   LDA #&80
  640   JSR slow_wbyte
  650   LDA #&20
  660   JSR slow_wbyte
  670   LDA #&90
  680   JSR slow_wbyte
  690   RTS
  700   .fast_test
  710   LDA #&80
  720   JSR fast_wbyte
  730   LDA #&30
  740   JSR fast_wbyte
  750   LDA #&90
  760   JSR fast_wbyte
  770   RTS
  780   .merged_slow
  790   PHP
  800   SEI
  810   LDA #&FF
  820   STA DDRA
  830   LDA #&80
  840   STA XORA
  850   LDA #&00
  860   STA XORB
  870   NOP
  880   NOP
  890   NOP
  900   NOP
  910   NOP
  920   NOP
  930   NOP
  940   NOP
  950   LDA #&10
  960   STA XORA
  970   NOP
  980   NOP
  990   NOP
 1000   NOP
 1010   NOP
 1020   NOP
 1030   NOP
 1040   NOP
 1050   LDA #&90
 1060   STA XORA
 1070   NOP
 1080   NOP
 1090   NOP
 1100   NOP
 1110   NOP
 1120   NOP
 1130   NOP
 1140   NOP
 1150   LDA #&08
 1160   STA XORB
 1170   PLP
 1180   RTS
 1190   .merged_fast
 1200   PHP
 1210   SEI
 1220   LDA #&FF
 1230   STA DDRA
 1240   LDA #&80
 1250   STA XORA
 1260   LDA #&00
 1270   STA XORB
 1280   LDA #&18
 1290   STA XORA
 1300   LDA #&90
 1310   STA XORA
 1320   LDA #&08
 1330   STA XORB
 1340   PLP
 1350   RTS
 1360   ]
 1370   IF P%>=E% PRINT"Code too big":STOP
 1380 NEXT
 1390 CLS
 1400 REPEAT
 1410   PRINT"1. Individual WE strobes, with delay"
 1420   PRINT"2. Individual WE strobes, no delay"
 1430   PRINT"3. Single WE strobe, with delay"
 1440   PRINT"4. Single WE strobe, without delay"
 1450   PRINT"5. Silence sounds"
 1460   PRINT '"? ";
 1470   G%=GET
 1480   IF G%>=&20 AND G%<= &7E PRINT CHR$(G%)
 1490   IF G%=49 CALL slow_test
 1500   IF G%=50 CALL fast_test
 1510   IF G%=51 CALL merged_slow
 1520   IF G%=52 CALL merged_fast
 1530   IF G%=53 CALL silence_all
 1540 UNTIL FALSE

So the idea here is to test whether the SN76489 needs the /WE line returned to inactive and the re-activated for each byte or whether it will accept subsequent bytes with it held active. Also to test the effect of waiting the documented 32 (4Mhz) clock times for it to latch the data or not.

The results on real hardware are that cases 1 (Individual WE strobes, with delay) and 3 (3. Single WE strobe, with delay) produce the expected sound at full volume. Case 2 (Individual WE strobes, no delay) produces a brief blip and case 4 (Single WE strobe, without delay) produces a tone but at reduced volume, i.e. it seems to have lost the update to the attenuator.

On the more recent version of B-Em, cases 1 and 3 work as expected, cases 2 and 4 produce no sound at all.

What I suspect is going on is that where the 32 clock pulses is counted from varies between this version of B-Em and the real hardware. My reading of the datasheet is that the count of 32 starts from when /WE becomes active so if a group of writes are made together with /WE active the count is synchronised to the first one. In B-Em the state of /WE is being checked every 32 (4Mhz) cycles (in the code it is 16 for 2Mhz cycles) in a fixed pattern from when the emulator started and asynchronous to when /WE becomes active.

I'll continue with this when I have more time but, in the mean time, knowing what your player code does, does this seem like something that would affect it?

NegativeCharge commented 10 months ago

Thanks Steve for looking into this. It's very possible the code is affected by this.

You can see Simon's original source here: https://github.com/simondotm/vgm-player-bbc/blob/master/lib/vgcplayer_bass.asm

There's also a compiled .ssd in the root of the master branch that exhibits exactly the same issue with random tones and glitches in the sound when played using a newer version of b-em.

SteveFosdick commented 10 months ago

Looking at the source above, I think the problem is slighly different.

The datasheet specifies that it takes the SN76489 "approximately" 32 (4Mhz) clock cycles to update the internal registers but I am not sure how much leeway approximately gives. This 32 cycles also seems to be timed from /CE which, on the BBC Micro, is permanently true.

In that source code, there is a delay but it looks to be shorter than the full 16 2Mhz cycles which means if B-Em is updating the register every 16 cycles sometimes that will be soon enough and sometimes it won't depending on how out-of-sync that polling is with the writes. As this works on real hardware I wonder if the delay was determined by testing rather than by referring to the data sheet.

So, I tried re-adding the immediate update of the SN76489 on /WE going active (in sysvia.c) and, to my ear, the player now works correctly. I have pushed a branch: https://github.com/stardot/b-em/tree/sf/issue219c with this and also built Windows executables in a pre-release: https://github.com/stardot/b-em/releases/tag/issue219c

Can you let me know, if you agree that it sounds correct with this fix?

NegativeCharge commented 10 months ago

Thanks Steve. That does indeed fix the artefacts I was hearing.

There's some potentially relevant detail on this here:

https://www.stardot.org.uk/forums/viewtopic.php?p=243442#p243442

"The SN76489 datasheet specifies that it requires 32 clock cycles to load data into the control register (at 4MHz), so this agrees with the AUG's 8µs, or 16 clock cycles. We can see that the registers are effectively clocked at 250kHz, which would imply that you'd only technically need 4µs in order to catch an active clock edge"

...and here: https://www.stardot.org.uk/forums/viewtopic.php?p=123238#p123238

"According to my research, the sound chip latches new data somewhere around 8-16 CPU cycles after a high-low transition of the WE pin."