ramapcsx2 / gbs-control

GNU General Public License v3.0

random black screen in passthrough mode #461

Closed nyanpasu64 closed 1 year ago

nyanpasu64 commented 1 year ago

discord, 04/15/2023:

gbs-c in custom preset -> passthrough, playing nintendont; went black for a second and restored. unsure if crt or gbs-c failed. no message in logs (computer was awake but locked, unsure if logs were making it to the browser properly)

(wii, monkey ball 2 at 480p)

also wii, mkwii at 480p

possible causes

PXL_20230513_102519143 (flick the metal bit to make the signal cut out)

nyanpasu64 commented 1 year ago

grrrrr. i swapped the SoG caps and still got a dropout on skyward sword. i suspect the issue this time is not related to sog, despite the symptoms matching, but perhaps caused by custom passthrough (which was not working at all prior to my recent changes)

nyanpasu64 commented 1 year ago

spent a few hours on ttyd, while recording the vga output. and no video cutouts. the bug is playing games, hiding from me

i wonder if the bug appears if you power-cycle the console after loading custom passthrough, not the other way around

or perhaps during passthrough, turn off and on the console quickly before the GBS-C resets and starts scanning for inputs (at which point, turning on the console is not much different from loading custom passthrough anew).


syncWatcherEnabled is still enabled during fixed passthrough, unless turned off in developer tab. does it do anything?


"save filtering per slot" and loading a preset, is triggering reentrant wifi handling during i2c comms (#421).

09:47:32.520 -> user command 3 at settings source 2, custom slot 66, status 21
[applyPresets]
    [loadPresetFromSPIFFS]
        09:47:32.520 -> preferencesv2.txt opened
        09:47:32.520 -> loading from preset slot B: /preset_ntsc_480p.B
[setOutModeHdBypass]
    [externalClockGenResetClock]
        09:47:32.619 -> clock gen reset: 81000000
    [doPostPresetLoadSteps]
        09:47:32.619 -> ADC offset: R:43 G:43 B:42
        [externalClockGenResetClock]
            09:47:32.619 -> clock gen reset: 81000000
    [???]
        ["/gbs/restore-filters"]
            09:47:32.784 -> slot: 66
            09:47:32.784 -> scanlines: on (Line Filter recommended)
            (perform flash operations)
            (muck with uopt)
            TODO who reads uopt?
    [applyPresets]
        [setOutModeHdBypass]
            09:47:33.314 -> pass-through on

I may continue testing with "save filtering per slot" disabled.

nyanpasu64 commented 1 year ago

I got a recording of the output. The screen flashed black for part of a frame, but didn't lose sync persistently.

black screen glitch.zip

(this particular glitch didn't cross a vsync pulse, so it didn't make my monitor go black for half a second.) to me it looks like the video scaler (GBS-C, TV5725) briefly mistook the 31.5khz 480p signal for a 63khz or so signal, went black, started displaying 63khz video, and on the next frame output a 31khz signal again. unfortunately i did not capture hsync, but the video display signal looks quite like 63khz to me

Screenshot_20230524_104029

Screenshot_20230524_104056

nyanpasu64 commented 1 year ago

i have some interesting observations:

b3-layers.zip

do I have a defective Wii where the GPU shits its pants randomly for a single frame, and fails to render 3D graphics, and sometimes outputs bad sync as well?!

another possibility is that the TTYD black frame and the sync loss are separate bugs.

ramapcsx2 commented 1 year ago

Did you just oscope your video signal with your sound card, or what is that .flac file? xD

So, some points: If you have passthrough configured, the entire operation is a pure hardware chain. The input gets ADC'd, along with the syncs, and this is used unmodified for the output (hence passthrough: it passes through all the processing blocks without them doing anything). There are still some configurable elements, especially on the syncs, so please look up what the code configures to get an idea. I think part of the segment 5 sync registers are used.

nyanpasu64 commented 1 year ago

Not directly related to this bug, but I found that when switching to fixed passthrough from a reboot or scaling preset, it takes >1 second longer than normal, because TEST_BUS_2F returns all zeros, and when the program tries to run optimizeSogLevel, it always fails with "outer test failed syncGoodCounter: 0". This is despite TEST_BUS_SEL = 10 and TEST_BUS_SP_SEL = 15, as expected.

If I try again, switching from passthrough to fixed passthrough, it runs rto->thisSourceMaxLevelSOG = rto->currentLevelSOG = 14; and the test bus works as-is, and setAndUpdateSogLevel is never called. (Unless you load a preset right after powering off the Wii; then it fails to optimize SOG again.)

Separately, I have another idea on fixing custom passthrough (enabling passthrough mode on 480p input resolutions, and scaling on 240p inputs). Perhaps when restoring a passthrough preset, instead of reading all registers I'd just read ADC gain (and offset?) alone, then apply passthrough as usual. Though it may trigger the above mishap as well.

Also I tried cycling SOG Level-- from 16 through 0, in both fixed and custom passthrough, and saw no immediate sync glitches on monitor or Audacity (only the screen moving horizontally slightly), the output hsync and vsync waveforms looked perfectly periodic. I have not been able to reproduce a glitch on my LCD monitor; I'm ordering a VGA distribution amp to power both my CRT and LCD (and additionally probe the signal like before), since I'm starting to suspect my CRT fails to sync or interprets a marginal h/v sync as losing sync (while my LCD is more lenient?).

Is the GBS-C's output signal in-spec? I found the GBS-C outputs 2 line duration vsync in 480p passthrough mode (matching DMT 480p), but the vertical position is off by 3 lines. (the funny numbers in setOutModeHdBypass) I don't have the horizontal precision to probe horizontal timings.

GBS::HD_HB_ST::write(0x878); // 1_3B
GBS::HD_HS_SP::write(0x864); // 1_41

I might also look into high-resolution video capture if I buy a https://github.com/happycube/cxadc-linux3 PCIe card.

Also horrifyingly the Wi-Fi server can process "/gbs/restore-filters" during optimizeSogLevel's call stack.

ramapcsx2 commented 1 year ago

Nothing on the output is much in spec, I can tell you that :) SoG level is the detection threshold for the sync portion, applied after the analog input circuit (the SoG caps!). It needs to be tuned in many situations (source consoles), but not all of them. These sync pulses vary wildly between consoles.

ramapcsx2 commented 1 year ago

"are you pulling hsync low before hblank begins?" << please make sure you understand this fully. There are settings called "hb" that are not actually the output hblank period, but something else on the scaling chain, etc.

nyanpasu64 commented 1 year ago

Are you interested in changing the 480p passthrough sync pulse positions to match VGA DMT timings better?

ramapcsx2 commented 1 year ago

Please remind me which segment HD_HB_ST is in, but "Generate horizontal blank to select programmed data." sounds like it works on the input formatter, preparing the image in digital format for processing later. If this is the case, even though it's called "horizontal blank", it doesn't have anything to do with syncs or timings; it's merely a mask on the digital data. (And as such, no, I have no idea why it does anything in bypass mode :) )

nyanpasu64 commented 1 year ago

It's in the HD Bypass section:

Screenshot_20230526_102348

In any case I'm working on changing the sync pulse positions and sizes to match VGA (my auto-modeline program based off tomverbeure's timings database), in the hopes that this will prevent my CRT from losing sync.

> auto-modeline print dmt 640 480 60
"auto-dmt-640-480-60"  25.2  640 656 752 800  480 490 492 525  -HSync -VSync
# (752 - 656) / 800 = 0.12

Did you just oscope your video signal with your sound card, or what is that .flac file? xD

yep lmao. I'm blessed with a 192khz motherboard with practically no detectable aliasing in mdfourier or recording sync signals with harmonics beyond Nyquist, I assume because it has an oversampled frontend. (I didn't try recording video or hsync signals wholly beyond Nyquist, then checking the noise floor.) A real DSO would be better of course, but this is sufficient for testing the positions of vsync pulses and the regularity (but not contents) of hsync pulses.

Vertical timings

VGA DMT 640x480 vsync 480 490 492 525 has two lines of vsync.

Probing the GBS-C's passthrough signal with my audio interface, I found that the vsync pulse is 2 lines long, but 3 lines late relative to 480-line-tall video content (for example Nintendont's menu).

To fix this, I had to change:

setCsVsStart(525 - 5);
setCsVsStop(525 - 3);

Horizontal timings

VGA DMT 640x480 hsync 640 656 752 800 is low for 0.12 scanlines:

(752 - 656) / 800 = 0.12

I calculated that with the GBS-C in 480p passthrough, the horizontal sync line is low for 0.09 scanlines:

GBS::PLLAD_MD::write(2345); // 2326 looks "better" on my LCD but 2345 looks just correct on scope
GBS::HD_HS_ST::write(0x10);  // 1_3F
GBS::HD_HS_SP::write(0x864); // 1_41
// 1 + (0x10 - 0x864) / 2345 = 0.09083155650

It's possible that this is preventing my CRT from picking up sync. But then again, CVT 640x480 has a 0.08 scanline long hsync pulse (but at a lower hsync rate, so each pulse is longer), and works fine (I did not verify it doesn't lose sync every few hours):

> auto-modeline print cvt 640 480 60
"auto-cvt-640-480-60"  23.75  640 656 720 800  480 483 487 500  -HSync +VSync

(720 - 656) / 800 = 0.08

gtf 640 480 60 has the same horizontal timings as CVT, but I did not test that it works on my display. Interestingly both CVT and GTF 640x480@60 have a hsync rate below 30 kHz, but my monitor doesn't complain. I hope I won't burn it out running it at that.

In any case I prefer to leave hsync at 0.12 scanlines long (leaving the low pulse time unchanged, increasing the high pulse-end time), since it centers the screen the same way my computer's video output does.


Do you know why optimizeSogLevel fails when loading a passthrough preset for the first time? Is setOutModeHdBypass -> doPostPresetLoadSteps checking for test bus, before setOutModeHdBypass finishes initializing the registers required for sync to work? And applying bypass twice, or loading a complete set of bypass registers before setOutModeHdBypass, prevents this from happening? (I could test this by calling optimizeSogLevel repeatedly at different points and seeing when it works.)

I still don't know why my display was cutting out. And I don't know if increasing hsync pulse width was the fix or not, and it will take hours to even attempt to rule it out. Perhaps I could decrease hsync pulse width and see if it reproduces the issue quickly (meaning the old width was marginal)?

ramapcsx2 commented 1 year ago

Because of time, as always, I really cannot reply to everything :p I want to convey to you though that you seem to think of the video specifications, modelines and such way too strictly. Vsync / Hsync are a continuous stream on an analog line. Displays and scalers care about some relationships between them, but they have only a general concept of where the video should be, versus blanking. This is to say, you are free to move Vsync around, make it longer, or shorter (not too short), and it will still be "legal" on all devices. The specifications and modelines are there as a guideline, and if you manage to follow them to the letter, your chances of an instantly well-positioned picture increase, but there is no further benefit.

nyanpasu64 commented 1 year ago

I've found 3 different issues:

The GBS-C firmware is in setAndUpdateSogLevel(14) mode. I'm not sure if changing this value would affect it, or whether increasing or decreasing the voltage threshold would change the likelihood of sync errors.

Recordings at hsync rate 4div3.zip. I suggest viewing in Audacity at around 1.2 to 1.3 seconds, in spectrogram view with length 32. Unfortunately I do not have high-resolution captures of input luma.

Screenshot of output 2023-06-01 15:18:11, showing output hsync pulses at 4/3 the input frequency:

Screenshot_20230601_163252

ramapcsx2 commented 1 year ago

Okay, you want to look at every possible option that is active in passthrough mode. That would be all the section 5 sync processor registers, as well as the HD bypass stuff at least. For HD bypass, all I know is that this is digital data processing, but it may have some effect on syncs. The sync processor has all these "SoG" and window parameters, and I was never able to fully map out what everything does, or whether it is active indeed (or when it is active, whether it only applies to some modes and not others). You could try to compare some segment 5 dumps and maybe try a few different combinations of those filter settings. Maybe something works :)

nyanpasu64 commented 1 year ago

i think the problem is that custom passthrough just gets so many registers wrong compared to fixed passthrough, so many registers different that it's a pain to even list them all. honestly i don't want to go through the trouble of fixing it anymore

screenshot, custom passthrough (loses sync) on the right: Screenshot_20230601_233721

const char HEX_DIGITS[] = "0123456789ABCDEF";

// Print one byte as two hex digits.
void print_hex(uint8_t val) {
    SerialM.print(HEX_DIGITS[(val >> 4) & 0xf]);
    SerialM.print(HEX_DIGITS[val & 0xf]);
}

// Hexdump registers [begin, end) of one page (segment), 16 per row,
// with rows labeled like "S5_20: ".
void print_page(uint8_t page, size_t begin = 0, size_t end = 256) {
    // SerialM.printf("Page %d:\n", page);

    for (size_t row = begin; row < end; row += 16) {
        SerialM.printf("S%c_", HEX_DIGITS[page]);
        print_hex(row);
        SerialM.print(": ");

        for (size_t col = 0; col < 16; col += 1) {
            auto addr = row + col;

            uint8_t val;
            GBS::read(page, addr, &val, 1);
            print_hex(val);
            SerialM.print(' ');
        }
        SerialM.println();
    }
    SerialM.println();
}

// Dump the ranges relevant here: segment 0, segment 1 (HD bypass),
// and segment 5 (sync processor).
void print_regs() {
    SerialM.println("       _0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _A _B _C _D _E _F ");
    SerialM.println("------ -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ");

    print_page(0, 0x40, 0x60);
    print_page(1, 0x00, 0x90);
    print_page(5, 0x00, 0x70);
}

There are so many different bits, each of which could plausibly explain losing sync:

ramapcsx2 commented 1 year ago

I don't think it's about getting them wrong, compared to some other "reference". It's more about figuring out whether any of the options have an effect or not, and if one does, picking the best value.

ramapcsx2 commented 1 year ago

IF_HSYNC_RST is a digital video data "reset" signal; iirc it marks where a line ends, so modifying it might make more or less of your image show. It shouldn't do actual sync stuff. They just used similar terms here :p

PLL_IS: you mostly want to leave core system parameters like these alone, especially any dividers / multipliers of clocks. But there could be a PLL voltage register somewhere, that's worth taking a look at.

nyanpasu64 commented 1 year ago

Fixing loading custom passthrough

I've locally modified the GBS-C source code to include a Context const& ctx parameter to doPostPresetLoadSteps and its callees, and modified all functions called during preset loading to consult ctx.isCustomPreset instead of rto->isCustomPreset. This allows setOutModeHdBypass to call doPostPresetLoadSteps and force it to act like it's loading a fixed preset, even if you're applying a passthrough resolution.

With this change in place, saving and loading a passthrough preset seems to produce practically no change to registers. The only registers to change, among the image-related parts of the full register dump (serial d) that I print (https://github.com/ramapcsx2/gbs-control/issues/461#issuecomment-1573231013), are:

For posterity, here is the file containing the bad saved passthrough mode which loses sync (dunno if it still breaks with my load-time code changes): gbs-control.backup-1685073472398.bin.zip

Stale registers in fixed passthrough

Concerningly, I found that fixed (not custom) passthrough leaves many registers unchanged from the previous reboot (probably any call of setResetParameters()) or fixed scaling preset. I don't know if any of these stale registers (which get saved/loaded through custom preset slots) can affect sync stability, or whether they're harmless and the sync bug arises because doPostPresetLoadSteps wasn't called in fixed mode.

(The existing dumpRegisters() or 'd' prints more register ranges than my hexdump did, including OSD and VDS_PROC.)

Questions:

Fixing optimizeSogLevel without applying fixed passthrough twice

As mentioned at https://github.com/ramapcsx2/gbs-control/issues/461#issuecomment-1563895071, optimizeSogLevel fails the first time you apply fixed passthrough after a scaling preset.

nyanpasu64 commented 1 year ago

Bad news: with a custom passthrough preset, with a register dump indistinguishable from "reboot to passthrough", I got another hsync frequency excursion.

I suspect that I got three excursions this time (two in quick succession, followed by a third frequency jump while the PLL was dipping the frequency to fix the signal phase after the first two, which exposes the third harmonic).

Screenshot_20230602_182630

And by the same logic, my screenshot in https://github.com/ramapcsx2/gbs-control/issues/461#issuecomment-1572921651 actually looks like two excursions in quick succession.

output 2023-06-02 20:07:23 trimmed.zip Screenshot_20230602_200953

This file has two hsync loss events on the same frame. This screenshot shows the second one. It seems like in response to input vsync pulses, the TV5725 output some scrambled sync pulses, rather than coasting(?) when it couldn't properly analyze the input.

Is fixed passthrough susceptible? Yes. I don't know if "reboot to passthrough" has the issue, or "reboot, scaling, setResetParameters, passthrough" or "reboot, scaling, passthrough".

nyanpasu64 commented 1 year ago

Looking at the PDF (3.5.2 Loop Filter, 3.5.3 Register Setting Rules), I think a possible issue is that PLLAD_ICP was set too high.

Screenshot_20230603_011110

The current code turns FS on, and the PDF agrees with this. Turning it off causes horizontal image instability.

Screenshot_20230603_121707

But setOutModeHdBypass with 480p input sets ICP higher than the formula suggests (assuming M = PLLAD_MD + 1):

Screenshot_20230603_121759

I've tried editing the source code to change PLLAD_ICP from 5 to 4 (or 3), and I get a stable image. I've run it at value 3 for an hour or so without image dropouts. However, an hour is not long enough to conclusively say that the issue has gone away (especially since I observed it would occur multiple times an hour in some play sessions, and never in others, possibly due to temperature?). And I want to take a break from binging TTYD...

ramapcsx2 commented 1 year ago

Well, yeah. :)

nyanpasu64 commented 1 year ago

Another possibility is that the issue showed up in passthrough testing because I was routing the input luma signal through a Y-splitter (near my computer's >1kohm audio jack) to the GBS-C to look for glitches in the input signal, and the splitter also disturbed the luma sync signals. Then, when I didn't find glitches in the input signal, I routed the VGA output blue wire (fed through an Extron VGA amp) into the audio jack at the same time I decreased PLLAD_ICP and ran further tests, and the routing changes stopped the sync glitches from happening as frequently.

I've left Homebrew Channel idling for 25 minutes or so, while capturing input luma (replicating the setup I had when I saw the most sync glitches), and haven't noticed a sync glitch so far. Fingers crossed.

EDIT: bad news, not fixed. Interestingly this time it misdetected part of the input signal as a vsync pulse, at the same time it spiked the hsync frequency. Do you have any ideas which registers handle this detection? (I'm again suspecting that the scaler is erroneously detecting extraneous negative sync pulses, or missing them altogether. I don't know if this happens by itself, or when exposed to I2C writes from the ESP8266.)

Is there a test register on sync separator, clock recovery, or input formatter state I can busy-poll in loop(), print, and look for the source of the problem?

(In any case I'm looking into building or buying a dumb transcoder.)

ramapcsx2 commented 1 year ago

It's not a good idea to introduce things like the splitter, when you want to debug such problems. They're already quite random and rare, so you'd want to avoid any more of these analog problem sources :p

PLLAD_ICP: PLL for the Analog to Digital section, current Charge Pump level. It controls the tightness of the PLL response, and has to be tuned for each situation it has to serve. I've tried tuning it based on what seemed to work best: if you set it too low, some artifacts start to appear. If it's too high, it loses all control. I've set it a few notches above the artifact level. Note: this has to be done for each source / operating mode. I think the source code tells you as much.

nyanpasu64 commented 1 year ago

I just got an incident of frequency spike and sync loss, at setAndUpdateSogLevel(14), PLLAD_ICP = 3, and no splitter in the signal path from Wii to GBS-C. Hopefully my hi-res capture card will show up over the next few days.

I looked up separ and seper in the programming guide and didn't find much useful. Can sync glitches be identified using interrupts (chapter 14)? Currently I don't see anything like dots or asterisks appear in the serial console, when the hsync frequency spikes or vsync is injected/missed. IDK if you're already listening for all relevant interrupts (meaning the chip doesn't generate an interrupt when it glitches out) or not (in which case I should check for what interrupt the chip self-reports).

Looking in the register definition, I find multiple SoG/csync width related registers by finding separ. I wonder if these registers are configured wrong, the TV5725 is seeing a spurious hsync pulse that shouldn't even be registered in the first place, my chip is damaged somehow, or the Wii itself is generating a bad signal.

In the meantime I may be researching playing games on Dolphin on LCD/VGA, or building/buying an alternative YPbPr-to-VGA transcoder. I hear that jamcoders (from wakabavideo) using the LMH1251MT-NOPB chip will rarely "drop sync on dark to light flashes", which is the sync bug everyone else is getting, unlike my random PLL screwup seemingly unaffected by image content or SoG capacitors.


EDIT: One possibility I considered was power delivery failure. I had difficulty probing the pads with a multimeter, and would get inconsistent voltage readings, but eventually I got a good reading of the GBS-C powered on with my usual USB charger, directly at the 5V barrel jack, and saw >4.9 volts. I'm guessing the issue I have isn't caused by a too-weak charger (though it could very well be caused by switching-mode noise or impulsive spikes making it to the sync processor).

ramapcsx2 commented 1 year ago

It can be anything or nothing you wonder about. That is the nature of this analog stuff. You just have to best-guess what could be the cause and try to fix it methodically: One fix attempt / change at a time, then watch whether anything improved.

nyanpasu64 commented 1 year ago

I did find that if I increase the SP_H_PULSE_IGNOR register, I get frequent hsync frequency errors (steps), as well as the frequency jumping and sliding back to the original frequency (randomly or upon vsync). I'm guessing this is another way to make the chip see missing or duplicate hsync pulses, and that the register isn't the direct cause of the bug. I haven't fully investigated the other registers yet; for example I want to look into SP_L_DLT_REG, SP_DLT_REG, and really the entirety of the sync register block.


I want to probe the GBS8200's power line with my PC audio jack, to look for noise. But I don't know how to connect it to the power line (either input 5V, or the reduced 3.3V or 1.8V lines), with a RC highpass filter to avoid exposing my audio jack to the full DC voltage. (I think my motherboard has its own series caps, but who knows.) I'd need some sort of tap PCB (like protoboard) connected somewhere on the GBS-C with wires or detachable connectors, going into AC coupling components, and hooked up to a RCA jack or cable to my computer. (The single RCA jacks I purchased from Amazon won't fit in protoboard holes, but I managed to crudely chop the pins lengthwise so they barely fit.)

Though now I'm dealing with a boot drive failure of my main desktop... on the day my PCIe capture card is set to arrive. So I have more pressing matters to deal with than debugging the GBS-C.

ramapcsx2 commented 1 year ago

Heh, oki, that is the right approach now :) Forget about the power thing. Power is going to be suitable, not likely to be an issue. The sync registers are, though. They are important filters and detector settings, sync slicer and comparator stuff, that has to fit the source signal. As these are barely documented (usually just a name and a text blurb without context), they are difficult to configure..

nyanpasu64 commented 1 year ago

now I'm dealing with a boot drive failure of my main desktop...

fingers crossed it was a sleep-wake issue. I tried doing a BIOS update... which erased my boot order and overwrote my systemd-boot dual-boot menu with Windows Boot Manager. I had to fetch an Arch live USB to run efibootmgr to fix it... remember when they were called live CDs? does anyone use CDs anymore?

hoooooooooooooo boy

Screenshot_20230607_023754

The top channel is a 28 MHz capture (using a video capture card and CXADC drivers) of the Wii's output luma+sync signal fed to the GBS-C. The bottom 2 channels are a 192 kHz capture (using motherboard audio) of the GBS-C's output VGA blue line, as well as pseudo-csync.

My 28MHz capture revealed that before output hsync is lost, the input component signal gets completely scrambled by a burst of high-amplitude noise in the 8 MHz range (possibly aliased from a higher frequency). This was completely missed in my 192khz audio capture.

Screenshot_20230607_032944

Waveform slowed down by 1000x to 28 kHz: output 2023-06-07 02:33:50 trimmed.cxadc.zip

Any clue what's generating this loss of sync? My Wii? The GBS-C itself? The GBS-C's power supply?

I currently have the GBS-C's VGA output hooked up to a VGA switch, then through an Extron distribution amp (active 2-way splitter) to my CRT monitor and my motherboard's line in.

Perhaps as a temporary workaround I could try putting a ferrite bead around my video cables (all or luma?), and hopefully it won't smudge the video signal too much. I don't think it will address the root cause of the problem, unless it's caused by outside EMI interference.

ramapcsx2 commented 1 year ago

That 8MHz thing is surely a problem. I've no idea whether it actually comes from the Wii (maybe it can sometimes glitch?). Of course the PLL will lose lock on that, too :)

nyanpasu64 commented 1 year ago

I'm starting to suspect a power/analog issue on the GBS-C itself, backfeeding voltage interference into the luma line (and possibly others), and incidentally messing up input sync detection and the output signal as well.

When unplugging and plugging the power jack from the GBS-C, I notice little impulses of voltage making their way onto the Wii -> GBS-C luma signal. The first time I tried this, I also saw repeated 5.5 MHz oscillation blips, eerily reminiscent of the >8 MHz oscillations I see when the GBS-C loses sync during steady-state passthrough. Afterwards I tried two more times, but could not detect any interference of the same nature (it's possible it happened briefly and I didn't catch it while scanning the spectrogram, I'm not sure). I believe (but am not 100% certain) these come from the GBS-C during the power-on process.

(Note that all cxadc Audacity screenshots are slowed down 1000x, so I can zoom into signals further. So ms are us, and KHz frequencies are actually MHz.)

GBS-C power-on interference

Later, while probing the luma signal, I also got two sync drops and >8MHz oscillations on the luma line. I also believe (but am not 100% certain) these come from the GBS-C, randomly but possibly more frequently when there's more capacitance/extra mid-cable taps on the input luma line. I took a screenshot of the longer-lasting interference with more spectrum to capture:

Screenshot_20230611_004700

I decided to compare their spectra directly, in a section of the bad video signal with comparatively little high-frequency signals from the Wii to mask interference. The startup 5.5MHz interference is on top, and the steady-state >8MHz interference is below.

Screenshot_20230611_013412

Interestingly the startup interference seemed to be lower-amplitude and the blips were relatively evenly spaced, whereas the steady-state interference's amplitude increased over time and also became less frequent.


Separately, I also noticed that plugging GameCube controllers into empty Wii ports resulted in individual blips on the luma line coming from the Wii. These break vsync for a single frame if they occur on a dark area of the image (and the negative part of the spike falls below 0 volts), but aren't recognized as sync pulses if they occur on light areas. They never occur in the large clusters associated with full hsync loss and the monitor going blank.


I still don't know where the clustered interference is coming from. Should I consider tapping the PCB's 3.3V analog power rails with my capture card, and watch for funny business (caused by bad power delivery or excessive/fluctuating draw)? (Though I'm concerned that when plugging/unplugging the GBS-C, the power rails will change quickly, and the voltage changes will pass through my capture card's AC-coupling cap and fry the voltage-sensitive ADC inside.)

The board is powered by 5V, but appears to be stepped down before powering the scaler. The TV5725 also has 3.3V and 1.8V digital power rails, which shouldn't affect incoming luma signals (but they shouldn't fluctuate either, and who knows how the chip malfunctions when fed the wrong/fluctuating voltages).

The various PDFs have a list of pins, with AP and AG for analog power and ground. I'll have to check if all the 3.3V AP pins (54, 60, 68 for RGB, 51 for ???, 48 for PLLAD sampling clock, 121 for PLL648 output/memory clocks) are shorted on the PCB or not.

nyanpasu64 commented 1 year ago

The analog power pins are shorted, either in the chip or on the PCB.

During GBS-C startup the 3.3V rail is mostly stable. In my testing so far, I did not encounter any video glitches, so was unable to measure the voltage rail during a glitch. After I capture 1-2 glitches, I may move my voltage probes to my next testing location.


How is the luma signal routed? So far my understanding of the luma circuit is:

Oddly this doesn't match the TV5725's docs, which do not call for resistors between the input pins and point A.

Screenshot_20230611_213635

How should I test this system?

It may be worth also measuring chroma (either at the input, or across/past the 200 ohm resistor) to see if it's experiencing the same oscillations when sync is lost.

nyanpasu64 commented 1 year ago

can't measure across 200 ohms without a common-mode blocking choke/transformer. maybe someday i'll learn how transmission lines work. i'm tired.

can't measure which way the interference is propagating down the luma line (to test whether it's the Wii or tv5725 at fault) without a second high-bandwidth capture channel. don't know whether to buy a second pcie card plugged into an nvme slot, or a real oscilloscope with time-synced capture channels. or maybe just stop trying.

for now, i'll try buying a cooling fan for the tv5725. fingers crossed. (in hindsight i should not have installed the clockgen on the heatsink.)

ramapcsx2 commented 1 year ago

Active clamping is in operation, so you can't measure anything regarding impedance there. I would recommend you check out a second GBS board, second Wii, other pair of cables, etc :)

nyanpasu64 commented 1 year ago

video dropout occurred with active fan cooling

only thing left for me to do is to figure out if the interference is coming from the wii or the scaler, whether by subtracting measurements across the 200ohm resistor, or by measuring signal propagation down the cable
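a sketch of what that subtraction would look like, assuming two time-aligned single-ended captures (the function name, sample values, and sign convention here are mine, just for illustration):

```python
import numpy as np

R_SENSE = 200.0  # ohms; the series resistor between Wii luma out and GBS input

def resistor_current(v_wii, v_gbs, r=R_SENSE):
    """Estimate current through the series resistor from two time-aligned
    single-ended captures (in volts), one taken on each side.

    Positive values mean energy flowing from the Wii toward the GBS; a strong
    negative component during a noise burst would point at the GBS side.
    Assumes both probes share the same ground reference and sample clock.
    """
    v_wii = np.asarray(v_wii, dtype=float)
    v_gbs = np.asarray(v_gbs, dtype=float)
    return (v_wii - v_gbs) / r  # amps

# toy example: 0.7 V on the Wii side, 0.5 V on the GBS side -> 1 mA toward GBS
i = resistor_current([0.7], [0.5])
```

in practice the hard part is the assumption baked into that comment: getting both probes time-aligned and ground-referenced well enough that the difference isn't dominated by probe/ground error.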

i could try dvd players in 480p component, but it's hard for me to sit through a movie

my main PC has started waking by itself in Windows. i suspected electromagnetic interference, but found out my computer woke twice without my wii losing signal once.

it could also be malfunctioning room light dimmer switches. i don't know. i'm putting ferrite beads onto luma when my order arrives.

ramapcsx2 commented 1 year ago

You're losing focus of the easy-to-test things, and I really recommend you get another GBS :p

nyanpasu64 commented 1 year ago

Huh... if the second GBS also turns out to be bad, I don't want to invest even more money into defective equipment. If the first GBS was bad, then investing in equipment with a high failure rate (judging from the issue tracker) doesn't inspire confidence. And if the Wii is bad, I don't want to invest more money into scaling a defective console (I might run Dolphin instead of buying another Wii?), unless I can find a way to block the interference. At least oscilloscopes will be useful in the future, I hope?

is it possible I damaged my first gbs in installation? i don't recall cooking any chips by soldering... but idk anymore


suspecting but not certain that i cooked my gbs-c by putting the clockgen on the heatsink and the board in a barely-vented plastic case, which runs it hot.

with the unvented case, I get temperatures up to 60 degrees C when shoving a thermocouple between the fins of the metal heatsink (with ambient 26.5 or so, possibly a bit below). the actual chip silicon may be hotter than 60 degrees, but also my thermocouple measures 101-102 degrees for boiling water, so may be overestimating temperatures slightly.

it's possible that if I got another gbs-c and actively cooled it from the get-go, it would not have these issues. i still feel somewhat uneasy about buying hardware which I've had these experiences with. If I know that other people can run in bypass mode without problems, I may try building another one (and unrelated, try to do the soldering better this time). the GBS-C is more full-featured than a jamcoder, but possibly runs hotter?

(i did not test if the same problems occur in scaling mode, and i'm not keen on spending hours on a Wii game at the moment. but if i do get problems on scaling mode, then the existence of people who don't get sync drops in scaling mode after replacing caps means that their units are fine, and hopefully my replacement will work fine too.)

nyanpasu64 commented 1 year ago

I really recommend you get another GBS :p

ok ok i'll bite. I've been trying to get Dolphin running, but noticed that no matter what OS and graphical settings I pick, it seems to have more latency than my Wii. So I might need a scaler/transcoder anyway, for testing and/or playing games. The current GBS works for brief testing, though I no longer want to use it for actual gaming. TBH I'll probably delay ordering a new GBS8200 until my scope arrives and I confirm the TV5725 is producing the video interference.

The previous GBS8200 I bought was from https://www.ebay.com/itm/284784624645. Is it possible that a different seller (e.g. https://www.ebay.com/itm/111867300324) will have more or fewer issues? (Oddly, both listings ship from "Monroe Township, New Jersey, United States".) Or did I fry the board myself by running it in a hot enclosure, making it unrelated to where I order the board from?

It would be unfortunate if the new GBS8200 has the same issue, and I have no choice but to get a wakabavideo jamcoder.

nyanpasu64 commented 1 year ago

debugging with a scope

(edited down from Discord chat logs)

(photo of scope on my desk) a bit late to ask, but should i have gotten a siglent sds1202x-e instead of rigol ds1054z? i hear it's less channels but better bandwidth and software

most important thing is if it can record data for longer periods. if it is, and the bw doesnt bother you, then meh. 4 channels has its uses

https://cdn.discordapp.com/attachments/1083378131332763750/1120990591245889587/DS1Z_QuickPrint1.png these normal oscillations have a period of 10 ns, so 100mhz. idk about the sync loss oscillations. this noise happens continuously, but only rarely does it reach a high enough amplitude to drown out the input signal

https://cdn.discordapp.com/attachments/1118186176801685584/1120999271865516112/DS1Z_QuickPrint4.png

caught a likely sync loss event (though i had my crt off so i don't know) https://cdn.discordapp.com/attachments/1118186176801685584/1121013648350003270/DS1Z_QuickPrint22.png seems to be a 100mhz noise burst followed by the lower-frequency ringing i caught on cxadc, which repeats.

(photo of luma line and gbs-c sync input, separated by 200ohm resistor) it's odd to me that the amplitude is just about identical on the two sides of the 200 ohm resistor, given that the other side has 75 ohms to ground plus a 75ohm cable to the Wii. is the noise not coming from the video/sync pin? then why does it not occur when i unplug the gbs-c's power jack? does the gbs-c have something that isn't past the 200 ohm resistor?

https://cdn.discordapp.com/attachments/1118186176801685584/1121015953883414528/DS1Z_QuickPrint24.png (photo of luma and chroma lines) luma and chroma noise appears correlated (the chroma is missing because this is vertical back porch after vsync)

I noticed something weird: i always get noise bursts on output vsync begin and end, but they're small and insubstantial. but when i plug my esp in, i also get random noise bursts all over the place. unplug the esp and they disappear

(sees micro-usb cable between ESP and computer, unplugs from ESP, and the oscilloscope screen suddenly freezes) AAAAAAAA don't tell me it's the extra long micro-usb cable i'm using to connect to the computer

electromagnetic interference strikes again

i also get interference with the cable plugged into the ESP but not the computer. so the cable is acting as an antenna, not a ground loop.

Thoughts

I'm guessing the long USB cable was acting as an antenna and picking up ambient noise.

Why did I leave a USB cable connected to the ESP in the first place? In March I was hunting the flash corruption and reboot bug, and needed a cable connected to the GBS-C to get crash logs. Since a computer probably does not supply enough USB current to run the GBS-C, I cut the 5V line on the ESP8266's PCB, then found an extra-long micro-USB cable to run from the ESP to my computer several feet away (and coiled up part of the middle to reduce the extra length).

As a result, this cable is only connected to the GBS-C by ground (shield) and data lines, not the 5V line.


oddly, (IIRC with the USB cable disconnected) the scope still picked up two isolated voltage spikes from rolling my chair across the floor, and a longer ~90MHz voltage oscillation when I wasn't looking at the screen and don't remember what I was doing, followed by 2 voltage spikes from petting my cat. IDK what caused them, and I can only hope they won't make my display cut out in-game.

Perhaps the GBS-C would benefit from a metal shielded case and ferrite beads on video/power cables?

ESP power cable interference

PXL_20230621_104512561

Seems that even with the long USB cable unplugged, I still get interference but of a lower amplitude.

Unplugging the ESP entirely makes this go away (aside from vsync pulses). Holding the ESP's power cable with my hands increases noise. Decreasing the area enclosed by the power cable pigtail (making it lie flat instead of standing up) decreases noise substantially. I suspect that even with the long USB cable removed, the power cable is acting as a ground loop?

If there's one primary source of interference, perhaps I could go transmitter-hunting using the orientation of the ESP's power loop to locate the direction to the interference source?

transmitter hunting

After unplugging most electronics, turning off my lights, unplugging my power strip with the flickering neon bulb (so I can only power my Wii and scope, not CRT), unplugging my electric toothbrush's induction charger, the noise persisted when my USB cable was plugged in.

I decided to plug a shorter USB cable coiled into a loop into the ESP, and aim it in various directions to find the source of interference. Sadly I did not find any direction with a decrease in noise.

I set the scope to probe the Cb channel and trigger off Priiloader's colored text on an otherwise B&W background. When I zoomed out to see several frames at a time, I noticed that the interference appeared stable but drifting leftwards (shorter period than Wii) at around 2.5-3 seconds per cycle, meaning the interference is periodic at around 59.94 + 0.4 Hz (unless one of my assumptions was inaccurate for some reason).
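sanity-checking that arithmetic (the function name and the exact drift numbers are just my estimates from eyeballing the scope):

```python
# Estimate the interference frequency from the on-screen drift rate.
# Assumption: the Wii outputs frames at 59.94 Hz, and the pattern drifts
# leftward (shorter period than the Wii) by one full cycle every 2.5-3 s.

FRAME_RATE = 59.94  # Hz, NTSC frame rate the scope is effectively locked to

def interference_freq(drift_seconds_per_cycle, frame_rate=FRAME_RATE):
    """One leftward drift cycle per `drift_seconds_per_cycle` seconds means
    the interference beats against the frame rate at 1/drift Hz, on the
    high side (leftward drift = shorter period = higher frequency)."""
    return frame_rate + 1.0 / drift_seconds_per_cycle

lo = interference_freq(3.0)  # drift of 3 s/cycle
hi = interference_freq(2.5)  # drift of 2.5 s/cycle, i.e. 59.94 + 0.4 Hz
```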

I noticed the periodicity only appears when leaving the GBS on for 30 seconds or so, and the interference is much larger in amplitude/intensity while the GBS-C is powering on (but practically disappears if I unplug the USB cable). This would indicate that the GBS-C itself is generating interference (in addition to EMI from house wiring and such), but it doesn't make its way back into the output video signal unless the ESP's USB cable is serving as an antenna.

If I switch back to the longer USB cable (a more sensitive antenna) and power-cycle the ESP but not the GBS8200's barrel jack, the louder interference reappears, either from the ESP booting or it reconfiguring the TV5725. I noticed it seems slightly louder if I position the cable coils over the TV5725's heatsink and clockgen, than the ESP8266. The interference continues with clockgen disabled. I did not try disabling frame sync, since I'm too tired to power off my Wii yet again so the GBS's Wi-Fi network isn't drowned out by passthrough interference and I can actually change settings.

When I tried probing the voltage between the end of the USB cable and the GBS-C's ground terminal (both connected and nominally held to ground), I found large voltage fluctuations, which coincided in timing and frequency with voltage noise in the output signal. The exact group delay and phase were different enough I couldn't easily compare if the signals were in phase or not.

I could not easily probe the two ends of the power cable between the GBS and ESP, to identify if there were voltage oscillations in the "ground" between these two boards.

Conclusion

My best guess as to the source of interference is that USB cables (especially very long ones) connected to the ESP's micro-USB port pick up EMI impulses from the GBS-C and surrounding environmental sources and send them into the GBS/ESP's ground plane, resulting in 100MHz-ish resonances. (I don't know why the USB cable only produces large ripples when connected to the ESP's ground rather than the GBS-C's. Does the resonance rely on power sloshing between the USB cable and ESP over the power-delivery wire?) Sometimes the impulses are strong enough that the resonances overpower the video/sync signal, causing the GBS's ground plane and power regulation to oscillate at around 8 MHz afterwards (signal clamping doesn't explain why the ringing was equally strong on both sides of the 200 ohm resistor).

And some parting thoughts:

fuck

Unfortunately the problem is not solved. Even after I successfully eliminated the majority of the low-level high-frequency noise, I still got an incident with extremely loud signals at 8-ish MHz overlaid onto the input video signal. I have not identified whether this comes from the GBS or Wii.

I managed to probe this on the scope, using a Pi Pico setup to recognize when the output hsync signal is running at the wrong frequency, and generate a trigger signal for the scope.
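I didn't post the Pico code, but roughly, the decision logic boils down to something like this (the nominal period is derived from the 480p line rate; the tolerance number here is illustrative, not necessarily what I actually used, and the real hardware capture of hsync pulse periods on the Pico is not shown):

```python
# 480p has a nominal line rate of ~31.469 kHz, i.e. an hsync period of
# ~31.78 us. When a measured period falls far enough outside that, the
# Pico raises a GPIO that the scope uses as its trigger input.

NOMINAL_PERIOD_US = 1e6 / 31_469  # ~31.78 us per 480p scanline
TOLERANCE = 0.05                  # accept +/-5% before firing the trigger

def hsync_out_of_range(period_us,
                       nominal=NOMINAL_PERIOD_US,
                       tol=TOLERANCE):
    """Return True when a measured hsync period deviates enough from the
    nominal 480p line period that the scope trigger should fire."""
    return abs(period_us - nominal) > nominal * tol

# a healthy scanline, and a line measured during a sync-loss incident
ok = hsync_out_of_range(31.8)   # within tolerance, no trigger
bad = hsync_out_of_range(45.0)  # way off, fire the trigger
```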

Unfortunately I cannot save images onto the flash drive since the 1054Z does not recognize either of two flash drives, both a 32GB one that used to work and a smaller 4GB one (people say smaller flash drives are better), both clean in fsck. If I tried to reboot the scope I'd lose the saved traces of this incident.

nyanpasu64 commented 1 year ago

Unfortunately I cannot save images onto the flash drive since the 1054Z does not recognize either of two flash drives, both a 32GB one that used to work and a smaller 4GB one (people say smaller flash drives are better), both clean in fsck. If I tried to reboot the scope I'd lose the saved traces of this incident.

IIRC I had taken that recording (wii sync drop.sr, not uploaded) when I fed my Wii's signal through a luma extension cord, with the intent of recording both sides of the luma cable, but I had not done so. That explains the missing CH2 in Waveform.csv (the source file), which would (at the time) have recorded the Wii end of the luma signal, but I'm not confident. I had probably coiled the luma cable extension so it took up less space, and the extension could end (at the GBS-C alongside chroma cables) near where it started (the luma cable which can't be separated more than a few inches from the chroma cables).

sync loss recording

since then i've taken a 250 Msps (4ns interval) capture of the wii's sync drops, and imported it into PulseView: wii sync drop 2.zip. the channels are:

  1. wii luma out
  2. gbs luma in
  3. gbs luma ground - wii luma ground
  4. trigger

based on this trace, i'm guessing about the cause of the sync loss... the wii does not take kindly to electromagnetic interference being backfed into the video outputs, and will sometimes emit 8mhz oscillations. How do I know the oscillations come from the Wii?

On the other hand, there's initial higher-frequency (and probably quite aliased) noise before the 8mhz oscillations. 250 Msps is just not enough to capture it, and I haven't yet run a capture at 500 Msps with just 2 channels.
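if you wanted to confirm the ~8mhz figure numerically from a capture like this, a numpy sketch along these lines would do it (the function name and sample count are mine; it assumes a single-channel trace already loaded as a float array):

```python
import numpy as np

FS = 250e6  # 250 Msps, the sample rate of the "wii sync drop 2" capture

def dominant_freq(samples, fs=FS):
    """Return the strongest non-DC frequency component of a captured burst,
    via an FFT of the Hann-windowed trace."""
    samples = np.asarray(samples, dtype=float)
    windowed = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(windowed))
    spectrum[0] = 0.0  # ignore the DC offset
    return np.fft.rfftfreq(len(samples), d=1.0 / fs)[np.argmax(spectrum)]

# synthetic check: an 8 MHz tone sampled at 250 Msps resolves to ~8 MHz
# (bin spacing at 4096 samples is ~61 kHz)
t = np.arange(4096) / FS
f = dominant_freq(np.sin(2 * np.pi * 8e6 * t))
```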

My theory is that the electromagnetic interference arises because my luma extension cable was coiled into 4 loops using over-under coiling, which minimizes cable strain but unfortunately acts as a loop antenna. And this cable was picking up noise (from an unknown source) and sending it as a common-mode signal between the Wii and GBS-C.

Screenshot_20230622_173810

Future

nyanpasu64 commented 1 year ago

the issue is not caused by the gbs. The wii still outputs noise bursts when connected to an unpowered gbs, or even to a dumb 76 ohm triple RCA terminator I made with the shields connected together (I didn't have 75 ohm resistors). I will continue to investigate where the noise is coming from: whether the problem occurs with my custom audio circuitry unplugged, or without the USB hard drive and GC controller, etc.

ramapcsx2 commented 1 year ago

Man.... this'll be one epic tale once you're through xD

nyanpasu64 commented 1 year ago

i've traced down the video noise to my wii itself

i get the noise from the wii's video output alone, with no controllers or hard drives plugged in, and just a SD card and a video output (no audio connection) leading to a triple 75ohm terminator and the scope probe. all but one scope probe was removed (so no extraneous ground paths, though the scope is grounded and my computer/monitors have many ground loops). guess it's time to move on to different hardware for playing games (as opposed to messing with Wii channels casually).

I was surprised the Wii was malfunctioning, because I've been using this (and another) Wii for years on various other LCD displays (and even my CRT TV for some time), without noticing any video dropouts, and nobody I've talked to has reported seeing Wiis with oscillation or sync loss on the video output. Perhaps my hardware is aging (or has overheated from RiiConnect24 standby) and started to degrade, or perhaps other displays' sync separators are more resilient to absurd voltage spike oscillations and sync loss on the SoG/luma signal.

Wii alternatives

unfortunately for playing GC/Wii games on PC, dolphin has inconsistent extra input latency with vsync on, because it schedules emulation timings based on wall-clock time (periodically sleeping the emulation thread to loosely match Wii timings) rather than monitor vsync. By probing in Tracy Profiler, I found that with vsync enabled, the emulator will sometimes finish rendering a frame too early relative to GPU scanout, wait up to 12+ milliseconds for room on the swap chain (IDK how many frames long) to display it, then speedrun emulating the next frame (again too early relative to GPU scanout) because it fell behind on wall-clock time while waiting for room on the swapchain. The Dolphin devs recommend VRR to minimize latency without tearing, but the real Wii doesn't need it for sub-Dolphin latency, and my CRT monitor can't do VRR without the image shifting/squeezing vertically.

I've also been suggested to buy a Switch lmfao

ramapcsx2 commented 1 year ago

I'd still try more Wiis and see whether they all produce this. Also it might matter whether homebrew is used, which might mess with the fully customizable DAC chip it uses for video.

nyanpasu64 commented 1 year ago

For the longest time this room has had a 3-way dimmer switch for its ceiling lights, which kept giving me trouble. When I turned the slider up, there was a point where it would suddenly dim the lights and take a good chunk out of the LED bulbs' light output waveform (relative to 120hz rectified AC power). Additionally, the ceiling lights would randomly flicker when set too bright.

2 days ago I finally got it replaced with a non-dimmer switch, which seems to have solved the light flickering. Inspecting the dimmer switch, I discovered it was an "incandescent dimmer" now hooked up to 120V LED bulbs, with burnt flux residue on the solder joints (whether from soldering or from overheating in use), blackened plastic around a giant metal coil (indicating overheating?), and smaller capacitors under the PCB. In hindsight, hooking an incandescent dimmer up to LED bulbs with their own rectification and whatnot, and using it while it was this burnt, cannot possibly have been a good idea.

PXL_20230626_004050038

I also found a neon bulb which lights the switch when it's turned off (and possibly on? IDK.) Of course I generally run my CRT with the room lights off to avoid glare... The bulb electrodes look somewhat corroded, and it's possible either the giant coil, the neon bulb, or both were sporadically radiating EMI.

PXL_20230626_004950696

In the testing I've done since removing the bad wall switch, I have not seen one incident of noise and sync loss. Though yesterday I accidentally ran testing with my scope set to AC coupling on the "sync loss" signal, meaning I was unable to confirm my sync loss detector was actually outputting low and not high all the time (by force-triggering to view the current voltage level). And when I shut down my scope for the night, I dislodged the VGA connector, meaning it's possible the sync loss detector signal was high the whole time and I never noticed.

As a result I've been doing more testing today. Again I have not yet seen a sync loss in gameplay or using the scope to watch for the sync loss signal, although I am still running the scope for a few more hours to capture sync loss events if they recur. I'd cautiously call this issue fixed?

Though I do get "normal" sync loss when switching from Priiloader -> Homebrew Channel to System Menu for the first time, or when booting Super Mario Sunburn from Nintendont (even with Force 480p on, the "enable progressive scan?" dialog does not output a valid 480p signal, unlike some other games I've tested).

Conclusion

I currently believe that a defective room lighting dimmer switch was emitting sporadic bursts of loud electromagnetic interference (most likely EM waves), either from the coil, the neon bulb, or wall/ceiling wiring. The lighting and Wii power are on separate circuit breakers, but I've heard it's possible the wires are routed in parallel and interference gets coupled between the two circuits. This electromagnetic radiation then interfered with the Wii and/or long cables in the room, causing the Wii's more sensitive circuitry to emit 8MHz bursts of interference out the luma cable, causing the GBS-C to lose sync. It may also have led to my desktop computer spontaneously waking up. Presumably the DVD player's component output was less sensitive to interference and didn't produce the same signal errors.

This would explain why the Wii never had problems prior to moving into this room. It does not explain why it worked fine on my CRT TV, or on the GBS-C before I switched to bypass mode (possibly those devices are less sensitive to sync loss, or make it less obvious; or perhaps my light switch was deteriorating and only recently started emitting radiation despite already flickering?). I still do not know why replugging the GBS-C caused the luma cable to have lower-pitched bursts of interference, and whether they were coming from the Wii or GBS-C.

EDIT: I've published a writeup at https://nyanpasu64.gitlab.io/blog/wii-gbs-c-sync-loss/.