hoglet67 / RGBtoHDMI

Bare-metal Raspberry Pi project that provides pixel-perfect sampling of Retro Computer RGB/YUV video and conversion to HDMI
GNU General Public License v3.0

Can simple (non-CPLD) handle odd frequencies #277

Open e8micke opened 2 years ago

e8micke commented 2 years ago

Hi,

The Anritsu spectrum analyzer (MS8604A), and some other equipment, has a "Separate video output" or "Video Output Separate". This was supposed to be connected to a video (signal) printer (UA455A from Nippon Aleph Corp.), but the outputs are five digital signals:

  1. ~21.04908MHz video clock
  2. ~56.4Hz VSync (normally high, with low pulses illustrated as red in the attached image)
  3. ~24.85kHz HSync (normally high, with low pulses illustrated as blue in the attached image)
  4. Video data (valid on falling video clock)
  5. A ~4ms low pulse when you push "print screen"

logic_analyzer_to_pic

The image was generated with a Saleae logic analyzer and a Python/PIL script. Do you think it would be possible to convert that signal to HDMI, with or without the CPLD? I think I have understood that an external XOR is needed for a non-CPLD solution.
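As a rough cross-check of the signal list above, the approximate frequencies can be turned into clocks per line and lines per frame. This is only a sketch: the three input values are the rounded figures quoted in the post, and the variable names are made up for illustration.

```python
# Approximate figures quoted in the post (all marked "~" there).
PIXEL_CLOCK_HZ = 21.04908e6
HSYNC_HZ = 24.85e3
VSYNC_HZ = 56.4

# Video clocks between two HSync pulses (active pixels plus blanking).
clocks_per_line = PIXEL_CLOCK_HZ / HSYNC_HZ

# HSync pulses between two VSync pulses (visible lines plus blanking).
lines_per_frame = HSYNC_HZ / VSYNC_HZ

print(f"clocks per line ~ {clocks_per_line:.1f}")
print(f"lines per frame ~ {lines_per_frame:.1f}")
```

Because the inputs are rounded, the results are only estimates of the total line length and frame height, which is why the profile geometry still has to be tuned by hand.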

IanSB commented 2 years ago

@e8micke

Do you think it would be possible to convert that signal to HDMI, with or without the CPLD?

It should work with the CPLD but it may be problematic with the simple c0pperdragon style connection.

Simple mode is limited to a pixel clock of about 16MHz. To go any higher you have to de-serialise the incoming video and pass pixels in parallel, as well as divide the clock by 2 and XOR the syncs.

It would be a lot simpler to use the CPLD version, but if you are interested there is a discussion of this type of solution as used for the Atari ST mono mode.
Starts here: https://github.com/c0pperdragon/Amiga-Digital-Video/issues/6#issuecomment-918614705
Continues here: https://github.com/c0pperdragon/Amiga-Digital-Video/issues/56

For the CPLD interface you would connect the Hsync & Vsync outputs to the appropriate sync inputs and the video to the green3 input.

By default the genlock makes the output rate track the input, which might upset some monitors as 56Hz is not one of the standard refresh rates, but you can always disable the genlock and set the output to 60Hz.

Here is a starting profile based on the above info, but the width, height and offsets will need to be adjusted. Put this file in \Profiles\6-12_BIT_RGB on the SD card: Anritsu_spectrum_analyzer.txt

e8micke commented 2 years ago

Thanks for the fast and long answer! I think I will start to put some thought into the non-CPLD solution, for the fun of it... :-) All signals are 5V, so I will need some 74LVC parts anyway.

Is the "AtariST-Alpha3.zip" still untested, to your knowledge? It seems to fit like a glove. Which edge of the video clock (GPIO17, I think) is used by the RPi Zero, and is it programmable (rising or falling)? Is 4ms sufficient for a "keypress" (screensave) on the SW1 input, or is there a debounce filter?

IanSB commented 2 years ago

@e8micke

I think I will start to put some thoughts on the non-CPLD solution, for the fun of it... :-)

OK, in that case the test profile above should be put into the /Profiles/Simple folder instead, and you will need to select 1 Bit (R3 & G3) in the sampling menu.

Is the "AtariST-Alpha3.zip" still untested to your knowledge, it seems to fit like a glove.

I've not had any feedback on that and it's now quite old, but the current stable release supports 1 Bit (R3 & G3), so I suggest you use that as you don't need support for the other bit depths.

Video clock (GPIO17 I think), which edge is used by the RPi Zero, is it programmable (rising or falling)?

It uses both edges so you have to divide the video clock by 4. Make sure both edges of the new clock (psync) change state with the falling edge of the video clock.

I think it's divide by 4 because you have to divide by 2 to get the basic psync signal for the 21MHz clock (due to it using both edges) and divide by 2 again because you are clocking in 2 pixels at a time.
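The divide-by-4 reasoning above can be checked with a few lines of arithmetic. This is a sketch only: the 21049080 Hz value is the pixel clock quoted in this thread, and the function name is hypothetical.

```python
PIXEL_CLOCK_HZ = 21_049_080  # video clock quoted in this thread


def pixel_rate_from_psync(psync_hz, edges_per_cycle=2, pixels_per_sample=2):
    """Pixels per second delivered when every psync edge carries several pixels."""
    return psync_hz * edges_per_cycle * pixels_per_sample


# Divide by 2 because both psync edges are used, and by 2 again because
# two pixels are presented in parallel on each edge.
psync_hz = PIXEL_CLOCK_HZ / 4

print(f"psync = {psync_hz / 1e6:.3f} MHz")
print(f"pixel rate = {pixel_rate_from_psync(psync_hz) / 1e6:.3f} MHz")
```

With two edges per psync cycle and two pixels per edge, the delivered pixel rate comes back to the original video clock, so no pixels are lost by the division.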

Is 4ms sufficient for a "keypress" (screensave) on SW1 input, or is there a debounce filter?

The keys are only checked at the start of each frame, so the pulse would have to be significantly longer than 20ms for reliability.
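To see why a 4ms pulse is unreliable, here is a minimal sketch. It assumes the switch is sampled at a single instant once per frame and that the pulse arrives at a random moment within the frame; both are simplifications, and the function name is hypothetical.

```python
def catch_probability(pulse_ms, frame_ms):
    """Chance that a once-per-frame check falls inside a pulse of the given width."""
    return min(1.0, pulse_ms / frame_ms)


FRAME_MS = 1000 / 56.4  # frame period for the ~56.4Hz VSync quoted in the thread

print(f"4 ms pulse:  {catch_probability(4, FRAME_MS):.0%} chance of being seen")
print(f"20 ms pulse: {catch_probability(20, FRAME_MS):.0%} chance of being seen")
```

Under these assumptions a pulse is only certain to be seen once it is at least one frame period long, which fits the advice to stretch it well beyond 20ms.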

e8micke commented 2 years ago

It took a while to get the parts and find the time to put it together, but finally it happened :-) It sort of works, but there is some kind of weird synchronization/data error. capture11

The pixel clock is, as mentioned earlier, about 21049080Hz. But after the division by 4 the incoming clock frequency is ~5.3MHz (into PiCLK/GPIO17). Which frequency is correct to state in the profile file: is it really 21049080? I could imagine that the vertical sync could jump in even steps, but it seems that odd steps are possible too, which feels a bit weird given the 2-bits-per-clock data shuffling. My best bet is that the PLL does not get proper information for its lock.

Detected polarity state = 4, Comp (Separate H & V CPLD)
clkinfo.clock = 21049080 Hz
clkinfo.line_len = 848.000000
clkinfo.clock_ppm = 5000 ppm
Nominal 100 lines = 4028600 ns
Actual 100 lines = 4028761 ns
Clock error = 39 PPM
Error adjusted clock = 21048238 Hz
Target PLL frequency = 2020630912 Hz, prediv = 1, PER = 4
Actual PLL frequency = 2020630912 Hz
GPCLK Divisor = 4
Lines per frame = 440, (440)
Actual frame time = 17726379 ns (non-interlaced), line time = 40287 ns
Window: H=40086 to 40488, V=17637748 to 17815010
Sync=Comp, Det-Sync=Comp, Det-HS-Width=37211, HS-Thresh=9000
Width or Height differ from last FB: Setting dummy 64x64 framebuffer
Overscan L=0, R=0, T=0, B=0
Initialised Framebuffer Size: 800x480 (req 800x480). Addr: 1E000000 (DE000000)
Screen size = 800x480
Pitch=800, width=800, height=480, sizex2=0, bpp=8
chars=96, nlines=416, hoffset=6, voffset=19, ncapture=-1
palctrl=0, samplewidth=2, hadjust=16, vadjust=32, sync=0x4
detsync=0x4, vsync=0, video=0, ntsc=8, border=0, delay=2
Display startup message
*PLL
*PLL Locked

IanSB commented 2 years ago

@e8micke It's not the PLL as that isn't used in simple capture mode. (The PLL generates a clock to feed to the CPLD but that isn't connected in simple mode)

The clock error is very low, which means the profile is set up correctly. The problem might be some issue with the simple two-bit capture code, as that hasn't really been tested beyond some promising results from the Atari experiments, but the most likely cause is some issue with your capture circuit.
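The clock figures in the log quoted earlier can be reproduced from the nominal and measured times for 100 lines. This is a sketch, not the firmware's code: the formulas and the integer truncation are assumptions, chosen because they match the logged values.

```python
# Values copied from the log earlier in the thread.
PROFILE_CLOCK_HZ = 21_049_080
NOMINAL_100_LINES_NS = 4_028_600
ACTUAL_100_LINES_NS = 4_028_761


def clock_error_ppm(nominal_ns, actual_ns):
    """Relative timing error in parts per million, truncated to an integer."""
    return (actual_ns - nominal_ns) * 1_000_000 // nominal_ns


def error_adjusted_clock(clock_hz, nominal_ns, actual_ns):
    """Scale the profile clock by the ratio of nominal to measured line time."""
    return clock_hz * nominal_ns // actual_ns


ppm = clock_error_ppm(NOMINAL_100_LINES_NS, ACTUAL_100_LINES_NS)
adjusted = error_adjusted_clock(
    PROFILE_CLOCK_HZ, NOMINAL_100_LINES_NS, ACTUAL_100_LINES_NS
)
print(f"Clock error = {ppm} PPM, adjusted clock = {adjusted} Hz")
```

An error of a few tens of PPM against a 5000 ppm tolerance (clkinfo.clock_ppm in the log) is why the profile clock of 21049080 Hz can be treated as correct.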

The image is the right width, so the absolute frequency of the divided sample clock is correct. The most likely cause is some sort of phase issue with the sample clock divider, as the start state of the divider's two flip-flops would be random. You may need to reset the sample clock divider at the start of each line using the hsync pulse. Even then you would have to be careful to get the relative phases of everything correct.
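The effect of a random divider start state can be illustrated with a small model. This is a sketch under assumptions: psync is modelled as bit 1 of a hypothetical 2-bit counter, a sample is taken on every psync edge, and the function and parameter names are made up for illustration.

```python
def deserialise(pixels, start_phase=0):
    """Pair up a serial pixel stream the way a 2-bit deserialiser would.

    start_phase (0..3) is the hypothetical power-on state of the
    divide-by-4 counter.
    """
    pairs = []
    counter = start_phase % 4
    previous = None
    for bit in pixels:
        # psync (bit 1 of the counter) toggles when the counter steps
        # 1->2 or 3->0, i.e. after every odd counter value.
        if counter % 2 == 1 and previous is not None:
            pairs.append((previous, bit))
        previous = bit
        counter = (counter + 1) % 4
    return pairs


line = [1, 0, 0, 1, 1, 1, 0, 0]
aligned = deserialise(line, start_phase=0)  # pairs begin at pixel 0
shifted = deserialise(line, start_phase=1)  # pairs begin at pixel 1

print(aligned)
print(shifted)
```

In this model an odd start phase pairs each pixel with the wrong neighbour. Resetting the counter from hsync corresponds to always starting a line with start_phase=0.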

Do you have a schematic you can post or link to?

e8micke commented 2 years ago

Here is a schematic: anritsu_adapter.pdf

The physical circuit is a "rats-nest" type (not a PCB), so there are many error sources (grounding, inductance in wires, etc.). The image differs from the schematic: image

As the number of pixels per row is even, I don't think I need to resync the circuit that converts two bits in series to two bits in parallel.

The errors I have seen are not (purely) shifting errors, but I will connect a logic analyzer after all the latches and logic to check that the signal into the Raspberry Pi is as expected, and render the image once more in Python+PIL.

e8micke commented 2 years ago

It seems to be partly related to the "rats-nest": I get a very similar image using the logic analyzer.

However, I get the impression that the level of e.g. G3 is not sampled strictly at the edge; it might happen 12ns from the edge. There are some glitches which are visible via RGBtoHDMI but are ignored by the logic analyzer.
image
I need to add one more flip-flop to make sure the data is stable.

IanSB commented 2 years ago

@e8micke

I get the impression that the e.g. level of G3 is not strictly sampled at the edge. It might happen 12ns from the edge. There are some glitches which are visible via RGBtoHDMI, but are ignored by the logic analyzer

That is very likely, because it takes the GPU ~30ns to read the GPIOs. In the worst case the read happens 30ns after the clock edge, with an average of 15ns, which corresponds to your estimate.

IanSB commented 2 years ago

@e8micke Also, LVC parts are probably too susceptible to noise in such a rats-nest. You could try VHC parts instead, which are slower (similar speed to HC, but 5V tolerant).

e8micke commented 2 years ago

Very close to OK now (cropped capture using RGBtoHDMI): image

As a note, I have seen "R3 and G3 (R3 is leftmost)" as a comment, but I think G3 (GPIO9) is leftmost (it is presented as the left pixel of the two). You have mentioned that the read happens at most 30ns after the edge, but can it happen earlier than the edge (due to some kind of phase-lock software)? I currently use a shift register that only gives stable data for ~44ns.

IanSB commented 2 years ago

@e8micke

Very close to OK now: (cropped captured using RGBtoHDMI)

Yes, looking much better. You need to increase the geometry sizes to avoid cropping.

can it happen earlier than the edge? (due to some kind of phase lock software)

No, all the GPIOs are read into a single register at the same time in a loop, and if the clock bit has changed state then the data bits read at the same time as the clock bit are used.
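The polling behaviour described above can be modelled to see how late after a psync edge the data is actually taken. This is an idealised sketch, not the firmware's loop: reads are assumed to be instantaneous and perfectly periodic, the 36ns read period is the benchmark figure given later in this thread, and all names are hypothetical.

```python
def detection_delays_ps(edge_times_ps, read_period_ps, first_read_ps=0):
    """Delay from each clock edge to the first periodic read that sees it."""
    delays = []
    for edge in edge_times_ps:
        elapsed = max(edge - first_read_ps, 0)
        reads_needed = -(-elapsed // read_period_ps)  # ceiling division
        read_time = first_read_ps + reads_needed * read_period_ps
        delays.append(read_time - edge)
    return delays


HALF_PERIOD_PS = 95_016  # approx. half period of psync (21049080 Hz / 4)
READ_PERIOD_PS = 36_000  # assumed time per GPIO read

edges = [i * HALF_PERIOD_PS for i in range(1, 1001)]
delays = detection_delays_ps(edges, READ_PERIOD_PS)

worst_ns = max(delays) / 1000
mean_ns = sum(delays) / len(delays) / 1000
print(f"worst case {worst_ns:.1f} ns, mean {mean_ns:.1f} ns after the edge")
```

In this model the data is never taken before the edge, and the delay is spread between zero and one read period, so the data has to stay valid for at least that long after every psync edge. Any variation in the real read time is not modelled.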

e8micke commented 2 years ago

I think I have managed to replicate the way your software samples the signal, and it looks perfect in the logic analyzer (when converted to an image), and I can't see glitches or jitter using an oscilloscope either.

But I can't remove the (small) jitter errors. I get the impression that the Raspberry is simply not (always) quick enough to fetch GPIO before the logic level changes.

You mentioned 30ns as max time for fetching after clock edge change. I think I have ~40ns stable data after clock edge at the moment.

Is that (30ns max) independent of the input signal, such as the video clock frequency? And is it truly a max value? I am starting to think that the data needs to be stable for just as long as the clock signal (~88ns).

IanSB commented 2 years ago

Is that (30ns max) independent of the input signal, like video clock frequency? And is it truly a max value?

The benchmark code indicates it takes 36ns, which is very close to your ~40ns of stable data.

I starting to think that the data needs to be stable just as long as the clock signal (~88ns).

It probably does.

e8micke commented 2 years ago

Noted! I thought there was some kind of "hardware acceleration" (DMA) involved, but now I understand it is purely "bit-banging" on steroids. :-) I will do a redesign of the latches and logic. Thanks, Ian!