apertus-open-source-cinema / naps

An experiment for building gateware for the axiom micro / beta using amaranth-hdl
https://apertus-open-source-cinema.github.io/naps/intro.html
GNU General Public License v3.0

Invalid settings in DDR OSERDESE2? #2

Closed widlarizer closed 4 years ago

widlarizer commented 4 years ago

https://github.com/apertus-open-source-cinema/nmigen-gateware/blob/4514d62f1b802f4f73e73311c5696dfb4118c246/src/cores/primitives/xilinx_s7/io.py#L34

This is invalid and generates warnings in synthesis. However, it seems to be copied from litevideo, which I assume is functional. The Zynq-7000 technical reference manual refers to UG471, whose Table 3-8 states that, for the given OQ and TQ data rates, DATA_WIDTH and TRISTATE_WIDTH are only allowed to be 4. OSerdes10 is only used in the hdmi output core. Has that been tested to work? My Z7020 fails to be detected as a valid HDMI source with a slightly modified hdmi_test experiment and hard-wired RGB values in the modified hdmi core. It's possible that Vivado overrides the values to be 4 and everything is fine, but considering that memsetting mmapped HDMI buffers in DDR RAM hangs my Zynq, I would think this might be part of the problem... but maybe it's not, since it's a verbatim rewrite of the relevant source code in litevideo.

rroohhh commented 4 years ago

Yeah, we have seen that warning as well. However, we have been unable to find a setting that doesn't cause warnings or, worse, errors that stop synthesis.

hdmi_test.py has been tested on the axiom micro and beta, where the signal goes through a PTN3363 HDMI/DVI level shifter, and on a Zybo Z7, which has discrete CML termination. All of these worked with a normal monitor at 1080p60. However, we have had problems with certain hardware like the camlink 4k, which has trouble receiving our signal. What board are you using again?

The Zynq hanging when accessing the mmapped HDMI buffers should not be connected at all to the OSERDES stuff. This sounds like you are either accessing the wrong memory region, or maybe you didn't tell the kernel to leave that memory region alone? (Using for example something like this u-boot command: fdt rsvmem add BASE_ADDR SIZE).
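To make the BASE_ADDR and SIZE placeholders concrete, here is a small sketch that formats the u-boot command for a given region. The base address and the frame size (one 1920x1080 frame at 4 bytes per pixel) are placeholders chosen for illustration, not values anyone in this thread confirmed, and the helper name is made up.

```python
def rsvmem_command(base_addr, size):
    """Format the u-boot command that reserves [base_addr, base_addr + size)."""
    return "fdt rsvmem add 0x{:08x} 0x{:x}".format(base_addr, size)


# Placeholder numbers: one 1920x1080 frame at 4 bytes per pixel,
# placed at a hypothetical base address.
frame_bytes = 1920 * 1080 * 4
print(rsvmem_command(0x0F800000, frame_bytes))
```

The real size has to cover every buffer the gateware reads from or writes to, so it depends on the buffer count and resolution actually in use.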

widlarizer commented 4 years ago

I spent some more time looking into it, ran a much cleaner version of hdmi_test (with no success), and I think changing the tristate mode from DDR to BUF or SDR would solve the problem without changing behavior, since the tristate signals are actually unconnected.
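The rule quoted from UG471 in the first comment, and the effect of the proposed change, can be sketched as a tiny check. This only encodes the single rule stated in this thread (OQ and TQ both in DDR mode require DATA_WIDTH and TRISTATE_WIDTH of 4) and does not cover the rest of Table 3-8. The function name, the dict layout, and the example attribute values are assumptions made for illustration.

```python
def check_oserdes_ddr_rule(params):
    """Return a list of complaints for the one UG471 rule quoted in this thread.

    `params` is a plain dict of OSERDESE2 attributes (hypothetical layout).
    Only the DDR/DDR case is checked; other combinations are not judged.
    """
    problems = []
    if params.get("DATA_RATE_OQ") == "DDR" and params.get("DATA_RATE_TQ") == "DDR":
        if params.get("DATA_WIDTH") != 4:
            problems.append("DATA_WIDTH must be 4 when OQ and TQ are both DDR")
        if params.get("TRISTATE_WIDTH") != 4:
            problems.append("TRISTATE_WIDTH must be 4 when OQ and TQ are both DDR")
    return problems


# Illustrative 10:1 serializer with the tristate path left in DDR mode.
hdmi_like = {"DATA_RATE_OQ": "DDR", "DATA_RATE_TQ": "DDR",
             "DATA_WIDTH": 10, "TRISTATE_WIDTH": 1}
print(check_oserdes_ddr_rule(hdmi_like))

# Moving the unused tristate path to SDR takes the settings out of this rule.
hdmi_like_sdr_tq = dict(hdmi_like, DATA_RATE_TQ="SDR")
print(check_oserdes_ddr_rule(hdmi_like_sdr_tq))
```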

I am using a Pynq Z2. According to the schematic, I have created this mapping for the on-board HDMI TX signals: platform script. The HDMI output connector is on-board and the data lines have 150 Ohm pull-ups to 3.3V. I will re-test to check if my issues are caused by the enable on the 1k pull-down on HPD. I have run examples with pre-made overlays in the pynq default image, including a demo with HDMI input, machine detection done by the PS7, and HDMI output, so the hardware is definitely functional. Note that the VHDL sources have the OSERDESE2 tristate in SDR mode! I'll take a look at Xilinx's sources some more and see if I find inconsistencies with what I'm doing here.

You're right, the "DDR" mode threw me off, so I thought I was dealing with RAM write as well as readout, but that isn't the case.

wrong memory region

Not the case here: my buffer generator's base address is left at its default. I stepped through the code at some point to verify that the first buffer is in fact placed at the base address of the generator, 0x0f80_0000.

kernel not walled off

True, I didn't do that, I relied on it being a rather high address and hoping the kernel hasn't claimed it in the MPU or something, since I don't know how to get to u-boot console or to add u-boot commands to startup of an existing image. But wouldn't mmap fail rather than the memory access in that case? Building any of the linux images seems like a massive pain, what is your preferred method?

Let me know if I understand the hdmi_test correctly though: it should output valid video without any processor core service, right?

EDIT: I was looking at RX - the TX port has 50 Ohm pull-ups to 3V3 and HPD has only 10k to ground and another 10k to a FET's gate since it is an input

anuejn commented 4 years ago

Let me know if I understand the hdmi_test correctly though: it should output valid video without any processor core service, right?

Yup, it should do exactly that... I guess that the problem you are facing is clocking related. Can you verify (e.g. with a counter) that all clock domains have the right frequency and that the fclock is set up correctly? When you use the "fatbitstream" infrastructure, that should take care of the fclock; if you don't use it, you have to take care of that manually though.

rroohhh commented 4 years ago

But wouldn't mmap fail rather than the memory access in that case?

No, pretty sure you can do whatever you want with the memory via /dev/mem at least.

Building any of the linux images seems like a massive pain, what is your preferred method?

Hmm, we build our own sdcard images using this, but that is nontrivial to set up. Is there no u-boot console coming up on the USB from the usb-uart chip? Also, what are you booting from? If you are booting from an SD card, there might be a file called uEnv.txt in one of the partitions where you can add your own u-boot commands.

widlarizer commented 4 years ago

I definitely did bypass the fatbitstream because I thought it was just some automation, and only ever ran the .bit emitted by vivado, and when I read a constant register with that, I thought it was "working", but that test wasn't dependent on clocks from PS7 that aren't enabled already from the base overlay. That's where I probably went wrong. Will re-test and keep you updated.

Also, I got to the u-boot console over the USB-UART interface now. The problem before was that I was resetting the board with the power switch rather than the SRST button, so I couldn't launch a console before u-boot timed out, since you can't open a serial port that the OS doesn't see. Right now, I'm running the vendor's default PetaLinux image with a default overlay loaded on boot.

widlarizer commented 4 years ago

So I can't run the fatbitstream because I don't have the same devicetree as axiom-firmware. I figured out an equivalent to setting fclk in the pynq sources though... it feels weird to be doing such low level things in Python, but here we are:

from pynq.ps import Clocks
Clocks.fclk0_mhz = 20
Clocks.fclk1_mhz = 99.999999

The setters enable the clocks as well, judging by this code that runs when an Overlay (not bitstream) is downloaded:

if enable:
    Clocks.set_pl_clk(i, div0, div1)
else:
    Clocks.set_pl_clk(i)

I guess once I make this work, I'm going to turn all this into a non-axiom pynq-specific hook which does everything in Python.
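As background for the fclk values above: on the Zynq-7000 each PL clock is derived from a source PLL through two cascaded 6-bit dividers, so the output is src / (div0 * div1). The sketch below brute-forces the divider pair closest to a target frequency. The 1000 MHz source frequency is an assumption for illustration and has to be replaced by whatever the PLL on the board actually runs at; the function name is made up and this is not how pynq itself picks divisors.

```python
def best_fclk_divisors(src_mhz, target_mhz, max_div=63):
    """Search all (div0, div1) pairs for the output closest to target_mhz."""
    best = None
    for div0 in range(1, max_div + 1):
        for div1 in range(1, max_div + 1):
            actual = src_mhz / (div0 * div1)
            error = abs(actual - target_mhz)
            # Keep the first pair with the smallest error seen so far.
            if best is None or error < best[0]:
                best = (error, div0, div1, actual)
    _, div0, div1, actual = best
    return div0, div1, actual


# Assumed 1000 MHz source PLL; the targets are the two values from this thread.
for target in (20, 100):
    div0, div1, actual = best_fclk_divisors(1000.0, target)
    print(target, div0, div1, actual)
```

Because only integer divider products are reachable, it is worth reading the frequency back after setting it and comparing it with the value the gateware was built for.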

My monitor still ignores the HDMI source. What's interesting is that once I set the clocks to 20 and 100 MHz respectively, I hear a repeating, two-second-long coil whine pattern from the monitor, and this goes away when I set FCLK0 (pix_synth_fclk) to, for example, 10 MHz. It is basically the same as the coil whine I hear when I manually tell the monitor to switch back to DisplayPort. So there is something going on, but it might just be a random power supply feedback resonance I should ignore. I am still running a clean version of this repo's code with minor modifications: the unused oserdes tristate mode set to SDR to suppress warnings, and, since I was suspecting the monitor would fall asleep if the screen was all black:

m.d.comb += self.rgb.r.eq(0xFF)
m.d.comb += self.rgb.g.eq(0x44)
m.d.comb += self.rgb.b.eq(0x00)

in hdmi.py. I also am running the auto-generated devmem2 0xf8008000 w 0 etc AXI bus width fix thing.

Curiously, when I mmap myself a page of 0x2000 bytes and try to read each word, I read my 0xDEADBEEF at 0x40000000 just fine, as well as the clock pattern 0x3E0, but once the program tries to read hdmi.clocking.mmcm.locked: 0x40000010[0:1], the machine freezes. I'm going to add counters to its clock (pix_synth_fclk) and see if it increments... is there anything that should be blocking me from accessing the mmcm CSRs other than its clock not running?
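For reference, a register read of this kind can be sketched in plain Python. This is a minimal illustration and not the tooling used in this repo: the helper names are made up, it needs root when pointed at /dev/mem, and Python makes no promise about the access width that ends up on the bus, which can matter for device registers.

```python
import mmap
import os
import struct

PAGE = mmap.PAGESIZE


def read_word(path, addr):
    """Read one little-endian 32-bit word at byte offset `addr` of `path`."""
    page_base = addr & ~(PAGE - 1)  # mmap offsets must be page aligned
    offset = addr - page_base
    fd = os.open(path, os.O_RDONLY)
    try:
        mem = mmap.mmap(fd, PAGE, mmap.MAP_SHARED, mmap.PROT_READ,
                        offset=page_base)
        try:
            return struct.unpack_from("<I", mem, offset)[0]
        finally:
            mem.close()
    finally:
        os.close(fd)


def bits(value, lo, hi):
    """Extract bits lo (inclusive) to hi (exclusive), like the [0:1] notation."""
    return (value >> lo) & ((1 << (hi - lo)) - 1)


# On the target (not run here):
#   read_word("/dev/mem", 0x40000000)               # the constant register
#   bits(read_word("/dev/mem", 0x40000010), 0, 1)   # the mmcm locked bit
```

If a read like this never returns, the bus transaction itself is not completing, which points at the fabric side rather than at the Python code.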

Should I test something else?

rroohhh commented 4 years ago

It hanging when reading some of the CSRs and not others is a bit strange. Reading the locked register should work even when the MMCM is not clocked, as the CSRs have their own clock domain / clock. One thing that could be happening is that you are accessing the MMCM DRP port by accident, which could cause hangs if the MMCM DRP port is missing its clock signal. There have been some recent commits that should make AXI requests that don't complete within some number of clock cycles time out, so that could at least help with debugging by preventing the complete lockup.

Are you setting the fclk for the HDMI stuff to the correct frequency? (The one printed in the log / set by the fatbitstream)

Finally what resolution are you trying? Maybe this is a SI problem and a lower resolution would help? (you can generate the modeline using cvt)

widlarizer commented 4 years ago

I wasn't accessing DRP, that was in the next address before the one I was accessing. I pulled the new commits, they broke fatbitstream generation, by the way (commit b9cae2e3678aa96ff9efd392b2036d2ebf747cba).

Yup, just checked. The values are as generated in fatbitstream, 20 and 100MHz.

Yes, I am using generate_modeline for 640x480@60, which I checked on my desktop to be in the EDID list of supported resolutions by my monitor.

I added a counter to my pix_domain in Hdmi:

def __init__(self):  # other args omitted
    in_pix_domain = DomainRenamer("pix")
    self.pix_counter = in_pix_domain(Counter(32))

and in Top:

self.pix_counter = pix_counter = ControlSignal(32)
m.d.comb += pix_counter.eq(m.submodules.hdmi.pix_counter.v)

Either it's not clocked and not counting, or it is an issue caused by the hierarchy flattening and namespace conflicts... Not sure how to bring "out" a structure like that, and if initializing a DomainRenamer is even fine in __init__.

rroohhh commented 4 years ago

If you have a pixel clock below 25 MHz you are supposed to do pixel repetition (which we do not implement :)) to bring it over 25 MHz again, so I would choose something slightly bigger than 640x480@60, at least to rule that out.
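The pixel clock of a mode follows directly from its modeline totals, so the 25 MHz check can be done before building a bitstream. The sketch below is only arithmetic. The two sets of totals are illustrative stand-ins (800 x 525 is the classic 640x480 frame including blanking, 800 x 500 stands in for a tighter generated timing) and are not taken from this thread; the names are made up.

```python
MIN_PIXEL_CLOCK_HZ = 25_000_000  # lower bound mentioned above


def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock = total pixels per frame (including blanking) * frame rate."""
    return h_total * v_total * refresh_hz


def needs_pixel_repetition(clock_hz):
    """True if the mode is too slow to be sent without pixel repetition."""
    return clock_hz < MIN_PIXEL_CLOCK_HZ


# Illustrative totals only; substitute the numbers from your own modeline.
for h_total, v_total in ((800, 525), (800, 500)):
    clk = pixel_clock_hz(h_total, v_total, 60)
    print(h_total, v_total, clk / 1e6, needs_pixel_repetition(clk))
```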

Now the counter not working sounds bad. DomainRenamer works fine, even in __init__, so assuming the Counter module is implemented correctly this seems like the clocking is still broken.

Instead of the pix domain you could try the pix_synth_fclk domain, which is what is fed into the PLL that generates the pix and pix_5x domain.

Oh also, what pixclock error are you getting?

anuejn commented 4 years ago

I pulled the new commits, they broke fatbitstream generation, by the way (commit b9cae2e).

Ah sorry for that, I did some chaotic development without thinking about other people using that code. I will fix that asap :)

anuejn commented 4 years ago

Should work again now

widlarizer commented 4 years ago

Sorry for the delay, other school things caught up to me. Anyway, I changed the modeline to the auto generated 720p. At this point, the monitor gave me a warning about an unsupported resolution. I grabbed the monitor's preferred modeline from its EDID by connecting it to my laptop, input that, and got to see the searing orange color on the screen! Which is desired behavior, since I had modified hdmi.py trivially like this:

m.d.comb += self.rgb.r.eq(0xFF)
m.d.comb += self.rgb.g.eq(0x44)
m.d.comb += self.rgb.b.eq(0x00)

I'll retest with OSERDES configured as I suggested in the beginning, and probably will write a hook that replaces the operations on the missing device tree nodes on the Pynq Linux image. After this, I will go back to exposing a buffer to the PS7.

Curiously, the pix_counter is still stuck on 0xffffffff.

widlarizer commented 4 years ago

Turns out I've been using the fixed IO source the entire time. Here ya go, a two character PR: #3

rroohhh commented 4 years ago

I grabbed the monitor's preferred modeline from its EDID by connecting it to my laptop, input that, and got to see the searing orange color on the screen!

Very nice!

I'll retest with OSERDES configured as I suggested in the beginning, and probably will write a hook that replaces the operations on the missing device tree nodes on the Pynq Linux image.

Sounds good, you are welcome to upstream any Pynq support you do.

Curiously, the pix_counter is still stuck on 0xffffffff.

I don't know where the Counter module you were showing in a previous snippet is coming from. Is it possible that the counter doesn't overflow to zero and thus simply gets stuck after ~a minute?
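The "~a minute" estimate is easy to check with a software model. This is a sketch of the two possible counter behaviours, not the actual Counter module, whose source is not shown in this thread. The 74.25 MHz clock is an assumed example value for a 720p-class mode.

```python
def seconds_until_full(width_bits, clock_hz):
    """Time for a counter starting at 0 to reach its all-ones value."""
    return (2 ** width_bits - 1) / clock_hz


def step(value, width_bits, saturate):
    """Advance a counter by one tick, either saturating or wrapping at the top."""
    top = 2 ** width_bits - 1
    if value == top:
        return top if saturate else 0
    return value + 1


# Assumed pixel clock of 74.25 MHz, 32-bit counter.
print(seconds_until_full(32, 74.25e6))

# A saturating counter stays at 0xffffffff, a wrapping one returns to 0.
print(hex(step(0xFFFFFFFF, 32, saturate=True)))
print(hex(step(0xFFFFFFFF, 32, saturate=False)))
```

A counter that reads 0xffffffff on every access is therefore consistent with a running clock and a saturating implementation, so reading it twice within the first minute after configuration would tell the two cases apart.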

widlarizer commented 4 years ago

The warnings have been mitigated (and further discussion of my Pynq efforts is migrating to irc)