MikeS11 / MiSTerFPGA_YC_Encoder

All work related to the YC / NTSC & PAL Encoder for MiSTerFPGA
MIT License

Reduce module footprint and optimize resolution #4

Open jotego opened 5 months ago

jotego commented 5 months ago

Thank you for designing this module. There is some room for improvement:

Only encode one quarter of the sine wave

A sine wave has quarter-wave symmetry: every period is made of four mirrored or negated copies of its first quarter. Yamaha's FM chips only encode the first quarter of the wave, with good resolution. You can then either flip the index to the table or change the output sign to recover the other three portions. For the same memory footprint, you can get 4x the resolution.

Higher resolution should reduce the color noise caused by quantization noise in the subcarrier.
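As a rough illustration of the quarter-wave idea, here is a minimal Python model rather than the module's actual Verilog. The table size, the amplitude, and the names `QUARTER_LUT`, `sine` and `cosine` are all assumptions made for the sketch. The half-sample offset in the table is one common way to keep the mirrored quadrants exactly symmetric; it is a choice of this sketch, not something taken from the encoder.

```python
import math

QUARTER_BITS = 6                  # hypothetical: 64 stored entries
QUARTER_LEN = 1 << QUARTER_BITS
AMPLITUDE = 127                   # hypothetical: 8-bit signed output

# Only the first quarter of the period is stored.
# The +0.5 offset makes reading the table backwards an exact mirror.
QUARTER_LUT = [
    round(AMPLITUDE * math.sin(2 * math.pi * (i + 0.5) / (4 * QUARTER_LEN)))
    for i in range(QUARTER_LEN)
]

def sine(phase):
    """phase is a (QUARTER_BITS + 2)-bit integer covering one full period."""
    phase &= (4 * QUARTER_LEN) - 1
    index = phase & (QUARTER_LEN - 1)
    mirror = (phase >> QUARTER_BITS) & 1        # quadrants 1 and 3 read backwards
    negate = (phase >> (QUARTER_BITS + 1)) & 1  # quadrants 2 and 3 flip the sign
    if mirror:
        index = (QUARTER_LEN - 1) - index       # same as inverting the index bits
    value = QUARTER_LUT[index]
    return -value if negate else value

def cosine(phase):
    """Cosine is the same table read a quarter period ahead."""
    return sine(phase + QUARTER_LEN)
```

With these assumed sizes, 64 stored words give the same phase resolution as a 256-entry full-period table, which is the 4x gain described above.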

Read and store the sine table independently

Instead of accessing the sine LUT three times in the code with different indexes, consider sampling it continuously in an independent always block. That block would sequentially sample the three signals you need (the burst, the sine and the cosine) and store the values in registers. Then you access those registers in your multiplication logic.

This will ensure that a single LUT is needed, rather than three copies of it. It also reduces timing constraints because you do not need to access the LUT and wait for the data, as it is already latched.

Having said that, the Quartus synthesizer seems to do a good job of using a dual port BRAM block for this, so it is not generating copies of the LUT. Other synthesizers may generate them, though.

The timing benefit can only be obtained by moving the LUT access out of the multiplication logic, though. There is no gain in having it inside, so I would move it out.
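The sequential sampling described above can be sketched as a small cycle-based Python model. This is only an assumption-laden illustration, not the module's code: `SubcarrierSampler`, `tick`, the table length and the burst offset are hypothetical names and values. The sketch also assumes the phase is latched once per three-clock round so that burst, sine and cosine all refer to the same phase, which means each register updates every third clock.

```python
import math

LUT_LEN = 256  # hypothetical table length
SINE_LUT = [round(127 * math.sin(2 * math.pi * i / LUT_LEN)) for i in range(LUT_LEN)]

class SubcarrierSampler:
    """Round-robin reader: one LUT read per clock, results held in registers."""

    def __init__(self, burst_offset):
        self.burst_offset = burst_offset  # hypothetical burst phase offset, in table steps
        self.slot = 0                     # which of the three signals is read next
        self.phase = 0                    # phase latched at the start of each round
        self.burst = 0
        self.sin = 0
        self.cos = 0

    def tick(self, phase):
        """Model one clock edge: exactly one LUT access per call."""
        if self.slot == 0:
            self.phase = phase  # latch so all three reads share one phase
            self.burst = SINE_LUT[(self.phase + self.burst_offset) % LUT_LEN]
        elif self.slot == 1:
            self.sin = SINE_LUT[self.phase % LUT_LEN]
        else:
            self.cos = SINE_LUT[(self.phase + LUT_LEN // 4) % LUT_LEN]
        self.slot = (self.slot + 1) % 3
```

The multiplication logic would then read only `burst`, `sin` and `cos`, never the table itself, which is where the timing relief comes from.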

Add an input video filter

Before modulating up the video signal, it is worth attenuating the aliasing that you are generating by resampling it at the video clock. When you use, say, 100MHz to sample the video signals, you are implicitly admitting that a transition of a pixel from white to black occurs in 10ns. That is a huge bandwidth that will generate aliasing. Add a 1.5MHz low pass filter before multiplying.
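To give a feel for what such a filter does to a 10ns edge, here is a minimal floating-point sketch of a first-order low pass. It is an assumption on my side that a single-pole IIR is good enough to illustrate the point; a real design might well use a sharper FIR and fixed-point arithmetic. The function name `one_pole_lowpass` is hypothetical.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """First-order IIR low pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    # Coefficient derived from the RC time constant for the requested cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    state = samples[0] if samples else 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

# A black-to-white pixel transition sampled at 100 MHz.
step = [0.0] * 10 + [1.0] * 200
smoothed = one_pole_lowpass(step, cutoff_hz=1.5e6, sample_rate_hz=100e6)
```

In this model the edge is spread over many clock cycles instead of jumping within a single 10ns sample, which is the bandwidth reduction asked for above.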

Add an output filter

Before letting this go through the FPGA pins, add another 5MHz low pass filter. This is the Nyquist filter the external ADC needs.

Explore other carrier signals

You may also consider throwing away the sine LUT altogether and just modulating with a square subcarrier. Once you have the filters added, this may work surprisingly well.

MikeS11 commented 5 months ago

Hey! I can have a look at this, but the main reason for leaving filtering out was to keep the module as slim as possible, so it does not add too many additional resource constraints on the framework. That said, having a 1/4 sine wave and just flipping the values would, I guess, work to keep the LUT size down. As for resolution, though, that would be highly limited by the input frequency, which is set by the core itself.

A 28MHz core will have a much choppier video signal than a 96MHz core.

But this sounds like a fun project I can look into :)

jotego commented 5 months ago

You're right about the resolution indeed: the video clock limits how much you can sample the sine table.
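A quick back-of-the-envelope check of that limit, assuming the NTSC subcarrier of 3.579545 MHz and the two clock rates mentioned above (the helper name `samples_per_cycle` is just for this sketch):

```python
NTSC_SUBCARRIER_HZ = 3_579_545

def samples_per_cycle(video_clock_hz, subcarrier_hz=NTSC_SUBCARRIER_HZ):
    """How many video-clock samples land in one subcarrier period."""
    return video_clock_hz / subcarrier_hz

slow = samples_per_cycle(28_000_000)  # roughly 7.8 samples per cycle
fast = samples_per_cycle(96_000_000)  # roughly 26.8 samples per cycle
```

So a 28MHz core only ever touches about eight points of the table per subcarrier cycle, however finely the table itself is resolved.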

I wonder if adding the filters could remove the need for so much tweaking with the frequencies for each core. The filters are even more important as the subcarrier frequency will never be that far apart from the video clock.

Using a square wave instead of the sine LUT is promising. Note that the output filter will automatically transform the square wave into a sine wave as it will filter out all harmonics except the first one. That means that you still get a sine waveform for the color burst without using a sine LUT.
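The harmonic argument can be checked numerically. The sketch below measures the harmonic content of an ideal ±1 square wave by correlating it against sines and cosines; `harmonic_amplitude` is a hypothetical helper, the 5MHz figure is the output filter cutoff suggested earlier in this thread, and the NTSC subcarrier is taken as 3.579545 MHz.

```python
import math

def harmonic_amplitude(signal, harmonic):
    """Amplitude of the given harmonic over exactly one period of signal."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * harmonic * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * harmonic * i / n) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

N = 1024
# One period of a 50% duty cycle square wave.
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]

# Even harmonics vanish; odd harmonics fall off as 4 / (pi * k).
amplitudes = {k: harmonic_amplitude(square, k) for k in (1, 2, 3, 5)}

NTSC_SUBCARRIER_HZ = 3_579_545
third_harmonic_hz = 3 * NTSC_SUBCARRIER_HZ  # first harmonic the filter has to remove
```

The lowest unwanted harmonic is the third, at about 10.7 MHz for NTSC, which sits well above a 5MHz cutoff. How cleanly it is removed depends on the filter order, which this sketch does not model.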