libretro / common-shaders

Collection of commonly used Cg shaders. These shaders are usable by HLSL and/or Cg runtime compilers. The cg2glsl script will translate most of these into GLSL shaders.
http://www.libretro.com

================================ -- Common shaders --

This is a package of pixel shaders intended for old school emulators. Copyrights are held by the respective authors. The shaders are coded in the Cg language, suitable for both OpenGL and D3D. Most of these shaders are converted from other languages (GLSL, FX, HLSL, etc).

The shaders all follow a convention which must be followed by the implementer of an OGL/D3D backend.

Known implementations of this spec are currently:

Entry points:
   Vertex: main_vertex
   Fragment: main_fragment

Texture unit: All shaders work on texture unit 0 (the default). 2D textures must be used. Power-of-two sized textures are recommended for optimal visual quality. The shaders must deal with the actual picture data not filling out the entire texture. Incoming texture coordinates and uniforms provide this information.

The texture coordinate origin is defined to be top-left oriented, i.e. a texture coordinate of (0, 0) will always refer to the top-left pixel of the visible frame. This is the opposite of what most graphics APIs expect. The implementation must always ensure that this ordering holds for any texture that the shader has access to.

Every texture bound for a shader must have black border mode set. I.e. sampling a texel outside the given texture coordinates must always return a pixel with RGBA values (0, 0, 0, 0).
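The two conventions above (top-left origin, black border) can be modelled on the CPU as a rough sketch. This is not backend code from any known implementation: the function name `sample_black_border`, the list-of-rows image layout, and the choice to treat the range as half-open [0, 1) are all assumptions made for illustration.

```python
def sample_black_border(image, u, v):
    """Nearest-neighbour lookup into `image`, a list of rows with row 0 at the top.

    (u, v) are normalized coordinates with (0, 0) at the top-left corner.
    Coordinates outside [0, 1) return transparent black (0, 0, 0, 0),
    mimicking the black border mode the convention asks for.
    """
    height = len(image)
    width = len(image[0])
    if not (0.0 <= u < 1.0 and 0.0 <= v < 1.0):
        return (0, 0, 0, 0)
    x = int(u * width)   # column index, counted from the left
    y = int(v * height)  # row index, counted from the top
    return image[y][x]


# Hypothetical 2x2 RGBA image: top row red/green, bottom row blue/white.
image = [
    [(255, 0, 0, 255), (0, 255, 0, 255)],
    [(0, 0, 255, 255), (255, 255, 255, 255)],
]
top_left = sample_black_border(image, 0.0, 0.0)
outside = sample_black_border(image, 1.5, 0.25)
```

Here `top_left` is the red pixel and `outside` is (0, 0, 0, 0). A real backend would get the same behaviour from the graphics API's border/clamp settings rather than from code like this.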

Uniforms: Some parameters need to be passed to all shaders, in both the vertex and the fragment program. A generic entry point for a fragment shader looks like:

float4 main_fragment (float2 tex : TEXCOORD0, uniform input IN, uniform sampler2D s_p : TEXUNIT0) : COLOR {}

The input is a struct looking like:

struct input
{
   float2 video_size;
   float2 texture_size;
   float2 output_size;
   float frame_count;
   float frame_direction;
};

TEXCOORD0: Texture coordinates for the current input frame will be passed in TEXCOORD0. (TEXCOORD is a valid alias for TEXCOORD0).

COLOR0: Although legal, no data of interest is passed here. You cannot assume anything about data in this stream.

IN.video_size: The size of the actual video data in the texture, e.g. for the SNES this will generally be (256, 224) for normal resolution frames.

IN.texture_size: This is the size of the texture itself. Optimally power-of-two sized.

IN.output_size: The size of the video output. This is the size of the viewport shown on screen.

IN.frame_count: A counter of the frame number. This increases by 1 every frame. The value is really an integer, but it must be passed as a float because Cg lacks integer uniforms.

IN.frame_direction: A number telling which direction the frames are flowing. For regular playing, this value should be 1.0. While the game is rewinding, this value should be -1.0.

modelViewProj: This uniform needs to be set in the vertex shader. It holds the current MVP transform.
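Because the picture data may not fill the whole texture, the ratio of IN.video_size to IN.texture_size tells a shader how far the valid texture coordinates extend. The sketch below only illustrates that arithmetic on the host side; the helper names and the 256x256 texture size are assumptions chosen for the example, not values mandated by the spec.

```python
def visible_extent(video_size, texture_size):
    """Largest texture coordinate (u, v) that still lies inside the picture data."""
    video_w, video_h = video_size
    tex_w, tex_h = texture_size
    return (video_w / tex_w, video_h / tex_h)


def texel_step(texture_size):
    """Distance in texture coordinates between two neighbouring texels."""
    tex_w, tex_h = texture_size
    return (1.0 / tex_w, 1.0 / tex_h)


# SNES normal-resolution frame stored in an assumed 256x256 power-of-two texture.
extent = visible_extent((256, 224), (256, 256))
step = texel_step((256, 256))
```

With these inputs `extent` is (1.0, 0.875): the bottom eighth of the texture holds no picture data, so a shader that samples neighbouring texels needs to account for it.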

Pre-filtering: Most of these shaders are intended to be used with a non-filtered input. Nearest-neighbor filtering on the textures themselves is preferred. Some shaders, such as scanline shaders, will most likely prefer bilinear texture filtering.

Genres: There are several different types of shaders available. Some of the relevant types are sorted into folders.

2x-classic: These are the typical, classic filters usually run on CPU, such as HQ2x, 2xSaI, SuperEagle, etc, converted into shaders.

Blur: Shaders focusing on blurring the output image.

Enhance: Shaders focusing on enhancing the image quality through means other than blurring alone.

TV: Shaders focusing on replicating the visual image of a game running on a CRT screen.

Misc: Shaders that do not directly fit into any of the above categories.

Meta: Contains meta-shaders (*.cgp).

============================= -- Cg meta-shader format --

Rationale: The .cg files themselves contain no metadata necessary to perform advanced filtering. They also cannot process an effect in multiple passes, which is necessary for some effects. The CgFX format does exist, but it would require the current shaders to be rewritten in an HLSL-esque format. It also suffers from a problem mentioned below.

Rather than putting everything into one file (XML shader format), this format is config file based. This greatly helps testing shader combinations, as there is no need to rearrange code in one big file. Another plus of this approach is that a large library of .cg files can be used to combine many shaders without needing to redundantly copy code over. It also helps testing, as every pass can be unit-tested separately and seamlessly.

Format:

The meta-shader format is based around the idea of a config file with the format: key = value. Values with spaces need to be wrapped in quotes: key = "value stuff". No .ini sections or similar are allowed. Meta-shaders may include comments, prefixed by the "#" character, both on their own in an otherwise empty line or at the end of a key = value pair.
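A minimal parser for the syntax described above might look like the following sketch. It covers only what the paragraph states: key = value pairs, quoted values, "#" comments, and no .ini sections. Treating a "#" inside quotes as part of the value is an assumption, since the text does not say; the function names and the sample keys are purely illustrative.

```python
def strip_comment(line):
    """Drop a trailing '#' comment, ignoring '#' inside double quotes (assumed)."""
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes
        elif char == "#" and not in_quotes:
            return line[:index]
    return line


def parse_meta_shader(text):
    """Parse 'key = value' lines into a dict of strings."""
    entries = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = strip_comment(raw).strip()
        if not line:
            continue  # blank line or comment-only line
        if "=" not in line:
            # Also rejects .ini style "[section]" headers.
            raise ValueError(f"line {line_number}: expected 'key = value'")
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]  # unwrap a quoted value
        entries[key] = value
    return entries


# Placeholder keys taken from the prose above, not real meta-shader options.
sample = '''
# a comment on its own line
key = value
other = "value stuff"  # trailing comment
'''
config = parse_meta_shader(sample)
```

The result is `{"key": "value", "other": "value stuff"}`. All values stay strings here; converting them to numbers or booleans is left to whatever code interprets the individual options.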

The meta-format has four purposes:

Parameters:

Multi-pass uniforms:

During multi-pass rendering, some additional uniforms are available.

With multi-pass rendering, it is possible to utilize the resulting output for every pass that came before it, including the unfiltered input. This allows for an additive approach to shading rather than serial style.

The unfiltered input can be found in the ORIG struct:

uniform sampler2D ORIG.texture: Texture handle. Must not be set to a predefined texture unit.
uniform float2 ORIG.video_size: The video size of original frame.
uniform float2 ORIG.texture_size: The texture size of original frame.
in float2 ORIG.tex_coord: An attribute input holding the texture coordinates of original frame.

PASS%u: This struct holds the same data as the ORIG struct, but for the result of passes {1, 2, 3 ...}, i.e. PASS1.texture holds the result of the first shader pass. When rendering pass N, passes {1, ..., N-2} are available (the result of pass N-1 is the input in the regular IN structure).

PREV: This struct holds the same data as the ORIG struct, and corresponds to the raw input image from the previous frame. Useful for motion blur.

PREV1..6: Similar struct to PREV, but holds the data for frames further back in time. PREV1 is the frame before PREV, PREV2 the frame before that again, and so on. Together with the current frame, this allows up to 8-tap motion blur.
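On the backend side, the PREV structs amount to a short history of raw input frames, and the PASS%u rule is a simple range. The following is a sketch under stated assumptions: the class `FrameHistory`, the function `earlier_passes`, and the use of strings as stand-ins for textures are all hypothetical, and the spec excerpt does not say what to bind before enough frames have been seen.

```python
from collections import deque


class FrameHistory:
    """Keeps the last 7 raw input frames: PREV plus PREV1..PREV6."""

    def __init__(self, depth=7):
        self._frames = deque(maxlen=depth)

    def push(self, frame):
        """Record a frame after it was shown, so it becomes PREV next time."""
        self._frames.appendleft(frame)  # the oldest frame falls off the right end

    def bindings(self):
        """Map struct names to stored frames, newest first."""
        names = ["PREV"] + [f"PREV{i}" for i in range(1, 7)]
        return dict(zip(names, self._frames))


def earlier_passes(n):
    """PASS%u structs visible while rendering pass n: passes 1 .. n-2."""
    return [f"PASS{i}" for i in range(1, n - 1)]


history = FrameHistory()
for frame_number in range(10):
    history.push(f"frame{frame_number}")
bound = history.bindings()
```

After ten frames, `bound["PREV"]` is the most recent frame ("frame9") and `bound["PREV6"]` is "frame3"; `earlier_passes(4)` gives PASS1 and PASS2, since pass 3 arrives through the regular IN structure.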

For backend implementers:

-- Rendering the shader chain: --

With all these options, the rendering pipeline can become somewhat complex. The meta-shader format relies heavily on offscreen rendering to achieve its effects. In OpenGL this is usually referred to as frame-buffer objects, and in Direct3D as render targets. This feature will be referred to as FBO from here on. An FBO texture is assumed to be a texture bound to the FBO.

As long as the visual result is approximately identical, the implementation does not have to employ FBO.

With multiple passes our chain looks like this conceptually:

|Source image| ---> |Shader 0| ---> |FBO 0| ---> |Shader 1| ---> |FBO 1| ---> |Shader 2| ---> (Back buffer)

In the case that Shader 2 has set some scaling params, we need to first render to an FBO before stretching it to the back buffer.

|Source image| ---> ... |Shader 2| ---> |FBO 2| ---> (Back buffer)

Scaling parameters determine the sizes of the FBOs. For visual fidelity it is recommended that power-of-two sized textures are bound to them. This is due to floating point inaccuracies that become far more apparent when not using power-of-two textures. If the absolute maximum size of the source image is known, then it is possible to preallocate the FBOs.

Do note that the size of FBOn is determined by dimensions of FBOn-1 when "source" scale is used, not the source image size! Of course, FBO0 would use source image size, as there is no FBO-1 ;)

For example, the SNES has a maximum width of 512 and a maximum height of 478. If a source-relative scale of 3.0x is desired for the first pass, the output is at most 1536x1434, so it is safe to allocate an FBO with a size of 2048x2048. However, most frames will use only a small fraction of this texture.
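The preallocation arithmetic in this example can be written out as a small sketch. The helper names are hypothetical, and rounding the scaled size up before taking the next power of two is an assumption; the text only gives the final 2048x2048 figure.

```python
import math


def next_pow2(n):
    """Smallest power of two that is greater than or equal to n."""
    power = 1
    while power < n:
        power *= 2
    return power


def fbo_alloc_size(max_source_size, scale):
    """Power-of-two FBO size large enough for the biggest scaled source frame."""
    width = math.ceil(max_source_size[0] * scale)
    height = math.ceil(max_source_size[1] * scale)
    return (next_pow2(width), next_pow2(height))


# SNES worst case from the text: 512x478 source, 3.0x source-relative scale.
snes_fbo = fbo_alloc_size((512, 478), 3.0)
```

`snes_fbo` comes out as (2048, 2048), matching the figure above: 1536 and 1434 both round up to 2048.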

With "viewport" scale it might be necessary to reallocate the FBO at run time if the user resizes the window.
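How the two scale types interact along a chain can be sketched as follows. Only the "source" and "viewport" types mentioned in this section are modelled. The function name, the (scale_type, scale) tuple representation, and the rounding are assumptions for illustration, not the meta-shader file syntax.

```python
def fbo_sizes(source_size, viewport_size, passes):
    """Return the FBO size for each pass in order.

    `passes` is a list of (scale_type, scale) tuples. With "source" scale the
    base is the previous FBO (or the source image for the first pass); with
    "viewport" scale the base is the current viewport.
    """
    sizes = []
    previous = source_size
    for scale_type, scale in passes:
        if scale_type == "source":
            base = previous
        elif scale_type == "viewport":
            base = viewport_size
        else:
            raise ValueError(f"unsupported scale type: {scale_type}")
        size = (round(base[0] * scale), round(base[1] * scale))
        sizes.append(size)
        previous = size
    return sizes


chain = [("source", 2.0), ("source", 2.0), ("viewport", 1.0)]
before_resize = fbo_sizes((256, 224), (1280, 960), chain)
after_resize = fbo_sizes((256, 224), (1920, 1080), chain)
```

In this example the first two FBOs stay at 512x448 and 1024x896 regardless of the window, while the third follows the viewport, so only that one would need reallocating after a resize.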