Green blank output when using core.placebo.Shader

dexeonify commented 1 year ago

Whenever I use any shader with core.placebo.Shader(), I get a blank green output, both in VapourSynth-Editor and when piping to mpv. When piping to mpv, this error is printed repeatedly in the terminal:

Validation failed: !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE (../../../../../src_packages/libplacebo/src/gpu.c:234)
  for texture: ../../../../../src_packages/vs-placebo/src/shader.c:122
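
For anyone reading the message above, it can help to split it into its three parts: the condition that failed, the libplacebo location that checked it, and the vs-placebo location that asked for the texture. The snippet below is only a rough sketch; the pattern and the function name `parse_validation_error` are made up for illustration and cover only the message shape shown in this report.

```python
import re

# The two lines printed by mpv, copied from the report above.
MESSAGE = """\
Validation failed: !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE (../../../../../src_packages/libplacebo/src/gpu.c:234)
  for texture: ../../../../../src_packages/vs-placebo/src/shader.c:122
"""

# Hypothetical pattern: it only handles the message shape shown in this report.
PATTERN = re.compile(
    r"Validation failed: (?P<condition>.+?) \((?P<check_site>[^()]+:\d+)\)\s+"
    r"for texture: (?P<caller>\S+:\d+)"
)


def parse_validation_error(text):
    """Return the failed condition and both source locations, or None if absent."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


info = parse_validation_error(MESSAGE)
print("condition :", info["condition"])
print("checked at:", info["check_site"])
print("requested :", info["caller"])
```

Read this way, the message says that a texture requested at shader.c:122 asked to be renderable, while the chosen format's capabilities did not include PL_FMT_CAP_RENDERABLE.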

I've done a few checks myself:

- I'm using the latest vs-placebo release (1.4.4).
- I'm using the latest VapourSynth version (R62).
- ffmpeg's libplacebo filter works, so hardware incompatibility shouldn't be the issue.
- Switching the source filter between lsmas and ffms2 doesn't change the output.
- core.placebo.Deband() works, so vs-placebo is definitely loaded correctly.
- I tested several shaders, including ones that don't require scaling, such as adaptive-sharpen.
- I converted the input video to YUV444Px, as requested, before passing it to the shader function.

I'm out of ideas.

Environment

- CPU: Intel Core i7-9700
- RAM: 8GB
- GPU: Intel UHD Graphics 630, no external GPUs
- OS: Windows 11 Build 22621.1778

Minimal Working Example

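import vapoursynth as vs
from vapoursynth import core

# Load the source and convert it to 16-bit 4:4:4 before the shader.
video = core.lsmas.LWLibavSource(source="video.mkv")
video = core.resize.Point(clip=video, format=vs.YUV444P16)
# This call produces the blank green output.
shader = core.placebo.Shader(clip=video, shader="FSR.glsl", width=1920, height=1080)
# Swapping in Deband instead of Shader works fine.
#shader = core.placebo.Deband(clip=video, iterations=2, threshold=35)
# Convert back to 8-bit 4:2:0 for output.
video = core.resize.Point(clip=shader, format=vs.YUV420P8)
video.set_output()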

quietvoid commented 1 year ago

Sounds like a possible difference in libplacebo version, where the older one in vs-placebo causes issues. I assume you used a recent FFmpeg build?

It could help if you provide the debug log by adding log_level=5 to the placebo.Shader call.

quietvoid commented 1 year ago

In the meantime, I started a build to test with the latest libplacebo here: https://github.com/quietvoid/mpv-winbuild-cmake/actions/runs/5163637750

When it's ready, it'll output the updated vs-placebo DLL.

dexeonify commented 1 year ago

> I assume you used a recent FFmpeg build?

Yes, I usually use ffmpeg git builds. However, libplacebo also works for me on the release version of ffmpeg, v6.0 (dated 2023-03-04, according to Gyan.dev).

Contrary to the guides I found online, I do not need to specify hwupload and hwdownload for libplacebo to work. In fact, adding these two filters caused libplacebo to fail for me.

> It could help if you provide the debug log by adding log_level=5 to the placebo.Shader call.

There don't seem to be any additional logs? For reference, I'm getting the logs using this command line:

vspipe -c y4m placebo.vpy - | mpv -

quietvoid commented 1 year ago

> There don't seem to be any additional logs? For reference, I'm getting the logs using this command line:

vspipe hides stdout logs. You can try adding video.get_frame(0) to your script and just running it with Python.
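
As a rough sketch of that suggestion, the script below reuses the filter calls from the example earlier in this thread, adds log_level=5, and requests a single frame so that the filter runs under plain Python. The file name `debug_placebo.py` and the helper names `shader_kwargs` and `render_first_frame` are made up for illustration; the VapourSynth import sits inside the function only so the file can be loaded on a machine without VapourSynth.

```python
import sys


def shader_kwargs(debug=True):
    """Arguments for core.placebo.Shader, copied from the example script in this thread."""
    kwargs = {"shader": "FSR.glsl", "width": 1920, "height": 1080}
    if debug:
        # Debug verbosity, as suggested earlier in the thread.
        kwargs["log_level"] = 5
    return kwargs


def render_first_frame(source):
    # Imported lazily so this file can be loaded without VapourSynth installed.
    import vapoursynth as vs
    from vapoursynth import core

    video = core.lsmas.LWLibavSource(source=source)
    video = core.resize.Point(clip=video, format=vs.YUV444P16)
    video = core.placebo.Shader(clip=video, **shader_kwargs())
    # Requesting a frame forces the filter to run, so its log reaches the console.
    return video.get_frame(0)


# Only runs when a source path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    render_first_frame(sys.argv[1])
```

It would be run as `python debug_placebo.py video.mkv`, with no vspipe in between, so anything the plugin prints should show up directly in the terminal.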

I found this, which looks similar to what you're seeing: https://github.com/haasn/libplacebo/issues/172#issuecomment-1567589476

So my guess is that the updated libplacebo might fix your issue. Or maybe it is an Intel driver problem.

quietvoid commented 1 year ago

Can you try this: https://github.com/quietvoid/mpv-winbuild-cmake/suites/13347837213/artifacts/729911053

It's vs-placebo built on the latest Vulkan/libplacebo: https://github.com/quietvoid/mpv-winbuild-cmake/actions/runs/5163637750

dexeonify commented 1 year ago

Unfortunately, no. The silver lining is the backtrace log is more verbose.
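
One thing that may be worth checking in the log below: it ends with a 1920x1080 texture being (re)created with format rgb16 right before the validation fails, and the "GPU texture formats" table lists a CAPS string for every format. The sketch below compares two rows copied from that table. It assumes the fourth flag position (`R`) stands for PL_FMT_CAP_RENDERABLE; that is inferred from the name in the error message and is not confirmed anywhere in this thread. The helper names are made up for illustration.

```python
# Two rows copied verbatim from the "GPU texture formats" table in the log below.
ROWS = [
    "rgba16 UNORM 8 RGBA SsLRbBVutHWG n {16 16 16 16} {16 16 16 16} vec4 rgba16",
    "rgb16 UNORM 6 RGB S-L---Vu-H-G n {16 16 16 0 } {16 16 16 0 } vec3",
]

# ASSUMPTION: the fourth flag position ('R') means "renderable".
RENDERABLE_INDEX = 3


def parse_caps(row):
    """Return (format name, caps string); columns are NAME TYPE SIZE COMP CAPS ..."""
    fields = row.split()
    return fields[0], fields[4]


def is_renderable(caps):
    """True when the assumed renderable flag is set in the caps string."""
    return caps[RENDERABLE_INDEX] == "R"


for row in ROWS:
    name, caps = parse_caps(row)
    print(f"{name}: caps={caps} renderable={is_renderable(caps)}")
```

If the assumption holds, rgb16 is reported without the renderable flag on this GPU while rgba16 has it, which would be consistent with the validation error, though it does not by itself explain why rgb16 was picked.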

Log

``` Initialized libplacebo v5.264.0-277-ge684619 (API v278) No VkInstance provided, creating one... Available instance version: 1.3.252 Available layers: VK_LAYER_OBS_HOOK (v1.3.216) Available instance extensions: VK_KHR_surface VK_KHR_win32_surface VK_KHR_external_memory_capabilities VK_KHR_external_semaphore_capabilities VK_KHR_external_fence_capabilities VK_KHR_get_physical_device_properties2 VK_KHR_get_surface_capabilities2 VK_KHR_device_group_creation VK_EXT_swapchain_colorspace VK_EXT_debug_report VK_EXT_debug_utils VK_KHR_portability_enumeration VK_LUNARG_direct_driver_loading Creating vulkan instance with extensions: VK_KHR_get_physical_device_properties2 VK_KHR_surface VK_EXT_swapchain_colorspace VK_KHR_external_memory_capabilities VK_KHR_external_semaphore_capabilities VK_KHR_get_surface_capabilities2 VK_KHR_portability_enumeration Probing for vulkan devices: GPU 0: Intel(R) UHD Graphics 630 v1.3.215 (integrated) uuid: 86:80:98:3E:02:00:00:00:00:00:00:00:00:00:00:00 Vulkan device properties: Device Name: Intel(R) UHD Graphics 630 Device ID: 8086:3e98 Device UUID: 86:80:98:3E:02:00:00:00:00:00:00:00:00:00:00:00 Driver version: 194842 API version: 1.3.215 Queue families supported by device: 0: flags 0xf num 1 Using graphics queue 0 Available device extensions: VK_EXT_full_screen_exclusive VK_KHR_swapchain VK_KHR_external_memory VK_KHR_external_memory_win32 VK_EXT_external_memory_host VK_KHR_external_semaphore VK_KHR_external_semaphore_win32 VK_KHR_external_fence VK_KHR_external_fence_win32 VK_KHR_timeline_semaphore VK_KHR_win32_keyed_mutex VK_KHR_get_memory_requirements2 VK_KHR_bind_memory2 VK_KHR_dedicated_allocation VK_KHR_sampler_mirror_clamp_to_edge VK_KHR_maintenance1 VK_KHR_maintenance2 VK_KHR_maintenance3 VK_KHR_maintenance4 VK_KHR_synchronization2 VK_KHR_shader_draw_parameters VK_KHR_push_descriptor VK_KHR_descriptor_update_template VK_KHR_multiview VK_KHR_shader_float16_int8 VK_KHR_shader_float_controls VK_KHR_16bit_storage VK_KHR_8bit_storage 
VK_EXT_shader_subgroup_ballot VK_EXT_shader_subgroup_vote VK_KHR_storage_buffer_storage_class VK_KHR_variable_pointers VK_KHR_relaxed_block_layout VK_EXT_sampler_filter_minmax VK_KHR_device_group VK_EXT_ycbcr_2plane_444_formats VK_EXT_4444_formats VK_EXT_post_depth_coverage VK_EXT_shader_viewport_index_layer VK_EXT_shader_stencil_export VK_EXT_conservative_rasterization VK_EXT_sample_locations VK_KHR_draw_indirect_count VK_EXT_multi_draw VK_KHR_image_format_list VK_EXT_vertex_attribute_divisor VK_EXT_descriptor_indexing VK_EXT_inline_uniform_block VK_KHR_create_renderpass2 VK_KHR_dynamic_rendering VK_KHR_swapchain_mutable_format VK_KHR_depth_stencil_resolve VK_KHR_driver_properties VK_KHR_vulkan_memory_model VK_EXT_conditional_rendering VK_EXT_hdr_metadata VK_EXT_depth_clip_enable VK_EXT_depth_clip_control VK_EXT_scalar_block_layout VK_KHR_imageless_framebuffer VK_KHR_buffer_device_address VK_EXT_buffer_device_address VK_EXT_host_query_reset VK_KHR_performance_query VK_NV_device_diagnostic_checkpoints VK_KHR_separate_depth_stencil_layouts VK_KHR_shader_clock VK_KHR_spirv_1_4 VK_KHR_uniform_buffer_standard_layout VK_EXT_separate_stencil_usage VK_EXT_fragment_shader_interlock VK_EXT_index_type_uint8 VK_EXT_primitive_topology_list_restart VK_KHR_shader_subgroup_extended_types VK_EXT_line_rasterization VK_EXT_memory_budget VK_EXT_memory_priority VK_EXT_texel_buffer_alignment VK_INTEL_performance_query VK_EXT_subgroup_size_control VK_EXT_shader_demote_to_helper_invocation VK_EXT_pipeline_creation_feedback VK_EXT_pipeline_creation_cache_control VK_KHR_pipeline_executable_properties VK_EXT_transform_feedback VK_EXT_provoking_vertex VK_EXT_extended_dynamic_state VK_EXT_extended_dynamic_state2 VK_EXT_vertex_input_dynamic_state VK_EXT_custom_border_color VK_EXT_robustness2 VK_EXT_image_robustness VK_EXT_calibrated_timestamps VK_KHR_shader_integer_dot_product VK_KHR_shader_subgroup_uniform_control_flow VK_KHR_shader_terminate_invocation VK_KHR_workgroup_memory_explicit_layout 
VK_EXT_shader_atomic_float VK_KHR_copy_commands2 VK_KHR_shader_non_semantic_info VK_KHR_zero_initialize_workgroup_memory VK_EXT_shader_atomic_float2 VK_EXT_global_priority VK_EXT_global_priority_query VK_KHR_global_priority VK_KHR_format_feature_flags2 VK_EXT_color_write_enable VK_EXT_private_data VK_EXT_image_2d_view_of_3d Creating vulkan device with extensions: VK_KHR_swapchain VK_KHR_push_descriptor VK_KHR_external_memory_win32 VK_EXT_external_memory_host VK_KHR_external_semaphore_win32 VK_EXT_hdr_metadata VK_EXT_full_screen_exclusive Memory heaps supported by device: 0: flags 0x1 size 3986M Memory types supported by device: 0: flags 0x1 heap 0 1: flags 0x7 heap 0 2: flags 0xf heap 0 Memory summary: 0 used 0 res 0 alloc, efficiency 100.00%, utilization 100.00%, max page: 249M shaderc SPIR-V version 1.6 rev 1 Initialized SPIR-V compiler 'shaderc' Handle type VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT (0x10) is not exportable Handle type VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT (0x4) is not exportable Handle type VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT (0x4) is not importable Handle type VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT (0x10) is not exportable Tex caps for VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT (0x4) unsupported: VK_ERROR_FORMAT_NOT_SUPPORTED Tex caps for VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT (0x4) unsupported: VK_ERROR_FORMAT_NOT_SUPPORTED Minimum texel alignment: 48 GPU information: GLSL version: 450 (vulkan) max_shmem_size: 32768 max_group_threads: 1024 max_group_size[0]: 1024 max_group_size[1]: 1024 max_group_size[2]: 64 subgroup_size: 32 min_gather_offset: -32 max_gather_offset: 31 Limits: thread_safe: 1 callbacks: 1 max_buf_size: 4180641792 max_ubo_size: 134217724 max_ssbo_size: 1073741820 max_vbo_size: 4180641792 max_mapped_size: 4180641792 max_buffer_texels: 134217728 align_host_ptr: 4096 host_cached: 1 max_tex_1d_dim: 16384 max_tex_2d_dim: 16384 max_tex_3d_dim: 2048 
blittable_1d_3d: 1 buf_transfer: 1 align_tex_xfer_pitch: 64 align_tex_xfer_offset: 64 max_variable_comps: 0 max_constants: 18446744073709551615 max_pushc_size: 256 align_vertex_stride: 1 max_dispatch[0]: 65536 max_dispatch[1]: 65536 max_dispatch[2]: 65536 fragment_queues: 1 compute_queues: 1 External API interop: UUID: 86:80:98:3E:02:00:00:00:00:00:00:00:00:00:00:00 PCI: 0000:00:00:0 buf export caps: 0x2 buf import caps: 0x12 tex export caps: 0x2 tex import caps: 0x12 sync export caps: 0x2 sync import caps: 0x0 GPU texture formats: NAME TYPE SIZE COMP CAPS EMU DEPTH HOST_BITS GLSL_TYPE GLSL_FMT FOURCC r8 UNORM 1 R SsLRbBVutHWG n {8 0 0 0 } {8 0 0 0 } float r8 R8 r8s SNORM 1 R SsLRbBVutHWG n {8 0 0 0 } {8 0 0 0 } float r8_snorm rg8 UNORM 2 RG SsLRbBVutHWG n {8 8 0 0 } {8 8 0 0 } vec2 rg8 GR88 rg8s SNORM 2 RG SsLRbBVutHWG n {8 8 0 0 } {8 8 0 0 } vec2 rg8_snorm rgba8 UNORM 4 RGBA SsLRbBVutHWG n {8 8 8 8 } {8 8 8 8 } vec4 rgba8 AB24 rgba8s SNORM 4 RGBA SsLRbBVutHWG n {8 8 8 8 } {8 8 8 8 } vec4 rgba8_snorm bgra8 UNORM 4 BGRA SsLRbBVutHWG n {8 8 8 8 } {8 8 8 8 } vec4 rgba8 AR24 rgb10a2 UNORM 4 RGBA SsLRbBVutHWG n {10 10 10 2 } {10 10 10 2 } vec4 rgb10_a2 AB30 r16 UNORM 2 R SsLRbBVutHWG n {16 0 0 0 } {16 0 0 0 } float r16 R16 r16hf FLOAT 2 R SsLRbBVutHWG n {16 0 0 0 } {16 0 0 0 } float r16f r16s SNORM 2 R SsLRbBVutHWG n {16 0 0 0 } {16 0 0 0 } float r16_snorm rg16 UNORM 4 RG SsLRbBVutHWG n {16 16 0 0 } {16 16 0 0 } vec2 rg16 GR32 rg16hf FLOAT 4 RG SsLRbBVutHWG n {16 16 0 0 } {16 16 0 0 } vec2 rg16f rg16s SNORM 4 RG SsLRbBVutHWG n {16 16 0 0 } {16 16 0 0 } vec2 rg16_snorm rgba16 UNORM 8 RGBA SsLRbBVutHWG n {16 16 16 16} {16 16 16 16} vec4 rgba16 rgba16hf FLOAT 8 RGBA SsLRbBVutHWG n {16 16 16 16} {16 16 16 16} vec4 rgba16f AB4H rgba16s SNORM 8 RGBA SsLRbBVutHWG n {16 16 16 16} {16 16 16 16} vec4 rgba16_snorm r32f FLOAT 4 R SsLRbBVutHWG n {32 0 0 0 } {32 0 0 0 } float r32f rg32f FLOAT 8 RG SsLRbBVutHWG n {32 32 0 0 } {32 32 0 0 } vec2 rg32f rgba32f FLOAT 16 RGBA SsLRbBVutHWG 
n {32 32 32 32} {32 32 32 32} vec4 rgba32f r8i SINT 1 R Ss-R-BVutHWG n {8 0 0 0 } {8 0 0 0 } int r8i r8u UINT 1 R Ss-R-BVutHWG n {8 0 0 0 } {8 0 0 0 } uint r8ui rg8i SINT 2 RG Ss-R-BVutHWG n {8 8 0 0 } {8 8 0 0 } ivec2 rg8i rg8u UINT 2 RG Ss-R-BVutHWG n {8 8 0 0 } {8 8 0 0 } uvec2 rg8ui rgba8i SINT 4 RGBA Ss-R-BVutHWG n {8 8 8 8 } {8 8 8 8 } ivec4 rgba8i rgba8u UINT 4 RGBA Ss-R-BVutHWG n {8 8 8 8 } {8 8 8 8 } uvec4 rgba8ui rgb10a2u UINT 4 RGBA Ss-R-BVutHWG n {10 10 10 2 } {10 10 10 2 } uvec4 rgb10_a2ui r16i SINT 2 R Ss-R-BVutHWG n {16 0 0 0 } {16 0 0 0 } int r16i r16u UINT 2 R Ss-R-BVutHWG n {16 0 0 0 } {16 0 0 0 } uint r16ui rg16i SINT 4 RG Ss-R-BVutHWG n {16 16 0 0 } {16 16 0 0 } ivec2 rg16i rg16u UINT 4 RG Ss-R-BVutHWG n {16 16 0 0 } {16 16 0 0 } uvec2 rg16ui rgba16i SINT 8 RGBA Ss-R-BVutHWG n {16 16 16 16} {16 16 16 16} ivec4 rgba16i rgba16u UINT 8 RGBA Ss-R-BVutHWG n {16 16 16 16} {16 16 16 16} uvec4 rgba16ui r32i SINT 4 R Ss-R-BVutHWG n {32 0 0 0 } {32 0 0 0 } int r32i r32u UINT 4 R Ss-R-BVutHWG n {32 0 0 0 } {32 0 0 0 } uint r32ui rg32i SINT 8 RG Ss-R-BVutHWG n {32 32 0 0 } {32 32 0 0 } ivec2 rg32i rg32u UINT 8 RG Ss-R-BVutHWG n {32 32 0 0 } {32 32 0 0 } uvec2 rg32ui rgba32i SINT 16 RGBA Ss-R-BVutHWG n {32 32 32 32} {32 32 32 32} ivec4 rgba32i rgba32u UINT 16 RGBA Ss-R-BVutHWG n {32 32 32 32} {32 32 32 32} uvec4 rgba32ui bgr10a2 UNORM 4 BGRA S-LRbBVu-H-G n {10 10 10 2 } {10 10 10 2 } vec4 AR30 a1bgr5 UNORM 2 ABGR S-LRbB-u-H-G n {1 5 5 5 } {1 5 5 5 } vec4 RA15 argb4 UNORM 2 ARGB S-LRbB-u-H-G n {4 4 4 4 } {4 4 4 4 } vec4 BA12 abgr4 UNORM 2 ABGR S-LRbB-u-H-G n {4 4 4 4 } {4 4 4 4 } vec4 RA12 bgr5a1 UNORM 2 BGRA S-LRbB-u-H-G n {5 5 5 1 } {5 5 5 1 } vec4 AR15 bgr565 UNORM 2 BGR S-LRbB-u-H-G n {5 6 5 0 } {5 6 5 0 } vec3 RG16 rgb8 UNORM 3 RGB S-L---Vu-H-G n {8 8 8 0 } {8 8 8 0 } vec3 BG24 rgb8s SNORM 3 RGB S-L---Vu-H-G n {8 8 8 0 } {8 8 8 0 } vec3 rgb16 UNORM 6 RGB S-L---Vu-H-G n {16 16 16 0 } {16 16 16 0 } vec3 rgb16hf FLOAT 6 RGB S-L---Vu-H-G n {16 16 16 0 } {16 
16 16 0 } vec3 rgb16s SNORM 6 RGB S-L---Vu-H-G n {16 16 16 0 } {16 16 16 0 } vec3 a1rgb5 UNORM 2 ARGB S-L----u-H-G n {1 5 5 5 } {1 5 5 5 } vec4 BA15 rgb565 UNORM 2 RGB S-L----u-H-G n {5 6 5 0 } {5 6 5 0 } vec3 BG16 rgb32f FLOAT 12 RGB S-----Vu-H-G n {32 32 32 0 } {32 32 32 0 } vec3 rgb32i SINT 12 RGB S-----Vu-H-G n {32 32 32 0 } {32 32 32 0 } ivec3 rgb32u UINT 12 RGB S-----Vu-H-G n {32 32 32 0 } {32 32 32 0 } uvec3 rgb8i SINT 3 RGB S-----V--H-G n {8 8 8 0 } {8 8 8 0 } ivec3 rgb8u UINT 3 RGB S-----V--H-G n {8 8 8 0 } {8 8 8 0 } uvec3 rgb10a2i SINT 4 RGBA ------V--H-- n {10 10 10 2 } {10 10 10 2 } ivec4 rgb10a2s SNORM 4 RGBA ------V--H-- n {10 10 10 2 } {10 10 10 2 } vec4 bgr10a2i SINT 4 BGRA ------V--H-- n {10 10 10 2 } {10 10 10 2 } ivec4 bgr10a2s SNORM 4 BGRA ------V--H-- n {10 10 10 2 } {10 10 10 2 } vec4 bgr10a2u UINT 4 BGRA ------V--H-- n {10 10 10 2 } {10 10 10 2 } uvec4 rgb16i SINT 6 RGB ------V--H-- n {16 16 16 0 } {16 16 16 0 } ivec3 rgb16u UINT 6 RGB ------V--H-- n {16 16 16 0 } {16 16 16 0 } uvec3 gr4 UNORM 1 GR ---------H-- n {4 4 0 0 } {4 4 0 0 } bgr8 UNORM 3 BGR ---------H-- n {8 8 8 0 } {8 8 8 0 } RG24 bgr8i SINT 3 BGR ---------H-- n {8 8 8 0 } {8 8 8 0 } bgr8u UINT 3 BGR ---------H-- n {8 8 8 0 } {8 8 8 0 } bgra8i SINT 4 BGRA ---------H-- n {8 8 8 8 } {8 8 8 8 } bgra8u UINT 4 BGRA ---------H-- n {8 8 8 8 } {8 8 8 8 } rx10 UNORM 2 R ---------H-- n {10 0 0 0 } {16 0 0 0 } rxgx10 UNORM 4 RG ---------H-- n {10 10 0 0 } {16 16 0 0 } rxgxbxax10 UNORM 8 RGBA ---------H-- n {10 10 10 10} {16 16 16 16} AB10 rx12 UNORM 2 R ---------H-- n {12 0 0 0 } {16 0 0 0 } rxgx12 UNORM 4 RG ---------H-- n {12 12 0 0 } {16 16 0 0 } rxgxbxax12 UNORM 8 RGBA ---------H-- n {12 12 12 12} {16 16 16 16} r16f FLOAT 4 R SsLRbB---HWG y {16 0 0 0 } {32 0 0 0 } r16f rg16f FLOAT 8 RG SsLRbB---HWG y {16 16 0 0 } {32 32 0 0 } rg16f rgba16f FLOAT 16 RGBA SsLRbB---HWG y {16 16 16 16} {32 32 32 32} rgba16f rgb16f FLOAT 12 RGB S-L------H-G y {16 16 16 0 } {32 32 32 0 } g8_b8_r8_420 UNORM 0 
------------ n {8 8 8 0 } {0 0 0 0 } YU12 g8_b8_r8_422 UNORM 0 ------------ n {8 8 8 0 } {0 0 0 0 } YU16 g8_b8_r8_444 UNORM 0 ------------ n {8 8 8 0 } {0 0 0 0 } YU24 g8_br8_420 UNORM 0 ------------ n {8 8 8 0 } {0 0 0 0 } NV12 g8_br8_422 UNORM 0 ------------ n {8 8 8 0 } {0 0 0 0 } NV16 g8_br8_444 UNORM 0 ------------ n {8 8 8 0 } {0 0 0 0 } NV24 gx10_bx10_rx10_420 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } gx10_bx10_rx10_422 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } gx10_bx10_rx10_444 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } Q410 gx10_bxrx10_420 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } P010 gx10_bxrx10_422 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } P210 gx10_bxrx10_444 UNORM 0 ------------ n {10 10 10 0 } {0 0 0 0 } gx12_bx12_rx12_420 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } gx12_bx12_rx12_422 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } gx12_bx12_rx12_444 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } gx12_bxrx12_420 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } P012 gx12_bxrx12_422 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } gx12_bxrx12_444 UNORM 0 ------------ n {12 12 12 0 } {0 0 0 0 } g16_b16_r16_420 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } g16_b16_r16_422 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } g16_b16_r16_444 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } g16_br16_420 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } P016 g16_br16_422 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } g16_br16_444 UNORM 0 ------------ n {16 16 16 0 } {0 0 0 0 } Registering hook pass: FidelityFX Super Resolution v1.0.2 (EASU) Registering hook pass: FidelityFX Super Resolution v1.0.2 (RCAS) Loaded user shader: [ 1] // Copyright (c) 2021 Advanced Micro Devices, Inc. All rights reserved. 
[ 2] // [ 3] // Permission is hereby granted, free of charge, to any person obtaining a copy [ 4] // of this software and associated documentation files (the "Software"), to deal [ 5] // in the Software without restriction, including without limitation the rights [ 6] // to use, copy, modify, merge, publish, distribute, sublicense, and/or sell [ 7] // copies of the Software, and to permit persons to whom the Software is [ 8] // furnished to do so, subject to the following conditions: [ 9] // [ 10] // The above copyright notice and this permission notice shall be included in [ 11] // all copies or substantial portions of the Software. [ 12] // [ 13] // THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR [ 14] // IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, [ 15] // FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE [ 16] // AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER [ 17] // LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, [ 18] // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN [ 19] // THE SOFTWARE. [ 20] [ 21] // FidelityFX FSR v1.0.2 by AMD [ 22] // ported to mpv by agyild [ 23] [ 24] // Changelog [ 25] // Made it compatible with pre-OpenGL 4.0 renderers [ 26] // Made it directly operate on LUMA plane, since the original shader was operating on LUMA by deriving it from RGB. This should cause a major increase in performance, especially on OpenGL 4.0+ renderers (4+2 texture lookups vs. 12+5) [ 27] // Removed transparency preservation mechanism since the alpha channel is a separate source plane than LUMA [ 28] // Added optional performance-saving lossy optimizations to EASU (Credit: atyuwen, https://atyuwen.github.io/posts/optimizing-fsr/) [ 29] // [ 30] // Notes [ 31] // Per AMD's guidelines only upscales content up to 4x (e.g., 1080p -> 2160p, 720p -> 1440p etc.) 
and everything else in between, [ 32] // that means FSR will scale up to 4x at maximum, and any further scaling will be processed by mpv's scalers [ 33] [ 34] //!HOOK LUMA [ 35] //!BIND HOOKED [ 36] //!SAVE EASUTEX [ 37] //!DESC FidelityFX Super Resolution v1.0.2 (EASU) [ 38] //!WHEN OUTPUT.w OUTPUT.h * LUMA.w LUMA.h * / 1.0 > [ 39] //!WIDTH OUTPUT.w OUTPUT.w LUMA.w 2 * < * LUMA.w 2 * OUTPUT.w LUMA.w 2 * > * + OUTPUT.w OUTPUT.w LUMA.w 2 * = * + [ 40] //!HEIGHT OUTPUT.h OUTPUT.h LUMA.h 2 * < * LUMA.h 2 * OUTPUT.h LUMA.h 2 * > * + OUTPUT.h OUTPUT.h LUMA.h 2 * = * + [ 41] //!COMPONENTS 1 [ 42] [ 43] // User variables - EASU [ 44] #define FSR_PQ 0 // Whether the source content has PQ gamma or not. Needs to be set to the same value for both passes. 0 or 1. [ 45] #define FSR_EASU_DERING 1 // If set to 0, disables deringing for a small increase in performance. 0 or 1. [ 46] #define FSR_EASU_SIMPLE_ANALYSIS 0 // If set to 1, uses a simpler single-pass direction and length analysis for an increase in performance. 0 or 1. [ 47] #define FSR_EASU_QUIT_EARLY 0 // If set to 1, uses bilinear filtering for non-edge pixels and skips EASU on those regions for an increase in performance. 0 or 1. 
[ 48] [ 49] // Shader code [ 50] [ 51] #ifndef FSR_EASU_DIR_THRESHOLD [ 52] #if (FSR_EASU_QUIT_EARLY == 1) [ 53] #define FSR_EASU_DIR_THRESHOLD 64.0 [ 54] #elif (FSR_EASU_QUIT_EARLY == 0) [ 55] #define FSR_EASU_DIR_THRESHOLD 32768.0 [ 56] #endif [ 57] #endif [ 58] [ 59] float APrxLoRcpF1(float a) { [ 60] return uintBitsToFloat(uint(0x7ef07ebb) - floatBitsToUint(a)); [ 61] } [ 62] [ 63] float APrxLoRsqF1(float a) { [ 64] return uintBitsToFloat(uint(0x5f347d74) - (floatBitsToUint(a) >> uint(1))); [ 65] } [ 66] [ 67] float AMin3F1(float x, float y, float z) { [ 68] return min(x, min(y, z)); [ 69] } [ 70] [ 71] float AMax3F1(float x, float y, float z) { [ 72] return max(x, max(y, z)); [ 73] } [ 74] [ 75] #if (FSR_PQ == 1) [ 76] [ 77] float ToGamma2(float a) { [ 78] return pow(a, 4.0); [ 79] } [ 80] [ 81] #endif [ 82] [ 83] // Filtering for a given tap for the scalar. [ 84] void FsrEasuTap( [ 85] inout float aC, // Accumulated color, with negative lobe. [ 86] inout float aW, // Accumulated weight. [ 87] vec2 off, // Pixel offset from resolve position to tap. [ 88] vec2 dir, // Gradient direction. [ 89] vec2 len, // Length. [ 90] float lob, // Negative lobe strength. [ 91] float clp, // Clipping point. [ 92] float c){ // Tap color. [ 93] // Rotate offset by direction. [ 94] vec2 v; [ 95] v.x = (off.x * ( dir.x)) + (off.y * dir.y); [ 96] v.y = (off.x * (-dir.y)) + (off.y * dir.x); [ 97] // Anisotropy. [ 98] v *= len; [ 99] // Compute distance^2. [100] float d2 = v.x * v.x + v.y * v.y; [101] // Limit to the window as at corner, 2 taps can easily be outside. [102] d2 = min(d2, clp); [103] // Approximation of lancos2 without sin() or rcp(), or sqrt() to get x. [104] // (25/16 * (2/5 * x^2 - 1)^2 - (25/16 - 1)) * (1/4 * x^2 - 1)^2 [105] // |_______________________________________| |_______________| [106] // base window [107] // The general form of the 'base' is, [108] // (a*(b*x^2-1)^2-(a-1)) [109] // Where 'a=1/(2*b-b^2)' and 'b' moves around the negative lobe. 
[110] float wB = float(2.0 / 5.0) * d2 + -1.0; [111] float wA = lob * d2 + -1.0; [112] wB *= wB; [113] wA *= wA; [114] wB = float(25.0 / 16.0) * wB + float(-(25.0 / 16.0 - 1.0)); [115] float w = wB * wA; [116] // Do weighted average. [117] aC += c * w; [118] aW += w; [119] } [120] [121] // Accumulate direction and length. [122] void FsrEasuSet( [123] inout vec2 dir, [124] inout float len, [125] vec2 pp, [126] #if (FSR_EASU_SIMPLE_ANALYSIS == 1) [127] float b, float c, [128] float i, float j, float f, float e, [129] float k, float l, float h, float g, [130] float o, float n [131] #elif (FSR_EASU_SIMPLE_ANALYSIS == 0) [132] bool biS, bool biT, bool biU, bool biV, [133] float lA, float lB, float lC, float lD, float lE [134] #endif [135] ){ [136] // Compute bilinear weight, branches factor out as predicates are compiler time immediates. [137] // s t [138] // u v [139] #if (FSR_EASU_SIMPLE_ANALYSIS == 1) [140] vec4 w = vec4(0.0); [141] w.x = (1.0 - pp.x) * (1.0 - pp.y); [142] w.y = pp.x * (1.0 - pp.y); [143] w.z = (1.0 - pp.x) * pp.y; [144] w.w = pp.x * pp.y; [145] [146] float lA = dot(w, vec4(b, c, f, g)); [147] float lB = dot(w, vec4(e, f, i, j)); [148] float lC = dot(w, vec4(f, g, j, k)); [149] float lD = dot(w, vec4(g, h, k, l)); [150] float lE = dot(w, vec4(j, k, n, o)); [151] #elif (FSR_EASU_SIMPLE_ANALYSIS == 0) [152] float w = 0.0; [153] if (biS) [154] w = (1.0 - pp.x) * (1.0 - pp.y); [155] if (biT) [156] w = pp.x * (1.0 - pp.y); [157] if (biU) [158] w = (1.0 - pp.x) * pp.y; [159] if (biV) [160] w = pp.x * pp.y; [161] #endif [162] // Direction is the '+' diff. [163] // a [164] // b c d [165] // e [166] // Then takes magnitude from abs average of both sides of 'c'. [167] // Length converts gradient reversal to 0, smoothly to non-reversal at 1, shaped, then adding horz and vert terms. 
[168] float dc = lD - lC; [169] float cb = lC - lB; [170] float lenX = max(abs(dc), abs(cb)); [171] lenX = APrxLoRcpF1(lenX); [172] float dirX = lD - lB; [173] lenX = clamp(abs(dirX) * lenX, 0.0, 1.0); [174] lenX *= lenX; [175] // Repeat for the y axis. [176] float ec = lE - lC; [177] float ca = lC - lA; [178] float lenY = max(abs(ec), abs(ca)); [179] lenY = APrxLoRcpF1(lenY); [180] float dirY = lE - lA; [181] lenY = clamp(abs(dirY) * lenY, 0.0, 1.0); [182] lenY *= lenY; [183] #if (FSR_EASU_SIMPLE_ANALYSIS == 1) [184] len = lenX + lenY; [185] dir = vec2(dirX, dirY); [186] #elif (FSR_EASU_SIMPLE_ANALYSIS == 0) [187] dir += vec2(dirX, dirY) * w; [188] len += dot(vec2(w), vec2(lenX, lenY)); [189] #endif [190] } [191] [192] vec4 hook() { [193] // Result [194] vec4 pix = vec4(0.0, 0.0, 0.0, 1.0); [195] [196] //------------------------------------------------------------------------------------------------------------------------------ [197] // +---+---+ [198] // | | | [199] // +--(0)--+ [200] // | b | c | [201] // +---F---+---+---+ [202] // | e | f | g | h | [203] // +--(1)--+--(2)--+ [204] // | i | j | k | l | [205] // +---+---+---+---+ [206] // | n | o | [207] // +--(3)--+ [208] // | | | [209] // +---+---+ [210] // Get position of 'F'. [211] vec2 pp = HOOKED_pos * HOOKED_size - vec2(0.5); [212] vec2 fp = floor(pp); [213] pp -= fp; [214] //------------------------------------------------------------------------------------------------------------------------------ [215] // 12-tap kernel. [216] // b c [217] // e f g h [218] // i j k l [219] // n o [220] // Gather 4 ordering. [221] // a b [222] // r g [223] // Allowing dead-code removal to remove the 'z's. 
[224] #if (defined(HOOKED_gather) && (__VERSION__ >= 400 || (GL_ES && __VERSION__ >= 310))) [225] vec4 bczzL = HOOKED_gather(vec2((fp + vec2(1.0, -1.0)) * HOOKED_pt), 0); [226] vec4 ijfeL = HOOKED_gather(vec2((fp + vec2(0.0, 1.0)) * HOOKED_pt), 0); [227] vec4 klhgL = HOOKED_gather(vec2((fp + vec2(2.0, 1.0)) * HOOKED_pt), 0); [228] vec4 zzonL = HOOKED_gather(vec2((fp + vec2(1.0, 3.0)) * HOOKED_pt), 0); [229] #else [230] // pre-OpenGL 4.0 compatibility [231] float b = HOOKED_tex(vec2((fp + vec2(0.5, -0.5)) * HOOKED_pt)).r; [232] float c = HOOKED_tex(vec2((fp + vec2(1.5, -0.5)) * HOOKED_pt)).r; [233] [234] float e = HOOKED_tex(vec2((fp + vec2(-0.5, 0.5)) * HOOKED_pt)).r; [235] float f = HOOKED_tex(vec2((fp + vec2( 0.5, 0.5)) * HOOKED_pt)).r; [236] float g = HOOKED_tex(vec2((fp + vec2( 1.5, 0.5)) * HOOKED_pt)).r; [237] float h = HOOKED_tex(vec2((fp + vec2( 2.5, 0.5)) * HOOKED_pt)).r; [238] [239] float i = HOOKED_tex(vec2((fp + vec2(-0.5, 1.5)) * HOOKED_pt)).r; [240] float j = HOOKED_tex(vec2((fp + vec2( 0.5, 1.5)) * HOOKED_pt)).r; [241] float k = HOOKED_tex(vec2((fp + vec2( 1.5, 1.5)) * HOOKED_pt)).r; [242] float l = HOOKED_tex(vec2((fp + vec2( 2.5, 1.5)) * HOOKED_pt)).r; [243] [244] float n = HOOKED_tex(vec2((fp + vec2(0.5, 2.5) ) * HOOKED_pt)).r; [245] float o = HOOKED_tex(vec2((fp + vec2(1.5, 2.5) ) * HOOKED_pt)).r; [246] [247] vec4 bczzL = vec4(b, c, 0.0, 0.0); [248] vec4 ijfeL = vec4(i, j, f, e); [249] vec4 klhgL = vec4(k, l, h, g); [250] vec4 zzonL = vec4(0.0, 0.0, o, n); [251] #endif [252] //------------------------------------------------------------------------------------------------------------------------------ [253] // Rename. 
[254] float bL = bczzL.x; [255] float cL = bczzL.y; [256] float iL = ijfeL.x; [257] float jL = ijfeL.y; [258] float fL = ijfeL.z; [259] float eL = ijfeL.w; [260] float kL = klhgL.x; [261] float lL = klhgL.y; [262] float hL = klhgL.z; [263] float gL = klhgL.w; [264] float oL = zzonL.z; [265] float nL = zzonL.w; [266] [267] #if (FSR_PQ == 1) [268] // Not the most performance-friendly solution, but should work until mpv adds proper gamma transformation functions for shaders [269] bL = ToGamma2(bL); [270] cL = ToGamma2(cL); [271] iL = ToGamma2(iL); [272] jL = ToGamma2(jL); [273] fL = ToGamma2(fL); [274] eL = ToGamma2(eL); [275] kL = ToGamma2(kL); [276] lL = ToGamma2(lL); [277] hL = ToGamma2(hL); [278] gL = ToGamma2(gL); [279] oL = ToGamma2(oL); [280] nL = ToGamma2(nL); [281] #endif [282] [283] // Accumulate for bilinear interpolation. [284] vec2 dir = vec2(0.0); [285] float len = 0.0; [286] #if (FSR_EASU_SIMPLE_ANALYSIS == 1) [287] FsrEasuSet(dir, len, pp, bL, cL, iL, jL, fL, eL, kL, lL, hL, gL, oL, nL); [288] #elif (FSR_EASU_SIMPLE_ANALYSIS == 0) [289] FsrEasuSet(dir, len, pp, true, false, false, false, bL, eL, fL, gL, jL); [290] FsrEasuSet(dir, len, pp, false, true, false, false, cL, fL, gL, hL, kL); [291] FsrEasuSet(dir, len, pp, false, false, true, false, fL, iL, jL, kL, nL); [292] FsrEasuSet(dir, len, pp, false, false, false, true, gL, jL, kL, lL, oL); [293] #endif [294] //------------------------------------------------------------------------------------------------------------------------------ [295] // Normalize with approximation, and cleanup close to zero. 
[296] vec2 dir2 = dir * dir; [297] float dirR = dir2.x + dir2.y; [298] bool zro = dirR < float(1.0 / FSR_EASU_DIR_THRESHOLD); [299] dirR = APrxLoRsqF1(dirR); [300] #if (FSR_EASU_QUIT_EARLY == 1) [301] if (zro) { [302] vec4 w = vec4(0.0); [303] w.x = (1.0 - pp.x) * (1.0 - pp.y); [304] w.y = pp.x * (1.0 - pp.y); [305] w.z = (1.0 - pp.x) * pp.y; [306] w.w = pp.x * pp.y; [307] [308] pix.r = clamp(dot(w, vec4(fL, gL, jL, kL)), 0.0, 1.0); [309] return pix; [310] } [311] #elif (FSR_EASU_QUIT_EARLY == 0) [312] dirR = zro ? 1.0 : dirR; [313] dir.x = zro ? 1.0 : dir.x; [314] #endif [315] dir *= vec2(dirR); [316] // Transform from {0 to 2} to {0 to 1} range, and shape with square. [317] len = len * 0.5; [318] len *= len; [319] // Stretch kernel {1.0 vert|horz, to sqrt(2.0) on diagonal}. [320] float stretch = (dir.x * dir.x + dir.y * dir.y) * APrxLoRcpF1(max(abs(dir.x), abs(dir.y))); [321] // Anisotropic length after rotation, [322] // x := 1.0 lerp to 'stretch' on edges [323] // y := 1.0 lerp to 2x on edges [324] vec2 len2 = vec2(1.0 + (stretch - 1.0) * len, 1.0 + -0.5 * len); [325] // Based on the amount of 'edge', [326] // the window shifts from +/-{sqrt(2.0) to slightly beyond 2.0}. [327] float lob = 0.5 + float((1.0 / 4.0 - 0.04) - 0.5) * len; [328] // Set distance^2 clipping point to the end of the adjustable window. 
[329] float clp = APrxLoRcpF1(lob);
[330] //------------------------------------------------------------------------------------------------------------------------------
[331] // Accumulation
[332] // b c
[333] // e f g h
[334] // i j k l
[335] // n o
[336] float aC = 0.0;
[337] float aW = 0.0;
[338] FsrEasuTap(aC, aW, vec2( 0.0,-1.0) - pp, dir, len2, lob, clp, bL); // b
[339] FsrEasuTap(aC, aW, vec2( 1.0,-1.0) - pp, dir, len2, lob, clp, cL); // c
[340] FsrEasuTap(aC, aW, vec2(-1.0, 1.0) - pp, dir, len2, lob, clp, iL); // i
[341] FsrEasuTap(aC, aW, vec2( 0.0, 1.0) - pp, dir, len2, lob, clp, jL); // j
[342] FsrEasuTap(aC, aW, vec2( 0.0, 0.0) - pp, dir, len2, lob, clp, fL); // f
[343] FsrEasuTap(aC, aW, vec2(-1.0, 0.0) - pp, dir, len2, lob, clp, eL); // e
[344] FsrEasuTap(aC, aW, vec2( 1.0, 1.0) - pp, dir, len2, lob, clp, kL); // k
[345] FsrEasuTap(aC, aW, vec2( 2.0, 1.0) - pp, dir, len2, lob, clp, lL); // l
[346] FsrEasuTap(aC, aW, vec2( 2.0, 0.0) - pp, dir, len2, lob, clp, hL); // h
[347] FsrEasuTap(aC, aW, vec2( 1.0, 0.0) - pp, dir, len2, lob, clp, gL); // g
[348] FsrEasuTap(aC, aW, vec2( 1.0, 2.0) - pp, dir, len2, lob, clp, oL); // o
[349] FsrEasuTap(aC, aW, vec2( 0.0, 2.0) - pp, dir, len2, lob, clp, nL); // n
[350] //------------------------------------------------------------------------------------------------------------------------------
[351] // Normalize and dering.
[352] pix.r = aC / aW;
[353] #if (FSR_EASU_DERING == 1)
[354] float min1 = min(AMin3F1(fL, gL, jL), kL);
[355] float max1 = max(AMax3F1(fL, gL, jL), kL);
[356] pix.r = clamp(pix.r, min1, max1);
[357] #endif
[358] pix.r = clamp(pix.r, 0.0, 1.0);
[359]
[360] return pix;
[361] }
[362]
[363] //!HOOK LUMA
[364] //!BIND EASUTEX
[365] //!DESC FidelityFX Super Resolution v1.0.2 (RCAS)
[366] //!WIDTH EASUTEX.w
[367] //!HEIGHT EASUTEX.h
[368] //!COMPONENTS 1
[369]
[370] // User variables - RCAS
[371] #define SHARPNESS 0.2 // Controls the amount of sharpening. The scale is {0.0 := maximum, to N>0, where N is the number of stops (halving) of the reduction of sharpness}. 0.0 to 2.0.
[372] #define FSR_RCAS_DENOISE 1 // If set to 1, lessens the sharpening on noisy areas. Can be disabled for better performance. 0 or 1.
[373] #define FSR_PQ 0 // Whether the source content has PQ gamma or not. Needs to be set to the same value for both passes. 0 or 1.
[374]
[375] // Shader code
[376]
[377] #define FSR_RCAS_LIMIT (0.25 - (1.0 / 16.0)) // This is set at the limit of providing unnatural results for sharpening.
[378]
[379] float APrxMedRcpF1(float a) {
[380] float b = uintBitsToFloat(uint(0x7ef19fff) - floatBitsToUint(a));
[381] return b * (-b * a + 2.0);
[382] }
[383]
[384] float AMax3F1(float x, float y, float z) {
[385] return max(x, max(y, z));
[386] }
[387]
[388] float AMin3F1(float x, float y, float z) {
[389] return min(x, min(y, z));
[390] }
[391]
[392] #if (FSR_PQ == 1)
[393]
[394] float FromGamma2(float a) {
[395] return sqrt(sqrt(a));
[396] }
[397]
[398] #endif
[399]
[400] vec4 hook() {
[401] // Algorithm uses minimal 3x3 pixel neighborhood.
[402] // b
[403] // d e f
[404] // h
[405] #if (defined(EASUTEX_gather) && (__VERSION__ >= 400 || (GL_ES && __VERSION__ >= 310)))
[406] vec3 bde = EASUTEX_gather(EASUTEX_pos + EASUTEX_pt * vec2(-0.5), 0).xyz;
[407] float b = bde.z;
[408] float d = bde.x;
[409] float e = bde.y;
[410]
[411] vec2 fh = EASUTEX_gather(EASUTEX_pos + EASUTEX_pt * vec2(0.5), 0).zx;
[412] float f = fh.x;
[413] float h = fh.y;
[414] #else
[415] float b = EASUTEX_texOff(vec2( 0.0, -1.0)).r;
[416] float d = EASUTEX_texOff(vec2(-1.0, 0.0)).r;
[417] float e = EASUTEX_tex(EASUTEX_pos).r;
[418] float f = EASUTEX_texOff(vec2(1.0, 0.0)).r;
[419] float h = EASUTEX_texOff(vec2(0.0, 1.0)).r;
[420] #endif
[421]
[422] // Min and max of ring.
[423] float mn1L = min(AMin3F1(b, d, f), h);
[424] float mx1L = max(AMax3F1(b, d, f), h);
[425]
[426] // Immediate constants for peak range.
[427] vec2 peakC = vec2(1.0, -1.0 * 4.0);
[428]
[429] // Limiters, these need to be high precision RCPs.
[430] float hitMinL = min(mn1L, e) / (4.0 * mx1L);
[431] float hitMaxL = (peakC.x - max(mx1L, e)) / (4.0 * mn1L + peakC.y);
[432] float lobeL = max(-hitMinL, hitMaxL);
[433] float lobe = max(float(-FSR_RCAS_LIMIT), min(lobeL, 0.0)) * exp2(-clamp(float(SHARPNESS), 0.0, 2.0));
[434]
[435] // Apply noise removal.
[436] #if (FSR_RCAS_DENOISE == 1)
[437] // Noise detection.
[438] float nz = 0.25 * b + 0.25 * d + 0.25 * f + 0.25 * h - e;
[439] nz = clamp(abs(nz) * APrxMedRcpF1(AMax3F1(AMax3F1(b, d, e), f, h) - AMin3F1(AMin3F1(b, d, e), f, h)), 0.0, 1.0);
[440] nz = -0.5 * nz + 1.0;
[441] lobe *= nz;
[442] #endif
[443]
[444] // Resolve, which needs the medium precision rcp approximation to avoid visible tonality changes.
[445] float rcpL = APrxMedRcpF1(4.0 * lobe + 1.0);
[446] vec4 pix = vec4(0.0, 0.0, 0.0, 1.0);
[447] pix.r = float((lobe * b + lobe * d + lobe * h + lobe * f + e) * rcpL);
[448] #if (FSR_PQ == 1)
[449] pix.r = FromGamma2(pix.r);
[450] #endif
[451]
[452] return pix;
[453] }
(Re)creating 1280x720x0 texture with format r16: ../../../../../src_packages/vs-placebo/src/shader.c:103
Allocating 7536640 memory of type 0x1 (id 0) in heap 0: ../../../../../src_packages/vs-placebo/src/shader.c:103
(Re)creating 1280x720x0 texture with format r16: ../../../../../src_packages/vs-placebo/src/shader.c:103
(Re)creating 1280x720x0 texture with format r16: ../../../../../src_packages/vs-placebo/src/shader.c:103
(Re)creating 1920x1080x0 texture with format rgb16: ../../../../../src_packages/vs-placebo/src/shader.c:122
Validation failed: !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE (../../../../../src_packages/libplacebo/src/gpu.c:234)
  Backtrace:
    #0 0x7ffd93ee602d in pl_tex_create+0x3cd (C:\Users\Blob\AppData\Roaming\VapourSynth\plugins64\libvs_placebo.dll+0x1602d) (0x240a5602d)
    #1 0x7ffd93ee6730 in pl_tex_recreate+0x200 (C:\Users\Blob\AppData\Roaming\VapourSynth\plugins64\libvs_placebo.dll+0x16730) (0x240a56730)
    #2 0x7ffd93ed6f8a in VapourSynthPluginInit+0x580a (C:\Users\Blob\AppData\Roaming\VapourSynth\plugins64\libvs_placebo.dll+0x6f8a) (0x240a46f8a)
    #3 0x7ffd93ed7343 in VapourSynthPluginInit+0x5bc3 (C:\Users\Blob\AppData\Roaming\VapourSynth\plugins64\libvs_placebo.dll+0x7343) (0x240a47343)
    #4 0x7ffdb9fe6171 in getVapourSynthAPI+0xb101 (C:\Program Files\Python311\Lib\site-packages\VapourSynth.dll+0xe6171) (0x1800e6171)
    #5 0x7ffdb9ff8649 in getVapourSynthAPI+0x1d5d9 (C:\Program Files\Python311\Lib\site-packages\VapourSynth.dll+0xf8649) (0x1800f8649)
    #6 0x7ffdb9ffa122 in getVapourSynthAPI+0x1f0b2 (C:\Program Files\Python311\Lib\site-packages\VapourSynth.dll+0xfa122) (0x1800fa122)
    #7 0x7ffe0edc9362 in recalloc+0xa2 (C:\WINDOWS\System32\ucrtbase.dll+0x29362) (0x180029362)
    #8 0x7ffe101026ac in BaseThreadInitThunk+0x1c (C:\WINDOWS\System32\KERNEL32.DLL+0x126ac) (0x1800126ac)
    #9 0x7ffe113ca9f7 in RtlUserThreadStart+0x27 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x5a9f7) (0x18005a9f7)
  for texture: ../../../../../src_packages/vs-placebo/src/shader.c:122
Waiting for remaining commands...
Memory heaps supported by device:
    0: flags 0x1 size 3986M
Memory types supported by device:
    0: flags 0x1 heap 0
    1: flags 0x7 heap 0
    2: flags 0xf heap 0
Memory pool 0:
    Compatible types: 0x7
    Optimal flags: 0x1
    Slab 0: f x 1840K: 0 used 0 res 7360K alloc from heap 0, efficiency 100.00% [../../../../../src_packages/vs-placebo/src/shader.c:103]
    Pool summary: 0 used 0 res 7360K alloc, efficiency 100.00%, utilization 0.00%
Memory summary: 0 used 0 res 7360K alloc, efficiency 100.00%, utilization 0.00%, max page: 249M
Freeing slab of size 7360K
```
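
Reading the tail of the log: the three 1280x720 `r16` textures created at `shader.c:103` go through, and the validation error follows the 1920x1080 `rgb16` texture requested at `shader.c:122`. The failed check (`!params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE`) suggests that this texture was requested as renderable in a format the device does not report as renderable. Below is a minimal Python sketch for pulling that pairing out of a longer log, assuming the log was saved as text with one message per line. The function name `failed_textures` and the regular expression are illustrative only; they are not part of vs-placebo or libplacebo, and attributing a failure to the most recent creation message is a heuristic.

```python
import re

# Matches libplacebo texture-creation messages such as
# "(Re)creating 1920x1080x0 texture with format rgb16: .../shader.c:122"
TEX_RE = re.compile(
    r"\(Re\)creating (\d+)x(\d+)x(\d+) texture with format (\w+): (\S+)"
)


def failed_textures(log_text):
    """Return (width, height, format, location) for the most recent
    texture-creation message preceding each 'Validation failed' message."""
    failures = []
    last = None  # most recent texture-creation message seen so far
    for line in log_text.splitlines():
        match = TEX_RE.search(line)
        if match:
            last = (
                int(match.group(1)),
                int(match.group(2)),
                match.group(4),
                match.group(5),
            )
        elif line.startswith("Validation failed") and last is not None:
            failures.append(last)
            last = None  # do not attribute later failures to the same texture
    return failures


# A few lines copied from the log above.
sample = """\
(Re)creating 1280x720x0 texture with format r16: ../../../../../src_packages/vs-placebo/src/shader.c:103
Allocating 7536640 memory of type 0x1 (id 0) in heap 0: ../../../../../src_packages/vs-placebo/src/shader.c:103
(Re)creating 1920x1080x0 texture with format rgb16: ../../../../../src_packages/vs-placebo/src/shader.c:122
Validation failed: !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE (../../../../../src_packages/libplacebo/src/gpu.c:234)
"""

print(failed_textures(sample))
```

On the sample lines this reports only the `rgb16` texture at `shader.c:122`, which matches the `for texture:` line in the error message.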

quietvoid commented 1 year ago

I'm not sure what's going on, or whether it's something that needs to be fixed in vs-placebo. It seems to be Intel-specific, at least.

dexeonify commented 1 year ago

Oh well, let's see if there's any progress on haasn/libplacebo#173. Unlike the OP though, I can get libplacebo to work in ffmpeg.

Lypheo / vs-placebo

Green blank output when using core.placebo.Shader #41
