nihui / realsr-ncnn-vulkan

RealSR super resolution implemented with ncnn library
MIT License
1.11k stars, 113 forks

Upscaling slow and produces noise only #2

Closed: Felixkruemel closed this issue 4 years ago

Felixkruemel commented 4 years ago

If I try to upscale any PNG picture with RealSR ncnn, it is unusably slow at around 10 s/image (SRMD does 2 fps) and produces these results: output. I updated the GPU drivers to the newest version (445.87), but the result is still the same. What is going wrong?

My system config: Windows 10, RTX 2060 Super, Ryzen 7 3700X, 32 GB DDR4

Most of the upscaling time is spent decoding the picture (which is where I think the problem lies). The actual upscale goes really fast. @nihui

skerit commented 4 years ago

I had a similar issue where my rgba64be input PNGs were being converted to transparent PNGs. I lowered the pixel format to rgb8 (8 bits per color), and that fixed it. Converting the input to a JPG with 100% quality might also work.
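To check whether an input falls into the case described above (16 bits per channel, or an alpha channel), the PNG header can be inspected before upscaling. The following is a minimal sketch using only the Python standard library; the function names `read_png_header`, `needs_conversion` and `describe_file` are hypothetical helpers made up for illustration and are not part of realsr-ncnn-vulkan. It only reads the IHDR chunk and does not look at palette transparency (tRNS).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Colour type codes as defined for the IHDR chunk in the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "rgb",
    3: "palette",
    4: "grayscale+alpha",
    6: "rgba",
}


def read_png_header(data: bytes) -> dict:
    """Parse the IHDR chunk from the first 33 bytes of a PNG file."""
    if len(data) < 33 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("IHDR chunk missing or malformed")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color": COLOR_TYPES.get(color_type, "unknown"),
        "has_alpha": color_type in (4, 6),
    }


def needs_conversion(header: dict) -> bool:
    """True for inputs such as rgba64be: more than 8 bits per channel, or alpha."""
    return header["bit_depth"] > 8 or header["has_alpha"]


def describe_file(path: str) -> dict:
    """Read just enough of a file on disk to parse its PNG header."""
    with open(path, "rb") as f:
        return read_png_header(f.read(33))
```

If `needs_conversion(describe_file("input.png"))` is true, converting the file to 8-bit RGB first, as suggested above, would be one thing to try.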

Felixkruemel commented 4 years ago

@skerit I have tried converting it to jpg, but it still results in this behaviour.

I really don't know what the issue is then.

Is there any logging I could provide?

nihui commented 4 years ago

> I had a similar issue where my rgba64be input PNGs were being converted to transparent PNGs. I lowered the pixel format to rgb8 (8 bits per color), and that fixed it. Converting the input to a JPG with 100% quality might also work.

The alpha channel was not handled properly, but it has been fixed now.

Felixkruemel commented 4 years ago

Just a status update: I removed the driver with DDU, rebooted, installed it again, rebooted, and tried realsr again with the same result. It doesn't seem to have anything to do with the driver.

Does anybody have the same GPU or another one of the RTX series? Maybe the problem is there.

Felixkruemel commented 4 years ago

I've now tested it with a 20x20 image, just to see whether it may be a graphics memory problem. But nope, same behaviour here. Original: 20

After Upscaling: 20out

Felixkruemel commented 4 years ago

Tried the latest artifact from today - still the same

frankenstein91 commented 4 years ago

There seems to be a problem on your system. I loaded and tested the image, and I get a good image here without any problems. Please check the MD5 sum of your original. My download is: 449dff1cdf8ecb58af3598a9b5d78eeb 83051571-7dff5480-a04e-11ea-9d51-0ab8ef640be3 test test1

Felixkruemel commented 4 years ago

For the .zip file (using this Windows build) it's 3d48af76f5275217b0826fbe39f45bfa and for the .exe it's 6f839e4d20bf40dcfbbf26b84e110ffe

What did you test, @frankenstein91? If you are using Linux, that error may not exist there.

Edit: The original picture is 449dff1cdf8ecb58af3598a9b5d78eeb, so that's the same.
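For anyone who wants to repeat the checksum comparison above, here is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `same_file` are made up for illustration, and the file name in the usage note is a placeholder rather than a file from this thread.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 against a digest pasted from elsewhere (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `same_file("20.png", "449dff1cdf8ecb58af3598a9b5d78eeb")` would confirm that a local copy matches the digest quoted in this thread. MD5 is used here only to detect accidental changes to the file, not for security.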

frankenstein91 commented 4 years ago

I tested your 20x20 image on a notebook with Linux and an Intel GPU. I wanted to make sure we're testing the same image.

Felixkruemel commented 4 years ago

So then it seems GH doesn't change the picture and it's the same @frankenstein91

Felixkruemel commented 4 years ago

Update: I tried it on my SurfacePro 7 with an Intel Iris GPU. The result is a black picture.

Something definitely is wrong with it: Input: landschaft-4498 Output: output

Console Log: image

The process was also really fast, taking 2 s or so. Too fast for an iGPU, I would say.

extrimexxx commented 4 years ago

@nihui On my PC (Windows 10 Pro 10.0.18363.836, NVIDIA GTX 1070, driver 446.14):

- realsr-ncnn-vulkan
  - from pre-compiled realsr-...-20200530-windows.zip: full black image
  - from CI realsr-...-artifact-windows-2016 and realsr-...-artifact-windows-2019: full black image
- srmd-ncnn-vulkan from pre-compiled srmd-...-20200530-windows.zip: works great
- waifu2x-ncnn-vulkan from pre-compiled waifu2x-...-20200530-windows.zip: green image

I'm ready to cooperate, help, and test.

Spoiler vulkan info: ``` ========== VULKANINFO ========== Vulkan Instance Version: 1.2.135 Instance Extensions: count = 13 =============================== VK_EXT_debug_report : extension revision 9 VK_EXT_debug_utils : extension revision 1 VK_EXT_swapchain_colorspace : extension revision 4 VK_KHR_device_group_creation : extension revision 1 VK_KHR_external_fence_capabilities : extension revision 1 VK_KHR_external_memory_capabilities : extension revision 1 VK_KHR_external_semaphore_capabilities : extension revision 1 VK_KHR_get_physical_device_properties2 : extension revision 2 VK_KHR_get_surface_capabilities2 : extension revision 1 VK_KHR_surface : extension revision 25 VK_KHR_surface_protected_capabilities : extension revision 1 VK_KHR_win32_surface : extension revision 6 VK_NV_external_memory_capabilities : extension revision 1 Layers: count = 6 ================= VK_LAYER_NV_optimus (NVIDIA Optimus layer) Vulkan version 1.1.126, layer version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 VK_LAYER_OBS_HOOK (Open Broadcaster Software hook) Vulkan version 1.2.131, layer version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 VK_LAYER_ROCKSTAR_GAMES_social_club (Rockstar Games Social Club Layer) Vulkan version 1.0.70, layer version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 VK_LAYER_VALVE_steam_fossilize (Steam Pipeline Caching Layer) Vulkan version 1.1.73, layer version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 VK_LAYER_VALVE_steam_overlay (Steam Overlay Layer) Vulkan version 1.1.73, layer version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 VK_LAYER_fpsmon (FpsMonitor overlay/capture layer) Vulkan version 1.1.70, layer 
version 1: Layer Extensions: count = 0 Devices: count = 1 GPU id = 0 (GeForce GTX 1070) Layer-Device Extensions: count = 0 Presentable Surfaces: ===================== GPU id : 0 (GeForce GTX 1070): Surface type = VK_KHR_win32_surface Formats: count = 2 SurfaceFormat[0]: format = FORMAT_B8G8R8A8_UNORM colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR SurfaceFormat[1]: format = FORMAT_B8G8R8A8_SRGB colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR Present Modes: count = 4 PRESENT_MODE_FIFO_KHR PRESENT_MODE_FIFO_RELAXED_KHR PRESENT_MODE_MAILBOX_KHR PRESENT_MODE_IMMEDIATE_KHR VkSurfaceCapabilitiesKHR: ------------------------- minImageCount = 2 maxImageCount = 8 currentExtent: width = 256 height = 256 minImageExtent: width = 256 height = 256 maxImageExtent: width = 256 height = 256 maxImageArrayLayers = 1 supportedTransforms: count = 1 SURFACE_TRANSFORM_IDENTITY_BIT_KHR currentTransform = SURFACE_TRANSFORM_IDENTITY_BIT_KHR supportedCompositeAlpha: count = 1 COMPOSITE_ALPHA_OPAQUE_BIT_KHR supportedUsageFlags: count = 6 IMAGE_USAGE_TRANSFER_SRC_BIT IMAGE_USAGE_TRANSFER_DST_BIT IMAGE_USAGE_SAMPLED_BIT IMAGE_USAGE_STORAGE_BIT IMAGE_USAGE_COLOR_ATTACHMENT_BIT IMAGE_USAGE_INPUT_ATTACHMENT_BIT VkSurfaceCapabilitiesFullScreenExclusiveEXT: -------------------------------------------- fullScreenExclusiveSupported = false VkSurfaceProtectedCapabilitiesKHR: ---------------------------------- supportsProtected = false Device Groups: ============== Group 0: Properties: physicalDevices: count = 1 GeForce GTX 1070 (ID: 0) subsetAllocation = 1 Present Capabilities: GeForce GTX 1070 (ID: 0): Can present images from the following devices: count = 1 GeForce GTX 1070 (ID: 0) Present modes: count = 1 DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR Device Properties and Extensions: ================================= GPU0: VkPhysicalDeviceProperties: --------------------------- apiVersion = 4198526 (1.1.126) driverVersion = 1870888960 (0x6f838000) vendorID = 0x10de deviceID = 0x1b81 deviceType = 
PHYSICAL_DEVICE_TYPE_DISCRETE_GPU deviceName = GeForce GTX 1070 VkPhysicalDeviceLimits: ----------------------- maxImageDimension1D = 32768 maxImageDimension2D = 32768 maxImageDimension3D = 16384 maxImageDimensionCube = 32768 maxImageArrayLayers = 2048 maxTexelBufferElements = 134217728 maxUniformBufferRange = 65536 maxStorageBufferRange = 4294967295 maxPushConstantsSize = 256 maxMemoryAllocationCount = 4096 maxSamplerAllocationCount = 4000 bufferImageGranularity = 0x00000400 sparseAddressSpaceSize = 0xffffffffffffffff maxBoundDescriptorSets = 32 maxPerStageDescriptorSamplers = 1048576 maxPerStageDescriptorUniformBuffers = 15 maxPerStageDescriptorStorageBuffers = 1048576 maxPerStageDescriptorSampledImages = 1048576 maxPerStageDescriptorStorageImages = 1048576 maxPerStageDescriptorInputAttachments = 1048576 maxPerStageResources = 4294967295 maxDescriptorSetSamplers = 1048576 maxDescriptorSetUniformBuffers = 180 maxDescriptorSetUniformBuffersDynamic = 15 maxDescriptorSetStorageBuffers = 1048576 maxDescriptorSetStorageBuffersDynamic = 16 maxDescriptorSetSampledImages = 1048576 maxDescriptorSetStorageImages = 1048576 maxDescriptorSetInputAttachments = 1048576 maxVertexInputAttributes = 32 maxVertexInputBindings = 32 maxVertexInputAttributeOffset = 2047 maxVertexInputBindingStride = 2048 maxVertexOutputComponents = 128 maxTessellationGenerationLevel = 64 maxTessellationPatchSize = 32 maxTessellationControlPerVertexInputComponents = 128 maxTessellationControlPerVertexOutputComponents = 128 maxTessellationControlPerPatchOutputComponents = 120 maxTessellationControlTotalOutputComponents = 4216 maxTessellationEvaluationInputComponents = 128 maxTessellationEvaluationOutputComponents = 128 maxGeometryShaderInvocations = 32 maxGeometryInputComponents = 128 maxGeometryOutputComponents = 128 maxGeometryOutputVertices = 1024 maxGeometryTotalOutputComponents = 1024 maxFragmentInputComponents = 128 maxFragmentOutputAttachments = 8 maxFragmentDualSrcAttachments = 1 
maxFragmentCombinedOutputResources = 16 maxComputeSharedMemorySize = 49152 maxComputeWorkGroupCount: count = 3 2147483647 65535 65535 maxComputeWorkGroupInvocations = 1536 maxComputeWorkGroupSize: count = 3 1536 1024 64 subPixelPrecisionBits = 8 subTexelPrecisionBits = 8 mipmapPrecisionBits = 8 maxDrawIndexedIndexValue = 4294967295 maxDrawIndirectCount = 4294967295 maxSamplerLodBias = 15 maxSamplerAnisotropy = 16 maxViewports = 16 maxViewportDimensions: count = 2 32768 32768 viewportBoundsRange: count = 2 -65536 65536 viewportSubPixelBits = 8 minMemoryMapAlignment = 64 minTexelBufferOffsetAlignment = 0x00000010 minUniformBufferOffsetAlignment = 0x00000100 minStorageBufferOffsetAlignment = 0x00000010 minTexelOffset = -8 maxTexelOffset = 7 minTexelGatherOffset = -32 maxTexelGatherOffset = 31 minInterpolationOffset = -0.5 maxInterpolationOffset = 0.4375 subPixelInterpolationOffsetBits = 4 maxFramebufferWidth = 32768 maxFramebufferHeight = 32768 maxFramebufferLayers = 2048 framebufferColorSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT framebufferDepthSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT framebufferStencilSampleCounts: count = 5 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT framebufferNoAttachmentsSampleCounts: count = 5 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT maxColorAttachments = 8 sampledImageColorSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageIntegerSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageDepthSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT sampledImageStencilSampleCounts: count = 5 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT 
SAMPLE_COUNT_16_BIT storageImageSampleCounts: count = 4 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT maxSampleMaskWords = 1 timestampComputeAndGraphics = true timestampPeriod = 1 maxClipDistances = 8 maxCullDistances = 8 maxCombinedClipAndCullDistances = 8 discreteQueuePriorities = 2 pointSizeRange: count = 2 1 2047.94 lineWidthRange: count = 2 1 64 pointSizeGranularity = 0.0625 lineWidthGranularity = 0.0625 strictLines = true standardSampleLocations = true optimalBufferCopyOffsetAlignment = 0x00000001 optimalBufferCopyRowPitchAlignment = 0x00000001 nonCoherentAtomSize = 0x00000040 VkPhysicalDeviceSparseProperties: --------------------------------- residencyStandard2DBlockShape = true residencyStandard2DMultisampleBlockShape = true residencyStandard3DBlockShape = true residencyAlignedMipSize = false residencyNonResidentStrict = true VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT: ---------------------------------------------------- advancedBlendMaxColorAttachments = 8 advancedBlendIndependentBlend = false advancedBlendNonPremultipliedSrcColor = true advancedBlendNonPremultipliedDstColor = true advancedBlendCorrelatedOverlap = true advancedBlendAllOperations = true VkPhysicalDeviceConservativeRasterizationPropertiesEXT: ------------------------------------------------------- primitiveOverestimationSize = 0 maxExtraPrimitiveOverestimationSize = 0.75 extraPrimitiveOverestimationSizeGranularity = 0.25 primitiveUnderestimation = false conservativePointAndLineRasterization = true degenerateTrianglesRasterized = true degenerateLinesRasterized = false fullyCoveredFragmentShaderInputVariable = false conservativeRasterizationPostDepthCoverage = true VkPhysicalDeviceDepthStencilResolvePropertiesKHR: ------------------------------------------------- supportedDepthResolveModes: count = 4 RESOLVE_MODE_SAMPLE_ZERO_BIT RESOLVE_MODE_AVERAGE_BIT RESOLVE_MODE_MIN_BIT RESOLVE_MODE_MAX_BIT supportedStencilResolveModes: count = 3 
RESOLVE_MODE_SAMPLE_ZERO_BIT RESOLVE_MODE_MIN_BIT RESOLVE_MODE_MAX_BIT independentResolveNone = true independentResolve = true VkPhysicalDeviceDescriptorIndexingPropertiesEXT: ------------------------------------------------ maxUpdateAfterBindDescriptorsInAllPools = 4294967295 shaderUniformBufferArrayNonUniformIndexingNative = true shaderSampledImageArrayNonUniformIndexingNative = true shaderStorageBufferArrayNonUniformIndexingNative = true shaderStorageImageArrayNonUniformIndexingNative = true shaderInputAttachmentArrayNonUniformIndexingNative = true robustBufferAccessUpdateAfterBind = true quadDivergentImplicitLod = true maxPerStageDescriptorUpdateAfterBindSamplers = 1048576 maxPerStageDescriptorUpdateAfterBindUniformBuffers = 15 maxPerStageDescriptorUpdateAfterBindStorageBuffers = 1048576 maxPerStageDescriptorUpdateAfterBindSampledImages = 1048576 maxPerStageDescriptorUpdateAfterBindStorageImages = 1048576 maxPerStageDescriptorUpdateAfterBindInputAttachments = 1048576 maxPerStageUpdateAfterBindResources = 4294967295 maxDescriptorSetUpdateAfterBindSamplers = 1048576 maxDescriptorSetUpdateAfterBindUniformBuffers = 180 maxDescriptorSetUpdateAfterBindUniformBuffersDynamic = 15 maxDescriptorSetUpdateAfterBindStorageBuffers = 1048576 maxDescriptorSetUpdateAfterBindStorageBuffersDynamic = 16 maxDescriptorSetUpdateAfterBindSampledImages = 1048576 maxDescriptorSetUpdateAfterBindStorageImages = 1048576 maxDescriptorSetUpdateAfterBindInputAttachments = 1048576 VkPhysicalDeviceDiscardRectanglePropertiesEXT: ---------------------------------------------- maxDiscardRectangles = 8 VkPhysicalDeviceDriverPropertiesKHR: ------------------------------------ driverID = DRIVER_ID_NVIDIA_PROPRIETARY driverName = NVIDIA driverInfo = 446.14 conformanceVersion = 1.1.6.0 VkPhysicalDeviceExternalMemoryHostPropertiesEXT: ------------------------------------------------ minImportedHostPointerAlignment = 0x00001000 VkPhysicalDeviceFloatControlsPropertiesKHR: 
------------------------------------------- denormBehaviorIndependence = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL roundingModeIndependence = SHADER_FLOAT_CONTROLS_INDEPENDENCE_ALL shaderSignedZeroInfNanPreserveFloat16 = true shaderSignedZeroInfNanPreserveFloat32 = true shaderSignedZeroInfNanPreserveFloat64 = true shaderDenormPreserveFloat16 = true shaderDenormPreserveFloat32 = false shaderDenormPreserveFloat64 = false shaderDenormFlushToZeroFloat16 = false shaderDenormFlushToZeroFloat32 = false shaderDenormFlushToZeroFloat64 = false shaderRoundingModeRTEFloat16 = true shaderRoundingModeRTEFloat32 = true shaderRoundingModeRTEFloat64 = true shaderRoundingModeRTZFloat16 = false shaderRoundingModeRTZFloat32 = true shaderRoundingModeRTZFloat64 = true VkPhysicalDeviceIDProperties: ----------------------------- deviceUUID = 9b446d86-8b3d-387a-7b8d-af06f8d448a5 driverUUID = e666b447-76fc-30e8-f9db-6e259f87e651 deviceLUID = 04d50000-00000000 deviceNodeMask = 1 deviceLUIDValid = true VkPhysicalDeviceInlineUniformBlockPropertiesEXT: ------------------------------------------------ maxInlineUniformBlockSize = 256 maxPerStageDescriptorInlineUniformBlocks = 32 maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks = 32 maxDescriptorSetInlineUniformBlocks = 32 maxDescriptorSetUpdateAfterBindInlineUniformBlocks = 32 VkPhysicalDeviceLineRasterizationPropertiesEXT: ----------------------------------------------- lineSubPixelPrecisionBits = 8 VkPhysicalDeviceMaintenance3Properties: --------------------------------------- maxPerSetDescriptors = 4294967295 maxMemoryAllocationSize = 0xffe00000 VkPhysicalDeviceMultiviewProperties: ------------------------------------ maxMultiviewViewCount = 32 maxMultiviewInstanceIndex = 134217727 VkPhysicalDevicePCIBusInfoPropertiesEXT: ---------------------------------------- pciDomain = 0 pciBus = 1 pciDevice = 0 pciFunction = 0 VkPhysicalDevicePointClippingProperties: ---------------------------------------- pointClippingBehavior = 
POINT_CLIPPING_BEHAVIOR_USER_CLIP_PLANES_ONLY VkPhysicalDeviceProtectedMemoryProperties: ------------------------------------------ protectedNoFault = false VkPhysicalDevicePushDescriptorPropertiesKHR: -------------------------------------------- maxPushDescriptors = 32 VkPhysicalDeviceSampleLocationsPropertiesEXT: --------------------------------------------- sampleLocationSampleCounts: count = 5 SAMPLE_COUNT_1_BIT SAMPLE_COUNT_2_BIT SAMPLE_COUNT_4_BIT SAMPLE_COUNT_8_BIT SAMPLE_COUNT_16_BIT maxSampleLocationGridSize: width = 1 height = 1 sampleLocationCoordinateRange: count = 2 0 0.9375 sampleLocationSubPixelBits = 4 variableSampleLocations = true VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT: ------------------------------------------------- filterMinmaxSingleComponentFormats = true filterMinmaxImageComponentMapping = true VkPhysicalDeviceSubgroupProperties: ----------------------------------- subgroupSize = 32 supportedStages: count = 14 SHADER_STAGE_VERTEX_BIT SHADER_STAGE_TESSELLATION_CONTROL_BIT SHADER_STAGE_TESSELLATION_EVALUATION_BIT SHADER_STAGE_GEOMETRY_BIT SHADER_STAGE_FRAGMENT_BIT SHADER_STAGE_COMPUTE_BIT SHADER_STAGE_ALL_GRAPHICS SHADER_STAGE_ALL SHADER_STAGE_RAYGEN_BIT_KHR SHADER_STAGE_ANY_HIT_BIT_KHR SHADER_STAGE_CLOSEST_HIT_BIT_KHR SHADER_STAGE_MISS_BIT_KHR SHADER_STAGE_INTERSECTION_BIT_KHR SHADER_STAGE_CALLABLE_BIT_KHR supportedOperations: count = 9 SUBGROUP_FEATURE_BASIC_BIT SUBGROUP_FEATURE_VOTE_BIT SUBGROUP_FEATURE_ARITHMETIC_BIT SUBGROUP_FEATURE_BALLOT_BIT SUBGROUP_FEATURE_SHUFFLE_BIT SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT SUBGROUP_FEATURE_CLUSTERED_BIT SUBGROUP_FEATURE_QUAD_BIT SUBGROUP_FEATURE_PARTITIONED_BIT_NV quadOperationsInAllStages = true VkPhysicalDeviceSubgroupSizeControlPropertiesEXT: ------------------------------------------------- minSubgroupSize = 32 maxSubgroupSize = 32 maxComputeWorkgroupSubgroups = 3145728 requiredSubgroupSizeStages: count = 14 SHADER_STAGE_VERTEX_BIT SHADER_STAGE_TESSELLATION_CONTROL_BIT 
SHADER_STAGE_TESSELLATION_EVALUATION_BIT SHADER_STAGE_GEOMETRY_BIT SHADER_STAGE_FRAGMENT_BIT SHADER_STAGE_COMPUTE_BIT SHADER_STAGE_ALL_GRAPHICS SHADER_STAGE_ALL SHADER_STAGE_RAYGEN_BIT_KHR SHADER_STAGE_ANY_HIT_BIT_KHR SHADER_STAGE_CLOSEST_HIT_BIT_KHR SHADER_STAGE_MISS_BIT_KHR SHADER_STAGE_INTERSECTION_BIT_KHR SHADER_STAGE_CALLABLE_BIT_KHR VkPhysicalDeviceTexelBufferAlignmentPropertiesEXT: -------------------------------------------------- storageTexelBufferOffsetAlignmentBytes = 0x00000010 storageTexelBufferOffsetSingleTexelAlignment = true uniformTexelBufferOffsetAlignmentBytes = 0x00000010 uniformTexelBufferOffsetSingleTexelAlignment = true VkPhysicalDeviceTimelineSemaphorePropertiesKHR: ----------------------------------------------- maxTimelineSemaphoreValueDifference = 2147483647 VkPhysicalDeviceTransformFeedbackPropertiesEXT: ----------------------------------------------- maxTransformFeedbackStreams = 4 maxTransformFeedbackBuffers = 4 maxTransformFeedbackBufferSize = 0xffffffffffffffff maxTransformFeedbackStreamDataSize = 2048 maxTransformFeedbackBufferDataSize = 512 maxTransformFeedbackBufferDataStride = 2048 transformFeedbackQueries = true transformFeedbackStreamsLinesTriangles = false transformFeedbackRasterizationStreamSelect = true transformFeedbackDraw = true VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT: ---------------------------------------------------- maxVertexAttribDivisor = 4294967295 Device Extensions: count = 99 ----------------------------- VK_EXT_blend_operation_advanced : extension revision 2 VK_EXT_buffer_device_address : extension revision 2 VK_EXT_calibrated_timestamps : extension revision 1 VK_EXT_conditional_rendering : extension revision 2 VK_EXT_conservative_rasterization : extension revision 1 VK_EXT_depth_clip_enable : extension revision 1 VK_EXT_depth_range_unrestricted : extension revision 1 VK_EXT_descriptor_indexing : extension revision 2 VK_EXT_discard_rectangles : extension revision 1 VK_EXT_external_memory_host : 
extension revision 1 VK_EXT_fragment_shader_interlock : extension revision 1 VK_EXT_full_screen_exclusive : extension revision 4 VK_EXT_hdr_metadata : extension revision 2 VK_EXT_host_query_reset : extension revision 1 VK_EXT_index_type_uint8 : extension revision 1 VK_EXT_inline_uniform_block : extension revision 1 VK_EXT_line_rasterization : extension revision 1 VK_EXT_memory_budget : extension revision 1 VK_EXT_memory_priority : extension revision 1 VK_EXT_pci_bus_info : extension revision 2 VK_EXT_pipeline_creation_feedback : extension revision 1 VK_EXT_post_depth_coverage : extension revision 1 VK_EXT_sample_locations : extension revision 1 VK_EXT_sampler_filter_minmax : extension revision 2 VK_EXT_scalar_block_layout : extension revision 1 VK_EXT_separate_stencil_usage : extension revision 1 VK_EXT_shader_demote_to_helper_invocation : extension revision 1 VK_EXT_shader_subgroup_ballot : extension revision 1 VK_EXT_shader_subgroup_vote : extension revision 1 VK_EXT_shader_viewport_index_layer : extension revision 1 VK_EXT_subgroup_size_control : extension revision 2 VK_EXT_texel_buffer_alignment : extension revision 1 VK_EXT_transform_feedback : extension revision 1 VK_EXT_vertex_attribute_divisor : extension revision 3 VK_EXT_ycbcr_image_arrays : extension revision 1 VK_KHR_16bit_storage : extension revision 1 VK_KHR_8bit_storage : extension revision 1 VK_KHR_bind_memory2 : extension revision 1 VK_KHR_create_renderpass2 : extension revision 1 VK_KHR_dedicated_allocation : extension revision 3 VK_KHR_depth_stencil_resolve : extension revision 1 VK_KHR_descriptor_update_template : extension revision 1 VK_KHR_device_group : extension revision 4 VK_KHR_draw_indirect_count : extension revision 1 VK_KHR_driver_properties : extension revision 1 VK_KHR_external_fence : extension revision 1 VK_KHR_external_fence_win32 : extension revision 1 VK_KHR_external_memory : extension revision 1 VK_KHR_external_memory_win32 : extension revision 1 VK_KHR_external_semaphore : 
extension revision 1 VK_KHR_external_semaphore_win32 : extension revision 1 VK_KHR_get_memory_requirements2 : extension revision 1 VK_KHR_image_format_list : extension revision 1 VK_KHR_imageless_framebuffer : extension revision 1 VK_KHR_maintenance1 : extension revision 2 VK_KHR_maintenance2 : extension revision 1 VK_KHR_maintenance3 : extension revision 1 VK_KHR_multiview : extension revision 1 VK_KHR_pipeline_executable_properties : extension revision 1 VK_KHR_push_descriptor : extension revision 2 VK_KHR_relaxed_block_layout : extension revision 1 VK_KHR_sampler_mirror_clamp_to_edge : extension revision 3 VK_KHR_sampler_ycbcr_conversion : extension revision 14 VK_KHR_shader_atomic_int64 : extension revision 1 VK_KHR_shader_clock : extension revision 1 VK_KHR_shader_draw_parameters : extension revision 1 VK_KHR_shader_float16_int8 : extension revision 1 VK_KHR_shader_float_controls : extension revision 4 VK_KHR_shader_subgroup_extended_types : extension revision 1 VK_KHR_spirv_1_4 : extension revision 1 VK_KHR_storage_buffer_storage_class : extension revision 1 VK_KHR_swapchain : extension revision 70 VK_KHR_swapchain_mutable_format : extension revision 1 VK_KHR_timeline_semaphore : extension revision 2 VK_KHR_uniform_buffer_standard_layout : extension revision 1 VK_KHR_variable_pointers : extension revision 1 VK_KHR_vulkan_memory_model : extension revision 3 VK_KHR_win32_keyed_mutex : extension revision 1 VK_NVX_device_generated_commands : extension revision 3 VK_NVX_multiview_per_view_attributes : extension revision 1 VK_NV_clip_space_w_scaling : extension revision 1 VK_NV_coverage_reduction_mode : extension revision 1 VK_NV_dedicated_allocation : extension revision 1 VK_NV_dedicated_allocation_image_aliasing : extension revision 1 VK_NV_device_diagnostic_checkpoints : extension revision 2 VK_NV_device_diagnostics_config : extension revision 1 VK_NV_external_memory : extension revision 1 VK_NV_external_memory_win32 : extension revision 1 VK_NV_fill_rectangle : 
extension revision 1 VK_NV_fragment_coverage_to_color : extension revision 1 VK_NV_framebuffer_mixed_samples : extension revision 1 VK_NV_geometry_shader_passthrough : extension revision 1 VK_NV_ray_tracing : extension revision 3 VK_NV_sample_mask_override_coverage : extension revision 1 VK_NV_shader_sm_builtins : extension revision 1 VK_NV_shader_subgroup_partitioned : extension revision 1 VK_NV_viewport_array2 : extension revision 1 VK_NV_viewport_swizzle : extension revision 1 VK_NV_win32_keyed_mutex : extension revision 2 VkQueueFamilyProperties: ======================== queueProperties[0]: ------------------- minImageTransferGranularity = (1,1,1) queueCount = 16 queueFlags = QUEUE_GRAPHICS | QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support = false queueProperties[1]: ------------------- minImageTransferGranularity = (1,1,1) queueCount = 2 queueFlags = QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support = false queueProperties[2]: ------------------- minImageTransferGranularity = (1,1,1) queueCount = 8 queueFlags = QUEUE_COMPUTE | QUEUE_TRANSFER | QUEUE_SPARSE_BINDING timestampValidBits = 64 present support = false VkPhysicalDeviceMemoryProperties: ================================= memoryHeaps: count = 2 memoryHeaps[0]: size = 8480882688 (0x1f9800000) (7.90 GiB) budget = 7208750284 (0x1adaccccc) (6.71 GiB) usage = 0 (0x00000000) (0.00 B) flags: count = 1 MEMORY_HEAP_DEVICE_LOCAL_BIT memoryHeaps[1]: size = 8558645248 (0x1fe229000) (7.97 GiB) budget = 7702782566 (0x1cb1f2266) (7.17 GiB) usage = 0 (0x00000000) (0.00 B) flags: count = 0 None memoryTypes: count = 11 memoryTypes[0]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[1]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: color images IMAGE_TILING_LINEAR: None memoryTypes[2]: heapIndex = 1 propertyFlags = 0x0000: count 
= 0 None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D16_UNORM IMAGE_TILING_LINEAR: None memoryTypes[3]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: FORMAT_X8_D24_UNORM_PACK32, FORMAT_D24_UNORM_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[4]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT IMAGE_TILING_LINEAR: None memoryTypes[5]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: FORMAT_D32_SFLOAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[6]: heapIndex = 1 propertyFlags = 0x0000: count = 0 None usable for: IMAGE_TILING_OPTIMAL: FORMAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[7]: heapIndex = 0 propertyFlags = 0x0001: count = 1 MEMORY_PROPERTY_DEVICE_LOCAL_BIT usable for: IMAGE_TILING_OPTIMAL: color images, FORMAT_D16_UNORM, FORMAT_X8_D24_UNORM_PACK32, FORMAT_D32_SFLOAT, FORMAT_S8_UINT, FORMAT_D24_UNORM_S8_UINT, FORMAT_D32_SFLOAT_S8_UINT IMAGE_TILING_LINEAR: None memoryTypes[8]: heapIndex = 0 propertyFlags = 0x0001: count = 1 MEMORY_PROPERTY_DEVICE_LOCAL_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[9]: heapIndex = 1 propertyFlags = 0x0006: count = 2 MEMORY_PROPERTY_HOST_VISIBLE_BIT MEMORY_PROPERTY_HOST_COHERENT_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None memoryTypes[10]: heapIndex = 1 propertyFlags = 0x000e: count = 3 MEMORY_PROPERTY_HOST_VISIBLE_BIT MEMORY_PROPERTY_HOST_COHERENT_BIT MEMORY_PROPERTY_HOST_CACHED_BIT usable for: IMAGE_TILING_OPTIMAL: None IMAGE_TILING_LINEAR: None VkPhysicalDeviceFeatures: ========================= robustBufferAccess = true fullDrawIndexUint32 = true imageCubeArray = true independentBlend = true geometryShader = true tessellationShader = true sampleRateShading = true dualSrcBlend = true logicOp = true multiDrawIndirect = true drawIndirectFirstInstance = true depthClamp = true depthBiasClamp = true fillModeNonSolid = true depthBounds = 
true wideLines = true largePoints = true alphaToOne = true multiViewport = true samplerAnisotropy = true textureCompressionETC2 = false textureCompressionASTC_LDR = false textureCompressionBC = true occlusionQueryPrecise = true pipelineStatisticsQuery = true vertexPipelineStoresAndAtomics = true fragmentStoresAndAtomics = true shaderTessellationAndGeometryPointSize = true shaderImageGatherExtended = true shaderStorageImageExtendedFormats = true shaderStorageImageMultisample = true shaderStorageImageReadWithoutFormat = true shaderStorageImageWriteWithoutFormat = true shaderUniformBufferArrayDynamicIndexing = true shaderSampledImageArrayDynamicIndexing = true shaderStorageBufferArrayDynamicIndexing = true shaderStorageImageArrayDynamicIndexing = true shaderClipDistance = true shaderCullDistance = true shaderFloat64 = true shaderInt64 = true shaderInt16 = true shaderResourceResidency = true shaderResourceMinLod = true sparseBinding = true sparseResidencyBuffer = true sparseResidencyImage2D = true sparseResidencyImage3D = true sparseResidency2Samples = true sparseResidency4Samples = true sparseResidency8Samples = true sparseResidency16Samples = true sparseResidencyAliased = true variableMultisampleRate = true inheritedQueries = true VkPhysicalDevice16BitStorageFeatures: ------------------------------------- storageBuffer16BitAccess = true uniformAndStorageBuffer16BitAccess = true storagePushConstant16 = true storageInputOutput16 = false VkPhysicalDevice8BitStorageFeaturesKHR: --------------------------------------- storageBuffer8BitAccess = true uniformAndStorageBuffer8BitAccess = true storagePushConstant8 = true VkPhysicalDeviceBlendOperationAdvancedFeaturesEXT: -------------------------------------------------- advancedBlendCoherentOperations = true VkPhysicalDeviceBufferDeviceAddressFeaturesEXT: ----------------------------------------------- bufferDeviceAddress = true bufferDeviceAddressCaptureReplay = true bufferDeviceAddressMultiDevice = true 
VkPhysicalDeviceConditionalRenderingFeaturesEXT: ------------------------------------------------ conditionalRendering = true inheritedConditionalRendering = true VkPhysicalDeviceDepthClipEnableFeaturesEXT: ------------------------------------------- depthClipEnable = true VkPhysicalDeviceDescriptorIndexingFeaturesEXT: ---------------------------------------------- shaderInputAttachmentArrayDynamicIndexing = true shaderUniformTexelBufferArrayDynamicIndexing = true shaderStorageTexelBufferArrayDynamicIndexing = true shaderUniformBufferArrayNonUniformIndexing = true shaderSampledImageArrayNonUniformIndexing = true shaderStorageBufferArrayNonUniformIndexing = true shaderStorageImageArrayNonUniformIndexing = true shaderInputAttachmentArrayNonUniformIndexing = true shaderUniformTexelBufferArrayNonUniformIndexing = true shaderStorageTexelBufferArrayNonUniformIndexing = true descriptorBindingUniformBufferUpdateAfterBind = false descriptorBindingSampledImageUpdateAfterBind = true descriptorBindingStorageImageUpdateAfterBind = true descriptorBindingStorageBufferUpdateAfterBind = true descriptorBindingUniformTexelBufferUpdateAfterBind = true descriptorBindingStorageTexelBufferUpdateAfterBind = true descriptorBindingUpdateUnusedWhilePending = true descriptorBindingPartiallyBound = true descriptorBindingVariableDescriptorCount = true runtimeDescriptorArray = true VkPhysicalDeviceFragmentShaderInterlockFeaturesEXT: --------------------------------------------------- fragmentShaderSampleInterlock = true fragmentShaderPixelInterlock = true fragmentShaderShadingRateInterlock = true VkPhysicalDeviceHostQueryResetFeaturesEXT: ------------------------------------------ hostQueryReset = true VkPhysicalDeviceImagelessFramebufferFeaturesKHR: ------------------------------------------------ imagelessFramebuffer = true VkPhysicalDeviceIndexTypeUint8FeaturesEXT: ------------------------------------------ indexTypeUint8 = true VkPhysicalDeviceInlineUniformBlockFeaturesEXT: 
---------------------------------------------- inlineUniformBlock = true descriptorBindingInlineUniformBlockUpdateAfterBind = true VkPhysicalDeviceLineRasterizationFeaturesEXT: --------------------------------------------- rectangularLines = true bresenhamLines = true smoothLines = true stippledRectangularLines = true stippledBresenhamLines = true stippledSmoothLines = true VkPhysicalDeviceMemoryPriorityFeaturesEXT: ------------------------------------------ memoryPriority = true VkPhysicalDeviceMultiviewFeatures: ---------------------------------- multiview = true multiviewGeometryShader = true multiviewTessellationShader = true VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR: -------------------------------------------------------- pipelineExecutableInfo = true VkPhysicalDeviceProtectedMemoryFeatures: ---------------------------------------- protectedMemory = false VkPhysicalDeviceSamplerYcbcrConversionFeatures: ----------------------------------------------- samplerYcbcrConversion = true VkPhysicalDeviceScalarBlockLayoutFeaturesEXT: --------------------------------------------- scalarBlockLayout = true VkPhysicalDeviceShaderAtomicInt64FeaturesKHR: --------------------------------------------- shaderBufferInt64Atomics = true shaderSharedInt64Atomics = true VkPhysicalDeviceShaderClockFeaturesKHR: --------------------------------------- shaderSubgroupClock = true shaderDeviceClock = true VkPhysicalDeviceShaderDemoteToHelperInvocationFeaturesEXT: ---------------------------------------------------------- shaderDemoteToHelperInvocation = true VkPhysicalDeviceShaderDrawParametersFeatures: --------------------------------------------- shaderDrawParameters = true VkPhysicalDeviceFloat16Int8FeaturesKHR: --------------------------------------- shaderFloat16 = false shaderInt8 = true VkPhysicalDeviceShaderSubgroupExtendedTypesFeaturesKHR: ------------------------------------------------------- shaderSubgroupExtendedTypes = true 
VkPhysicalDeviceSubgroupSizeControlFeaturesEXT: ----------------------------------------------- subgroupSizeControl = true computeFullSubgroups = true VkPhysicalDeviceTexelBufferAlignmentFeaturesEXT: ------------------------------------------------ texelBufferAlignment = true VkPhysicalDeviceTimelineSemaphoreFeaturesKHR: --------------------------------------------- timelineSemaphore = true VkPhysicalDeviceTransformFeedbackFeaturesEXT: --------------------------------------------- transformFeedback = true geometryStreams = true VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR: ------------------------------------------------------- uniformBufferStandardLayout = true VkPhysicalDeviceVariablePointersFeatures: ----------------------------------------- variablePointersStorageBuffer = true variablePointers = true VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT: -------------------------------------------------- vertexAttributeInstanceRateDivisor = true vertexAttributeInstanceRateZeroDivisor = true VkPhysicalDeviceVulkanMemoryModelFeaturesKHR: --------------------------------------------- vulkanMemoryModel = true vulkanMemoryModelDeviceScope = true vulkanMemoryModelAvailabilityVisibilityChains = true VkPhysicalDeviceYcbcrImageArraysFeaturesEXT: -------------------------------------------- ycbcrImageArrays = true ```
tofuhoard commented 4 years ago

Using Win10 Pro + NVIDIA GTX 970 (latest driver, v. 446.14), I am also getting fully black images with tiny file sizes. I have tried adjusting the settings and upscaling different file types, but the result is always the same. For comparison, srmd-ncnn-vulkan works fine for me, but waifu2x-ncnn-vulkan unfortunately does not (green/blue color issues).

Console example (realsr-ncnn-vulkan): [screenshot]

lextra2 commented 4 years ago

@nihui I get the same garbage noise output as @Felixkruemel. Tried jpg and png as input. No difference.

Felixkruemel commented 4 years ago

@lextra2 Which GPU do you have and which OS?

@nihui Can we help you identify this issue?

nihui commented 4 years ago

https://github.com/nihui/realsr-ncnn-vulkan/actions/runs/122496515

Built with the latest ncnn; includes two critical fixes for fp16 storage on integrated GPUs.

extrimexxx commented 4 years ago

@nihui NVIDIA GTX 1070:

realsr-ncnn-vulkan-artifact-windows-2016: fully black images
realsr-ncnn-vulkan-artifact-windows-2019: fully black images

Felixkruemel commented 4 years ago

@nihui For me still the same noisy image as in the first post :/

lextra2 commented 4 years ago

@nihui Can't test today. Will test tomorrow.

@Felixkruemel My OS is Windows 10 1909 and my GPU is a Radeon RX 5700.

tofuhoard commented 4 years ago

Using an Nvidia GTX 970, having tried different tile sizes and thread configs:

skaldamramra commented 4 years ago

For me, using an Nvidia GTX 1080 8GB:

realsr-ncnn-vulkan-artifact-windows-2016: black image
realsr-ncnn-vulkan-artifact-windows-2019: black image

Felixkruemel commented 4 years ago

@nihui published a fix to ncnn today: "fix memorydata test on nvidia gpu". Maybe that solves the problem on the NVIDIA side. It's unlikely, but possible.

Felixkruemel commented 4 years ago

@nihui I just installed Kubuntu on my rig. It works there with no issues, so something seems to be off with the Windows port. Here's a before and an after. Let's put it this way: the results are shockingly good, and it would be great if the Windows port worked too.

Before: [image]

After: [image]

On Windows I still get those black or noisy images.

deadman0713 commented 4 years ago

If it helps any: No problems on AMD RX5700XT w/ThreadRipper 1950x. Win10 19640.

realsr-ncnn-vulkan.exe -s 4 -v -m models-DF2K -j 2:3:3

One oddity I did notice: increasing the tile size (300) gives a smaller output file size compared to Auto.
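
A command like the one above can also be scripted when testing many images. The following is a minimal Python sketch and not part of the project: `build_realsr_command` and `output_looks_empty` are hypothetical helper names, the `-i`/`-o` input and output options are assumed from the tool's usual usage (they do not appear in the command quoted above), and the 1 KiB threshold for a "tiny" output file is an arbitrary assumption.

```python
from pathlib import Path


def build_realsr_command(exe, input_path, output_path, scale=4,
                         model_dir="models-DF2K", threads="2:3:3",
                         verbose=True):
    """Build an argument list mirroring the command quoted above.

    -i and -o are assumed input/output options; the other flags
    (-s, -m, -j, -v) are taken from the quoted command.
    """
    cmd = [
        str(exe),
        "-i", str(input_path),
        "-o", str(output_path),
        "-s", str(scale),
        "-m", str(model_dir),
        "-j", threads,
    ]
    if verbose:
        cmd.append("-v")
    return cmd


def output_looks_empty(output_path, min_bytes=1024):
    """Flag a missing or suspiciously small output file.

    min_bytes is an arbitrary threshold chosen for illustration,
    loosely matching the "tiny file sizes" reported in this thread.
    """
    path = Path(output_path)
    return (not path.exists()) or path.stat().st_size < min_bytes
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; checking the output size afterwards is only a rough heuristic for the black-image symptom, not a real image comparison.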

tofuhoard commented 4 years ago

Would anyone be willing to upload a Windows build of realsr-ncnn-vulkan that includes @nihui's newly-updated ncnn? I am sadly unable to build it myself and would really like to see if this change fixes the black image problem.

Xyz00777 commented 4 years ago

The problem is the display language. I changed it from German to English and now it's working. It now works on two of my systems.

extrimexxx commented 4 years ago

@nihui @Xyz00777 OMG! For me, though, the bug is not tied to "Display language": changing that setting had no effect. When I changed "Regional Format" to English, both realsr-ncnn-vulkan and waifu2x-ncnn-vulkan worked without problems.

Xyz00777 commented 4 years ago

Did you restart your machine or only log in again? If you only logged in again, restart it, because the change is only applied correctly after a full system restart.

extrimexxx commented 4 years ago

Without a restart or relogin; I just reopened cmd.

Xyz00777 commented 4 years ago

Then it can't work. You have to restart the system completely; only after that is everything changed to English.

lextra2 commented 4 years ago

I can confirm. Changing the following setting fixes both the realsr noise output and the waifu2x missing color channel output.

[screenshot]

Actually insane to figure this out.

extrimexxx commented 4 years ago

@Xyz00777 My Display language is Russian and my Regional Format is Russian. After rebooting the PC, I started recording a video and ran realsr-ncnn-vulkan (black image). Then I changed Regional Format to English (United States) and ran realsr-ncnn-vulkan again (works fine, without rebooting or relogging). Then I changed Regional Format back to Russian and ran realsr-ncnn-vulkan again (black image, again without rebooting or relogging).

I uploaded my video to YouTube: https://www.youtube.com/watch?v=uuNw4tuUbIc
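
The thread does not state why the regional format matters. One plausible mechanism, offered here only as an assumption, is locale-dependent number parsing: C functions such as `strtod` and `sscanf("%f")` use the active locale's decimal separator, and Russian and German formats use a comma. A value written as `0.25` in a model or parameter file would then be read as `0`. The Python sketch below only simulates that behavior; `parse_leading_float` and `parse_float_fixed` are hypothetical names, the parser is simplified (no exponents, no leading separator), and none of this is ncnn code.

```python
import re


def parse_leading_float(text, decimal_point="."):
    """Parse the longest numeric prefix of text.

    This imitates how C's strtod stops at the first character it does
    not accept. decimal_point stands in for the separator of the
    active numeric locale ("." for English, "," for Russian or German).
    """
    pattern = r"[+-]?\d+(?:" + re.escape(decimal_point) + r"\d*)?"
    match = re.match(pattern, text.strip())
    if match is None:
        return 0.0
    return float(match.group(0).replace(decimal_point, "."))


def parse_float_fixed(text):
    """Locale-independent parse: '.' is always the decimal separator."""
    return float(text)


# Same input text, different simulated regional formats.
english = parse_leading_float("0.25", decimal_point=".")
russian = parse_leading_float("0.25", decimal_point=",")
print(english, russian, parse_float_fixed("0.25"))
```

Under the simulated comma locale the fractional part is silently dropped, which would be enough to turn sensible weights or scale factors into zeros. That would be consistent with the black or noisy output reported above and with it disappearing once the Regional Format is English, but it remains a guess until confirmed against the actual fix.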

lextra2 commented 4 years ago

You don't need to change the regional format or reboot. See https://github.com/nihui/waifu2x-ncnn-vulkan/issues/59#issuecomment-639564174 for the correct solution.

nihui commented 4 years ago

Reproduced. I will work out a fix in ncnn and push a new release. Thanks to all.

extrimexxx commented 4 years ago

@nihui Works well for me with the latest CI artifacts. Thank you!

nihui commented 4 years ago

https://github.com/nihui/realsr-ncnn-vulkan/releases/tag/20200606

Felixkruemel commented 4 years ago

I think we can close this issue now. The last release should have fixed it. Does anybody still have this issue?

lextra2 commented 4 years ago

Yeah. Issues can be closed. It works now.