This seems to be fixed already, maybe due to some change in the ONNX exporter on the PyTorch side or an update to the hosted onnxruntime.
I have tried with other images and still see problems.
About AlphaBorderPadding: AlphaBorderPadding pads the RGB values of fully transparent areas (alpha channel = 0). Even fully transparent pixels have RGB values, but since they are never displayed to users, those values are unstable: fixed fill values, color-palette leftovers, signatures, uninitialized memory, and so on. So convolving an image without AlphaBorderPadding may cause artifacts along the alpha border (a minimal sketch of the idea follows the examples below).
Examples (images): input, rgb, padded
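For illustration only, here is a minimal JavaScript sketch of the idea (the actual nunif implementation runs as an ONNX model, not this code): repeatedly bleed the RGB values of visible pixels into the fully transparent region.

```js
// Hypothetical helper: dilate RGB values from visible pixels (alpha > 0)
// into fully transparent pixels (alpha == 0), so convolution never sees
// the garbage RGB hidden under alpha = 0. `rgba` is a Uint8ClampedArray.
function alphaBorderPad(rgba, width, height, passes = 8) {
  const filled = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    filled[i] = rgba[i * 4 + 3] > 0 ? 1 : 0;
  }
  for (let pass = 0; pass < passes; pass++) {
    const nextFilled = filled.slice();
    for (let y = 0; y < height; y++) {
      for (let x = 0; x < width; x++) {
        const i = y * width + x;
        if (filled[i]) continue;
        // average the RGB of already-filled 4-neighbors
        let r = 0, g = 0, b = 0, n = 0;
        for (const [dx, dy] of [[-1, 0], [1, 0], [0, -1], [0, 1]]) {
          const nx = x + dx, ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const j = ny * width + nx;
          if (!filled[j]) continue;
          r += rgba[j * 4]; g += rgba[j * 4 + 1]; b += rgba[j * 4 + 2];
          n++;
        }
        if (n > 0) {
          rgba[i * 4] = r / n; rgba[i * 4 + 1] = g / n; rgba[i * 4 + 2] = b / n;
          nextFilled[i] = 1;
        }
      }
    }
    filled.set(nextFilled);
  }
  return rgba;
}
```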
Copied from https://github.com/nagadomi/nunif/issues/36#issuecomment-1537748115
And there still seems to be a problem. The unlimited:waifu2x output for the Blender logo is as follows; there is a strange, strong red line (ignore the reflection padding).

PyTorch version output:
Change script.js as follows to show the result of AlphaBorderPadding:
```diff
 // create temporary canvas for tile input
-image_data = this.to_image_data(x.data, alpha3.data, x.dims[3], x.dims[2]);
+image_data = this.to_image_data(x.data, null, x.dims[3], x.dims[2]);
 var input_canvas = document.createElement("canvas");
 input_canvas.width = w;
 input_canvas.height = h;
 var input_ctx = input_canvas.getContext("2d", {willReadFrequently: true});
 input_ctx.putImageData(image_data, 0, 0);
+document.body.appendChild(input_canvas);
 var all_blocks = p.h_blocks * p.w_blocks;
 // tiled rendering
```
Additionally, using the Python version of onnxruntime, it renders correctly. I changed the file name and set offset=8 in https://github.com/nagadomi/nunif/blob/d5ede7b19d57528c4e01c78ca8459a181edcd825/nunif/models/onnx_helper_models.py#L310-L347 and ran it.
> Additionally, using the Python version of onnxruntime, it renders correctly. I changed the file name and set offset=8 and ran it.
What if you use the CPU execution provider from Python?
I tried CPUExecutionProvider and it works correctly.
This problem can have two major causes. The hard part is that I don't know what is happening in the Python → TorchScript → ONNX process.
The easiest solution is to reimplement AlphaBorderPadding in pure JavaScript.
> The easiest solution is to reimplement AlphaBorderPadding in pure JavaScript.
I actually just finished doing this, and it looks like the problem is something like Canvas using premultiplied alpha internally.
This happens even if I use an image with this RGB data:
The image is here:
And bleed output:
See these artifacts on the RGB input when it is not bled. I am not sure where they come from, but if I import an output (with no model, padding, or bleeding) into an image editor, you can see dark edges indicative of premultiplication, which is probably lowering precision. And if I remove the alpha channel in my image editor, the result looks quite premultiplied.
There are parameters that can be used to combat this; I will run some experiments.
WebGL can be used to avoid the forced premultiplication without using a PNG decoder. https://stackoverflow.com/a/60564905
I will try to implement this solution and report my results
It is working. Here is the code that uses WebGL1 (not WebGL2, which returns null in my browser) to read pixel data from an image:
```js
// `bitmap`, `width`, and `height` are assumed to come from the caller
// (e.g. an ImageBitmap created with {premultiplyAlpha: 'none'}).
const gl = new OffscreenCanvas(0, 0).getContext('webgl') // this context can be reused arbitrarily many times
gl.activeTexture(gl.TEXTURE0)
const texture = gl.createTexture()
gl.bindTexture(gl.TEXTURE_2D, texture)
// WebGL does not premultiply on upload by default
// (UNPACK_PREMULTIPLY_ALPHA_WEBGL is false).
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, bitmap)
// attach the texture to a framebuffer so its texels can be read back directly
const framebuffer = gl.createFramebuffer()
gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer)
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0)
const imageData = new ImageData(width, height)
gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, imageData.data)
gl.deleteTexture(texture)
gl.deleteFramebuffer(framebuffer)
```
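For context, a hypothetical usage of the snippet: decode the file to a non-premultiplied ImageBitmap first, so there is nothing for the read-back path to distort.

```js
// Hypothetical usage: `file` is a File/Blob selected by the user.
// createImageBitmap with premultiplyAlpha: 'none' keeps the raw RGBA bytes.
const bitmap = await createImageBitmap(file, {premultiplyAlpha: 'none'});
const width = bitmap.width, height = bitmap.height;
// ...then run the snippet above to fill `imageData` with exact RGBA bytes.
```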
If `putImageData` is the cause, then implementing `crop(x, y, width, height)` for `Float32Array` without creating a temporary canvas will work fine. However, if the pixel values are already broken at https://github.com/nagadomi/nunif/blob/d5ede7b19d57528c4e01c78ca8459a181edcd825/waifu2x/unlimited_waifu2x/public_html/script.js#L640-L651, then a PNG decoder without Canvas is needed.
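For the first option, a minimal sketch of such a crop, assuming the tensor is a planar CHW `Float32Array` (the layout is assumed for illustration; `crop` here is a hypothetical helper, not part of script.js):

```js
// Crop an (x, y, width, height) window from a planar CHW Float32Array
// without a temporary canvas. `src` has shape [channels, srcHeight, srcWidth].
function crop(src, srcWidth, srcHeight, channels, x, y, width, height) {
  const dst = new Float32Array(channels * height * width);
  for (let c = 0; c < channels; c++) {
    for (let row = 0; row < height; row++) {
      // copy one row of the window for this channel
      const srcOff = (c * srcHeight + y + row) * srcWidth + x;
      const dstOff = (c * height + row) * width;
      dst.set(src.subarray(srcOff, srcOff + width), dstOff);
    }
  }
  return dst;
}
```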
> If `putImageData` is the cause, then implementing `crop(x, y, width, height)` for `Float32Array` without creating a temporary canvas will work fine. However, if the pixel values are already broken there, then a PNG decoder without Canvas is needed.
See the reply I just made. WebGL1 works, but you need to be able to process the input data without tripping through Canvas. Currently this is only possible within my rewritten codebase that uses ImageBitmap and ImageData.
ImageBitmaps can be safely manipulated and cropped using `createImageBitmap`, as long as you specify the option `{premultiplyAlpha: 'none'}`.
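For instance, the crop overload can cut a tile without ever touching a 2D canvas (the tile coordinates and size here are arbitrary):

```js
// The sx/sy/sw/sh overload of createImageBitmap crops without premultiplying.
const tile = await createImageBitmap(sourceBitmap, tileX, tileY, 64, 64,
                                     {premultiplyAlpha: 'none'});
```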
For displaying to a canvas (e.g. previewing the input image in the source canvas), you can efficiently use the bitmaprenderer context and create a new premultiplied bitmap to display there:

```ts
inputCanvas.getContext('bitmaprenderer')!.transferFromImageBitmap(await createImageBitmap(imageBitmap, {premultiplyAlpha: 'premultiply'}))
```
By the way, some images, even with perfect pixel reading, will behave horribly with edge bleeding.
Take this image for example, which I have saved to my computer:
Edge bleeding results in:
Because, if you look at the raw RGB data in an image editor, it really does have bad pixels in the data:
Even though it is not visible whatsoever in the source:
The solution is, instead of using all pixels with alpha above 0, to define the boundary at something like 0.5.
Then the result is much nicer:
A threshold of 0.1 is perhaps too low:
A threshold of 0.25:
0.2:
I am not sure which one I like more, or whether it should be adjustable; maybe I will make it adjustable. I think 0.5 is good, but it might be too aggressive for some images.
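A minimal sketch of the thresholded boundary (hypothetical helper, with alpha stored as 0–255 and the threshold given in [0, 1]):

```js
// Only pixels with alpha >= threshold count as "inside" for edge bleeding,
// so stray low-alpha pixels with bad RGB values cannot seed the bleed.
function bleedMask(rgba, width, height, threshold = 0.5) {
  const mask = new Uint8Array(width * height);
  const cutoff = threshold * 255;
  for (let i = 0; i < width * height; i++) {
    mask[i] = rgba[i * 4 + 3] >= cutoff ? 1 : 0;
  }
  return mask;
}
```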
Thank you. I understand the cause of this problem and some solutions.
> The solution is, instead of using all pixels with alpha above 0, to define the boundary at something like 0.5.
It may depend on the implementation, but changing pixel values other than alpha = 0 will affect the visibility of the image. If it is just for reference, no problem. In a face image dataset I saw, a blur filter was applied to the padded pixels to prevent jaggedness (applying the blur filter to the entire image and then masking it with alpha == 0).
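For illustration, a rough sketch of that masked-blur idea (using a naive 3×3 box blur; whatever filter the dataset actually used is unknown):

```js
// Blur, then replace only the fully transparent pixels (alpha == 0) with
// the blurred RGB, leaving visible pixels untouched.
function blurPad(rgba, width, height) {
  const out = rgba.slice();
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      if (rgba[i * 4 + 3] !== 0) continue; // keep visible pixels as-is
      let r = 0, g = 0, b = 0, n = 0;
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx, ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const j = (ny * width + nx) * 4;
          r += rgba[j]; g += rgba[j + 1]; b += rgba[j + 2];
          n++;
        }
      }
      out[i * 4] = r / n; out[i * 4 + 1] = g / n; out[i * 4 + 2] = b / n;
    }
  }
  return out;
}
```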
> changing pixel values other than alpha = 0 will affect the visibility of the image
This is true, but so will upscaling itself.
Maybe it should be adjustable.
Sometimes the quality of the alpha channel is not good enough, so it might be useful to be able to fix it, but that is not a problem with waifu2x. What does not work is the case where the alpha channel of the entire image is something like alpha = 0.1.
> What does not work is the case where the alpha channel of the entire image is something like alpha = 0.1.
Yes, indeed. User adjustability would be needed in that case.
I found a way to avoid the premultiplication on saving too.
upscaled result:
alpha channel removed (except for transparent black):
(ignore the missing spots; I do edge bleeding on individual tiles to reduce memory pressure, and never convert the entire image to a tensor)
job summary:
```
―――― unlimited:waifu2x job completed ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
· Input: 560x560 (313600px)
· Output: 1120x1120 (1254400px)
· Model: cunet.art
· Denoise: 3
· Scale: 2
· Tile size: 64
· TTA level: 0
· Alpha: true
· Threads: 12
→ run: 8860.85ms
→ run.canvas: 0.80ms
→ run.canvas.getContext: 0.00ms
→ run.canvas.clear: 0.00ms
→ run.canvas.backdrop: 0.79ms
→ run.glReader: 13.96ms
→ run.glReader.input: 5.32ms
→ run.glReader.output: 8.59ms
→ run.tiledRender: 8845.63ms
→ run.tiledRender.collectTiles: 8835.88ms
→ run.tiledRender.collectTiles.calculateTotalPixels: 0.03ms
→ run.tiledRender.collectTiles.newScratch: 0.02ms
→ run.tiledRender.collectTiles.tile: 81 occurrences; min 75.67ms, max 331.64ms, avg 109.08ms, total 8835.56ms
→ run.tiledRender.collectTiles.tile.pick: 81 occurrences; min 0.00ms, max 0.01ms, avg 0.01ms, total 0.54ms
→ run.tiledRender.collectTiles.tile.calculateTileMetrics: 81 occurrences; min 0.00ms, max 0.07ms, avg 0.00ms, total 0.35ms
→ run.tiledRender.collectTiles.tile.reportCallbackStarted: 81 occurrences; min 0.01ms, max 0.23ms, avg 0.02ms, total 1.77ms
→ run.tiledRender.collectTiles.tile.indicator: 81 occurrences; min 0.00ms, max 0.04ms, avg 0.01ms, total 0.71ms
→ run.tiledRender.collectTiles.tile.capture: 81 occurrences; min 1.10ms, max 24.37ms, avg 3.86ms, total 312.46ms
→ run.tiledRender.collectTiles.tile.process: 81 occurrences; min 72.36ms, max 326.75ms, avg 104.85ms, total 8492.89ms
→ run.tiledRender.collectTiles.tile.process.toRgbAlpha: 81 occurrences; min 0.06ms, max 0.83ms, avg 0.14ms, total 11.48ms
→ run.tiledRender.collectTiles.tile.process.bleedEdges: 81 occurrences; min 0.04ms, max 4.04ms, avg 0.38ms, total 31.05ms
→ run.tiledRender.collectTiles.tile.process.stretchAlpha: 81 occurrences; min 0.01ms, max 0.33ms, avg 0.05ms, total 3.76ms
→ run.tiledRender.collectTiles.tile.process.pad: 81 occurrences; min 0.57ms, max 10.12ms, avg 1.48ms, total 120.03ms
→ run.tiledRender.collectTiles.tile.process.pad.rgb: 81 occurrences; min 0.32ms, max 9.82ms, avg 1.07ms, total 86.34ms
→ run.tiledRender.collectTiles.tile.process.pad.alpha: 81 occurrences; min 0.20ms, max 3.49ms, avg 0.40ms, total 32.75ms
→ run.tiledRender.collectTiles.tile.process.model: 81 occurrences; min 70.40ms, max 306.33ms, avg 102.46ms, total 8299.40ms
→ run.tiledRender.collectTiles.tile.process.model.batch: 81 occurrences; min 0.03ms, max 0.17ms, avg 0.09ms, total 7.29ms
→ run.tiledRender.collectTiles.tile.process.model.run: 81 occurrences; min 70.13ms, max 306.07ms, avg 102.21ms, total 8279.30ms
→ run.tiledRender.collectTiles.tile.process.model.unbatch: 81 occurrences; min 0.05ms, max 0.50ms, avg 0.14ms, total 11.65ms
→ run.tiledRender.collectTiles.tile.process.rgbToImageData: 81 occurrences; min 0.17ms, max 5.31ms, avg 0.32ms, total 25.86ms
→ run.tiledRender.collectTiles.tile.crop: 81 occurrences; min 0.04ms, max 0.23ms, avg 0.06ms, total 5.03ms
→ run.tiledRender.collectTiles.tile.writePixels: 81 occurrences; min 0.02ms, max 0.10ms, avg 0.07ms, total 5.29ms
→ run.tiledRender.collectTiles.tile.clearRect: 81 occurrences; min 0.00ms, max 0.06ms, avg 0.01ms, total 1.05ms
→ run.tiledRender.collectTiles.tile.drawImage: 81 occurrences; min 0.07ms, max 0.61ms, avg 0.09ms, total 7.44ms
→ run.tiledRender.collectTiles.tile.updateCounters: 81 occurrences; min 0.00ms, max 0.01ms, avg 0.00ms, total 0.10ms
→ run.tiledRender.collectTiles.tile.reportCallbackCompleted: 81 occurrences; min 0.06ms, max 0.21ms, avg 0.07ms, total 5.61ms
→ run.tiledRender.readPixels: 7.74ms
→ run.tiledRender.outputBitmap: 1.99ms
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
```
Here is the result from official unlimited:waifu2x:
This is because it is saving the premultiplied data in the output canvas instead of using a hack to smuggle non-premultiplied data into the canvas. This is actually possible using the `bitmaprenderer` context: you can upload a non-premultiplied `ImageBitmap` to the canvas, convert it to a blob, and it will be completely lossless.
I use an offscreen canvas for this, because you can't mix multiple contexts in a canvas, but also because uploading the non-premultiplied data has an undesirable visual effect. Best to use it only for downloads.
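A minimal sketch of that export path, assuming `outputBitmap` is the final non-premultiplied ImageBitmap:

```js
// Transfer the non-premultiplied bitmap into an offscreen bitmaprenderer
// canvas (this detaches outputBitmap) and encode it as PNG.
const off = new OffscreenCanvas(outputBitmap.width, outputBitmap.height);
off.getContext('bitmaprenderer').transferFromImageBitmap(outputBitmap);
const blob = await off.convertToBlob({type: 'image/png'});
// e.g. feed the blob to URL.createObjectURL for a download link
```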
I wonder how many online image processors actually do not take alpha premultiplication into account; it must be a lot. It's very difficult to avoid. I might be the only person who cares.
Fixed by https://github.com/nagadomi/nunif/commit/2bbdef93fab6ed5e42ad84c37f77e41ea4698f7c ← https://github.com/nagadomi/nunif/commit/55ce24be1408da35c9b64aa4c5278d75950a1797 ← https://github.com/nagadomi/nunif/commit/8136f645ed4ce905d26ec7662292c703183934ff Thank you for all your help.
I changed the top input image area from Canvas to HTMLImageElement, and the image is now decoded with WebGL when upscaling is executed. In tiled_render, the tile image is cropped from the tensor without using a temporary Canvas.
alpha_border_padding (related to the handling of transparent PNGs) is not fully working on unlimited:waifu2x. The ONNX model output differs between the Python and JavaScript versions, resulting in a jagged border around transparent areas.
Related: https://github.com/nagadomi/waifu2x/issues/197#issuecomment-1531036741 and several following comments.