dribnet / pixray

neural image generation

Passing in init_image to pixeldrawer.py yields different color_vars values #37

Open seang opened 2 years ago

seang commented 2 years ago

Hi! We have a use case where we run some iterations, export the image, and then reload the image at a later time.

What we're finding so far is that the color_vars value in the pixeldrawer class is not the same after reloading, which prevents our second run from converging in the expected amount of time.

We ran this block first:

python pixray.py \
  --drawer=pixel \
  --prompt="castle on a mountain #pixelart" \
  --iterations 10 \
  --smoothness_type log \
  --smoothness_weight 2.0 \
  --output myfile.png \
  --saturation 0.8 \
  --quality better \
  --scale 2.5 \
  -nps  1 2 3 \
  --output "pixel-drawer-debug.png" 

This produced pixel-drawer-debug.png as expected, and the pixeldrawer instance had the following color_vars[:10] value:

[tensor([0.9249, 0.7148, 0.2696, 1.0000], requires_grad=True),
 tensor([0.7613, 0.6343, 0.3088, 1.0000], requires_grad=True),
 tensor([0.2164, 0.4593, 0.6111, 1.0000], requires_grad=True),
 tensor([0.5734, 0.3440, 0.7936, 1.0000], requires_grad=True),
 tensor([0.5893, 0.4971, 0.7856, 1.0000], requires_grad=True),
 tensor([0.6815, 0.6320, 0.7613, 1.0000], requires_grad=True),
 tensor([0.8446, 0.4792, 0.7452, 1.0000], requires_grad=True),
 tensor([0.2816, 0.7093, 0.5287, 1.0000], requires_grad=True),
 tensor([0.0463, 0.6751, 0.2235, 1.0000], requires_grad=True),
 tensor([0.2542, 0.8179, 0.2391, 1.0000], requires_grad=True)]

We then hosted the output image on a site and passed its URL as the init_image with this command:

python pixray.py \
  --drawer=pixel \
  --prompt="castle on a mountain #pixelart" \
  --iterations 10 \
  --smoothness_type log \
  --smoothness_weight 2.0 \
  --output myfile.png \
  --saturation 0.8 \
  --quality better \
  --scale 2.5 \
  -nps  1 2 3 \
  --output "pixel-drawer-debug.png" \
  --init_image "https://samplesite.com/pixel-drawer-debug-10.png"

This resulted in the pixeldrawer instance starting up with the following color_vars[:10] value:

[tensor([0.8071, 0.6282, 0.3669, 1.0000], requires_grad=True),
 tensor([0.6882, 0.5743, 0.4316, 1.0000], requires_grad=True),
 tensor([0.2797, 0.4941, 0.6301, 1.0000], requires_grad=True),
 tensor([0.5674, 0.4051, 0.7355, 1.0000], requires_grad=True),
 tensor([0.5311, 0.5272, 0.7583, 1.0000], requires_grad=True),
 tensor([0.6078, 0.6020, 0.7250, 1.0000], requires_grad=True),
 tensor([0.7248, 0.4282, 0.7404, 1.0000], requires_grad=True),
 tensor([0.2821, 0.6142, 0.5637, 1.0000], requires_grad=True),
 tensor([0.1105, 0.6468, 0.2429, 1.0000], requires_grad=True),
 tensor([0.2507, 0.8100, 0.2363, 1.0000], requires_grad=True)]

(These are the values after init_from_tensor is called.)

We're not really sure why this happens, but any ideas would be appreciated.
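
To put a number on the gap between the two printouts above, here is a minimal sketch that compares them channel by channel. It uses plain Python lists holding the ten reported RGBA values rather than the actual tensors, and the helper name `channel_diffs` is made up for this illustration.

```python
# First ten color_vars entries from the original run (RGBA), copied from the report.
first_run = [
    [0.9249, 0.7148, 0.2696, 1.0000], [0.7613, 0.6343, 0.3088, 1.0000],
    [0.2164, 0.4593, 0.6111, 1.0000], [0.5734, 0.3440, 0.7936, 1.0000],
    [0.5893, 0.4971, 0.7856, 1.0000], [0.6815, 0.6320, 0.7613, 1.0000],
    [0.8446, 0.4792, 0.7452, 1.0000], [0.2816, 0.7093, 0.5287, 1.0000],
    [0.0463, 0.6751, 0.2235, 1.0000], [0.2542, 0.8179, 0.2391, 1.0000],
]

# The same ten entries after reloading the exported image via --init_image.
reloaded = [
    [0.8071, 0.6282, 0.3669, 1.0000], [0.6882, 0.5743, 0.4316, 1.0000],
    [0.2797, 0.4941, 0.6301, 1.0000], [0.5674, 0.4051, 0.7355, 1.0000],
    [0.5311, 0.5272, 0.7583, 1.0000], [0.6078, 0.6020, 0.7250, 1.0000],
    [0.7248, 0.4282, 0.7404, 1.0000], [0.2821, 0.6142, 0.5637, 1.0000],
    [0.1105, 0.6468, 0.2429, 1.0000], [0.2507, 0.8100, 0.2363, 1.0000],
]


def channel_diffs(a, b):
    """Absolute per-channel differences between two equally shaped RGBA lists."""
    return [[abs(x - y) for x, y in zip(pa, pb)] for pa, pb in zip(a, b)]


diffs = channel_diffs(first_run, reloaded)

# Look at the RGB channels only; alpha is reported as 1.0 in both runs.
rgb_diffs = [d for row in diffs for d in row[:3]]
max_diff = max(rgb_diffs)
mean_diff = sum(rgb_diffs) / len(rgb_diffs)
alpha_same = all(row[3] == 0.0 for row in diffs)

print(f"max RGB difference:  {max_diff:.4f}")
print(f"mean RGB difference: {mean_diff:.4f}")
print(f"alpha identical:     {alpha_same}")
```

With real tensors the same comparison could be done by stacking each list and subtracting, but the list version keeps the check free of dependencies.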

dribnet commented 2 years ago

Thanks for this detailed report. By default in pixray the init_image is mixed with noise.

python pixray.py \
  --iterations 0 \
  --init_image inputs/test_pattern_wide2.jpg \
  --output outputs/debug/test_pattern_wide2_A.jpg

[Image: test_pattern_wide2_A]

To prevent this, you also need to set init_image_alpha explicitly to either 255 (force opaque) or 0 (leave alpha untouched, which is useful if the init image actually has an alpha channel).

python pixray.py \
  --iterations 0 \
  --init_image inputs/test_pattern_wide2.jpg \
  --init_image_alpha 255 \
  --output outputs/debug/test_pattern_wide2_B.png

[Image: test_pattern_wide2_B]
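
As a rough illustration of why an alpha of 255 should preserve the init image while a lower alpha lets noise through, here is a sketch of ordinary "over" compositing of a pixel onto random noise. This is an assumption about the general technique, not pixray's actual code: the function `blend_with_noise` and the intermediate alpha of 200 are hypothetical, and pixray's special handling of 0 (leave alpha untouched) is not modelled.

```python
import random


def blend_with_noise(pixel, alpha, rng):
    """Composite an RGB pixel (floats in [0, 1]) over uniform random noise.

    alpha is an 8-bit opacity (0-255). This is generic alpha compositing,
    used here as an assumed stand-in for how an init image might be mixed
    with noise; it is not taken from the pixray source.
    """
    a = alpha / 255.0
    return [a * c + (1.0 - a) * rng.random() for c in pixel]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# First RGB value from the original run in the report.
pixel = [0.9249, 0.7148, 0.2696]

opaque = blend_with_noise(pixel, 255, rng)   # fully opaque: noise has zero weight
partial = blend_with_noise(pixel, 200, rng)  # hypothetical partial opacity

print("original:", pixel)
print("alpha 255:", opaque)
print("alpha 200:", [round(v, 4) for v in partial])
```

Under this model, a fully opaque init image comes back unchanged, while any lower alpha shifts every channel by a random amount.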

Can you try adding --init_image_alpha 255 and report whether the result is closer to the expected value? Some discrepancy may well remain, since the init_image resizing and the pixeldrawer sampling of the image across the square might not align perfectly, but if so this can be investigated.