cooljeanius / ANO_art

Artwork for the Wesnoth campaign "A New Order"; includes material not included in final add-on
https://forums.wesnoth.org/viewtopic.php?f=23&t=36074

Use art in this repo as training data for locally-controlled AI-generated art #3

Open cooljeanius opened 1 year ago

cooljeanius commented 1 year ago

Finding good artists who can match the style of existing art is tough. However, AI has been getting better at doing art lately. For mainline Wesnoth and IftU/AtS, @kabachuha has been training Stable Diffusion to create new artwork that's similar to existing material; maybe he could help me do something similar here?

kabachuha commented 1 year ago

Sure, thanks for tagging! I'll definitely add this to the dataset.

cooljeanius commented 1 year ago

> Sure, thanks for tagging! I'll definitely add this to the dataset.

OK cool, let me know if anything interesting comes of it!

kabachuha commented 1 year ago

I'll replicate my message from Discord here too.

I've been quite busy with porting an animations script, so I haven't utilized the dataset yet, but meanwhile you can streamline the process by splitting it into the following file structure:
dataset -- object -- <object1_name> -- 1.png, 2.png, etc
        -- object -- <object2_name> -- 1.png, 2.png, etc
        -- style --- <artist1_name> -- 1.png, 2.png, etc
...

I'll also write some upscaling instructions for pics smaller than 512*512:

if it's a pixelart sprite, use image -> size -> 512*512 -> interpolation = None

if it's a portrait, change only the canvas size to 512*512 and then move the image to the border so it won't seem cropped

if it's large-scale art, crop a 512*512 fragment. If it's too large, downscale with interpolation = cubic

All these modes are available in GIMP.

This repo is an absolute gold mine for wesnoth art and it'll be great to use it!
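
For anyone who would rather script the three GIMP steps above than do them by hand, here is a rough sketch using Pillow. It is not the workflow used in the thread. The function names, the white padding colour, the choice of the bottom-left border for portraits, the centered crop, and the `max_side` threshold for "too large" are all assumptions made for illustration.

```python
from PIL import Image

TARGET = 512  # side length mentioned in the thread


def upscale_sprite(img, size=TARGET):
    """Pixel-art sprite: scale to size x size with nearest-neighbour
    resampling (GIMP's interpolation = None)."""
    return img.resize((size, size), resample=Image.NEAREST)


def pad_portrait(img, size=TARGET, background=(255, 255, 255)):
    """Portrait smaller than size x size: enlarge only the canvas and keep
    the image unscaled, pushed against the bottom-left border (assumed)."""
    rgba = img.convert("RGBA")
    canvas = Image.new("RGB", (size, size), background)
    # The alpha band of `rgba` is used as the paste mask.
    canvas.paste(rgba, (0, size - rgba.height), mask=rgba)
    return canvas


def crop_large_art(img, size=TARGET, max_side=1024):
    """Large art: downscale with bicubic resampling if the longest side
    exceeds max_side (hypothetical threshold), then crop a centered
    size x size fragment. Assumes both sides are at least `size`."""
    scale = min(1.0, max(max_side / max(img.size), size / min(img.size)))
    if scale < 1.0:
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, resample=Image.BICUBIC)
    left = (img.width - size) // 2
    top = (img.height - size) // 2
    return img.crop((left, top, left + size, top + size))
```

Each function takes an already opened `PIL.Image.Image` and returns a new 512*512 image, so the results can be saved wherever the dataset folders live.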

cooljeanius commented 1 year ago

so, just to clarify, by "file structure", do you mean reorganize the directory layout so the files are in order like that? Or did you mean to create a text file where I write all that up?

cooljeanius commented 1 year ago

What file formats does SD support, btw? Does it only work with the final .pngs/.jpgs, or is there a way to get it to use the .xcf projects (for both input/output) so that it gives me editable layers?

kabachuha commented 1 year ago

Since the init images are loaded with PIL, it supports a variety of formats, including .png and .jpg, but not .xcf (that's GIMP's project format rather than an ordinary image, though it's quite plausible that there are external scripts that export pictures from those files): https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html
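
A quick way to check which files in a folder Pillow is likely to open is to compare their extensions against the ones Pillow has registered. This is only a sketch: the helper name `pillow_knows_extension` is made up here, and the check looks at the file extension only, not at the file contents.

```python
from pathlib import Path

from PIL import Image


def pillow_knows_extension(path):
    """Return True if the file's extension is registered with Pillow.

    Extension-based only; it does not try to open the file.
    """
    ext = Path(path).suffix.lower()
    return ext in Image.registered_extensions()


if __name__ == "__main__":
    for name in ("portrait.png", "portrait.jpg", "portrait.xcf"):
        print(name, pillow_knows_extension(name))
```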

Also, note that Stable Diffusion works only in RGB, for simplicity and easier debugging, i.e. no alpha channel and no transparency. After your images are produced, you will have to remove their background either manually or with another NN like https://huggingface.co/spaces/ECCV2022/dis-background-removal or https://huggingface.co/spaces/skytnt/anime-remove-background. There are also webui scripts that automate similar processing, like depth2img.
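
Since the model only sees RGB, transparent source images (sprites and portraits usually have an alpha channel) probably need to be flattened onto a solid background before training. The thread does not say how this was done. The sketch below is one way to do it with Pillow, and both the function name `flatten_to_rgb` and the white default background are assumptions.

```python
from PIL import Image


def flatten_to_rgb(img, background=(255, 255, 255)):
    """Composite an image with transparency onto a solid colour and
    return a plain RGB image (no alpha channel)."""
    rgba = img.convert("RGBA")
    canvas = Image.new("RGBA", rgba.size, background + (255,))
    return Image.alpha_composite(canvas, rgba).convert("RGB")
```

Picking a background colour that does not occur in the art itself may make the later background-removal step easier, but that is a guess rather than something tested here.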

kabachuha commented 1 year ago

Again, updating from Discord. The file structure for finetuning is now easier: you just need to put all characters and styles in their respective folders and make a JSON file in the following format:

[
    {
        "instance_prompt":      "photo of zwx dog",
        "class_prompt":         "photo of a dog",
        "instance_data_dir":    "../../../data/alvan",
        "class_data_dir":       "../../../data/dog"
    }
]

You don't have to create class dirs, they're made and filled with class images on their own when the script is launched!


An example of how it might look for this repo:

[
    {
        "instance_prompt":      "character portrait of gawen",
        "class_prompt":         "character portrait of a fantasy warrior",
        "instance_data_dir":    "../../../data/ANO_art/iborra/gawen",
        "class_data_dir":       "../../../data/class_gen/iborra/gawen"
    },
    {
        "instance_prompt":      "character portrait of karen",
        "class_prompt":         "character portrait of a fantasy warrior woman",
        "instance_data_dir":    "../../../data/ANO_art/iborra/karen",
        "class_data_dir":       "../../../data/class_gen/iborra/karen"
    },
    {
        "instance_prompt":      "pixelart sprite in the style of sylar",
        "class_prompt":         "pixelart sprite",
        "instance_data_dir":    "../../../data/ANO_art/sylar",
        "class_data_dir":       "../../../data/class_gen/sylar"
    }
]
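
Writing these entries by hand gets tedious as more characters are added, so here is a hedged sketch that builds the same list from a short table of prompts and subfolders. The helper name `build_concepts`, the `ENTRIES` table, and the idea of generating the file at all are illustrative assumptions. Only the four JSON keys and the path layout come from the example above.

```python
import json

# (instance_prompt, class_prompt, subfolder under ANO_art), taken from the example above
ENTRIES = [
    ("character portrait of gawen",
     "character portrait of a fantasy warrior", "iborra/gawen"),
    ("character portrait of karen",
     "character portrait of a fantasy warrior woman", "iborra/karen"),
    ("pixelart sprite in the style of sylar",
     "pixelart sprite", "sylar"),
]


def build_concepts(entries, data_root="../../../data"):
    """Return a list of dicts in the format shown in the thread."""
    concepts = []
    for instance_prompt, class_prompt, subdir in entries:
        concepts.append({
            "instance_prompt": instance_prompt,
            "class_prompt": class_prompt,
            "instance_data_dir": f"{data_root}/ANO_art/{subdir}",
            "class_data_dir": f"{data_root}/class_gen/{subdir}",
        })
    return concepts


if __name__ == "__main__":
    # Print the JSON; redirect it to whatever file name the training script expects.
    print(json.dumps(build_concepts(ENTRIES), indent=4))
```

The `class_gen` directories do not need to exist beforehand, since they are created and filled when the training script is launched.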

cooljeanius commented 1 year ago

Thanks, I'll see if I can get around to it

kabachuha commented 1 year ago

Still, I'd advocate for some more work before using this repo as is. As always with Wesnoth, it has a sprite/art imbalance, and if you feed the network too many similar images you may get overfitting, i.e. the network will memorize the samples.

Look, for example, at the updated Elynia dataset. It has only three images now: 1 portrait and 2 pixel-art sprites. As you saw on Discord, it gave pretty good results, so try sticking to this proportion (it may require moving images around, though).

https://drive.google.com/drive/folders/1-sa5eQ9ZgoW0hGu-jYhzIJs0qPpzeFLg?usp=share_link

Don't forget that pixel art is best upscaled with no (nearest neighbour) interpolation, and portraits are best left as is (resizing the canvas to 512*512 without resizing the images). This way the quality and similarity loss should be the lowest.

cooljeanius commented 1 year ago

when writing the json file, does pluralization matter in the prompts? How about case-sensitivity?

kabachuha commented 1 year ago

The prompts are all lowercased before they reach the text encoder model (as it originally came from an image captioning model, not a text generating one), so case doesn't matter. Pluralization does matter, though, so 'bee' differs from 'bees'.

cooljeanius commented 1 year ago

btw, as for stuff to use this as training data for, I'd point to the ART_TODO.txt file in the main ANO repo for the list of portraits that still need to be done: https://github.com/cooljeanius/A_New_Order/blob/master/ART_TODO.txt

cooljeanius commented 8 months ago

(Updated title to clarify that other entities scraping this repo for their own purposes doesn't count; the idea here is to be GPL-compatible, which means caring about licenses and copyleft and ownership and sourcing)

cooljeanius commented 8 months ago

The directory for putting them in is now checked in to the repository as of 281aa6c.