Open VL4DST3R opened 1 year ago
All images handled by Kobold Lite are downscaled to 256x256 by design. The earliest tests with larger images showed an unacceptable slowdown of the UI and unacceptable growth in file size once stories started getting longer, especially because Kobold Lite embeds all images within the story and saves automatically. The following customizations are possible, which you can try:
If you want to do additional stuff like customizing the CFG scale, changing the sampler from Euler A, or changing the number of steps, you'd have to directly modify the payload in the source code for now. No fine-grained configs are currently planned.
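For anyone wanting to try that, here is a minimal sketch of what such a payload tweak might look like. It is written in Python for illustration, not Lite's own JavaScript. The keys `steps`, `cfg_scale`, `sampler_name`, `width` and `height` are fields of the A1111 `/sdapi/v1/txt2img` request; the helper name `build_txt2img_payload` and the default values are assumptions, not what Lite actually sends.

```python
import json

# Hypothetical helper: the name and defaults are illustrative only.
def build_txt2img_payload(prompt, steps=20, cfg_scale=7.0,
                          sampler_name="Euler a", width=512, height=512):
    """Build a request body for the A1111 /sdapi/v1/txt2img endpoint."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        "steps": steps,                # number of sampling steps
        "cfg_scale": cfg_scale,        # classifier-free guidance scale
        "sampler_name": sampler_name,  # must match a sampler the server knows
        "width": width,
        "height": height,
    }

# Override the three settings mentioned above.
payload = build_txt2img_payload(
    "a castle on a hill", steps=30, cfg_scale=5.5, sampler_name="DPM++ 2M"
)
body = json.dumps(payload)  # this string would be POSTed to the server
```

The resulting JSON would be sent to something like `http://127.0.0.1:7860/sdapi/v1/txt2img` on a default local install; the sketch stops short of the network call so it can run without a server.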
Yeah, I figured the embedding of the image inside the text itself was part of the reason. The downscaling to 256 (and, I presume, the very lossy JPEG compression used) would explain the noticeable quality drop, though; I didn't realize they weren't even 512x512.
1+2. I've already made use of the prefix feature to add a few "quality" tokens, but the subject itself was usually decent; the resulting image resolution and compression quality were my main issue and the reason for making this ticket.
Real shame about no plans for a configurable sampler and such, but I understand this is ultimately a gimmick more than anything else. Unless you will revisit this whole topic at a later date, feel free to close this suggestion.
A lot of the quality from SD picture generation comes from High Res Fix, so I would love to see that as a potential option.
Yeah, especially with the new SDXL stuff, it seems to be designed to be a multi-step process to get good results with this technology. That's not to say you don't get anything decent in first pass, but it's clearly not ideal.
Should I force high res fix to be true? Are there any downsides?
Besides slower gen I don't think so. But given your reasoning with not wanting to bloat the file with large images I don't know how much can be achieved this way. Even if we get better images internally, if it still gets crunched down to 256 then you don't really get to enjoy much of it.
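As a rough sketch of what forcing High Res Fix would mean at the API level: the A1111 txt2img request accepts `enable_hr`, `hr_scale`, `hr_upscaler` and `denoising_strength`. The helper names `with_hires_fix` and `final_size` and the chosen values below are assumptions for illustration, not Lite's implementation.

```python
# Hypothetical helpers showing the hires-fix fields of an A1111 txt2img request.
def with_hires_fix(payload, hr_scale=2.0, hr_upscaler="Latent",
                   denoising_strength=0.5):
    """Return a copy of the payload with the second (upscaling) pass enabled."""
    out = dict(payload)  # leave the caller's payload untouched
    out.update({
        "enable_hr": True,
        "hr_scale": hr_scale,                      # multiplier on width/height
        "hr_upscaler": hr_upscaler,                # upscaler used between passes
        "denoising_strength": denoising_strength,  # how much the 2nd pass repaints
    })
    return out

def final_size(payload):
    """Resolution the server would return before any client-side downscale."""
    scale = payload["hr_scale"] if payload.get("enable_hr") else 1.0
    return int(payload["width"] * scale), int(payload["height"] * scale)

base = {"prompt": "a castle on a hill", "width": 512, "height": 512}
hires = with_hires_fix(base)
```

With these values a 512x512 request comes back at 1024x1024, which is exactly the detail that is lost again if the client still shrinks the result to 256x256.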
Maybe a different solution to store images would be preferred altogether?
EDIT: which actually leads me to something else I noticed: images generated via the API do not get saved on the machine generating them at all? I hoped I could at least retroactively see them at native res in the output folder within SD, but nothing gets saved when generated from the kobold UI.
Yeah, I think that's how A1111 works, but maybe there's a setting to override it.
Indeed it is. Could you maybe expose it within the UI?
I would love to be able to change the resolution. Especially now with SDXL, it would be a game changer.
I'm fairly new to LLMs and Kobold, so keep that in mind.
Is it reasonable to save generated images in a folder such as "res" (resources) next to the generated story/log, and the story or log references the image by filename? Users then have the option to transmit the story, or story + resources.
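To make the idea concrete, here is a minimal sketch of moving embedded images out into a `res` folder and referencing them by filename. The story layout used here (a dict with `text` and a list of base64 strings under `images`) and the function name `externalize_images` are purely hypothetical and do not reflect Kobold Lite's actual save format.

```python
import base64
import hashlib
import json
from pathlib import Path

# Hypothetical story layout: {"text": str, "images": [base64 strings]}.
def externalize_images(story, story_path):
    """Write embedded images to res/ next to the story and store filenames instead."""
    story_path = Path(story_path)
    res_dir = story_path.parent / "res"
    res_dir.mkdir(parents=True, exist_ok=True)

    refs = []
    for b64 in story.get("images", []):
        raw = base64.b64decode(b64)
        # A content hash as filename means identical images are stored once.
        name = hashlib.sha256(raw).hexdigest()[:16] + ".jpg"
        (res_dir / name).write_bytes(raw)
        refs.append("res/" + name)

    slim = dict(story)
    slim["images"] = refs  # relative paths instead of inline image data
    story_path.write_text(json.dumps(slim, indent=2), encoding="utf-8")
    return slim
```

The story file stays small and can be shared on its own, or together with `res` when the images matter. The trade-off is that a story is no longer a single self-contained file.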
Would you be open to pull requests to extend the customizability of the AUTOMATIC1111 API integration in koboldcpp?
Right now it's already possible to customize step count and cfg scale. In future, I might consider adding a toggle to enable higher resolution, I have thought of a good way to do this.
You can now also save images in "higher res" mode which takes up about double the space. It's not full resolution but it should provide a better compromise.
Could you add the flag to simply also save it in the local A1111 output folder? The one I linked above. This way there would be no need to fiddle with the UI or worry about lower res images.
I've added the flag to Lite as requested. Toggle it in settings.
Though I'd caution against using it on remote servers - images saved remotely this way cannot be deleted by Lite after generation - this means that whoever is running the A1111 server will have persistent access to your previously generated images. If you are using a cloud service like runpod or colab, they will have your generated images written to disk too.
Keeping the image in your own stories is safer, you can delete them anytime, and save them to your device at will.
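A small sketch of how a client could respect that caution: only request server-side saving when the A1111 server is on the local machine. `save_images` is a field of the A1111 txt2img request; whether Lite's toggle maps to exactly this field is an assumption, and the helpers `is_local_server` and `apply_save_flag` are hypothetical.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical guard: treat only loopback addresses as "local".
def is_local_server(url):
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a cloud hostname: assume remote.
        return False

def apply_save_flag(payload, server_url, save_requested, allow_remote=False):
    """Return a copy of the payload with save_images set only when it is safe."""
    out = dict(payload)
    out["save_images"] = bool(
        save_requested and (allow_remote or is_local_server(server_url))
    )
    return out
```

For example, `apply_save_flag({"prompt": "x"}, "http://127.0.0.1:7860", True)` sets `save_images` to `True`, while the same call against `https://example.com` leaves it `False` unless `allow_remote=True` is passed explicitly.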
I know, but when you're hosting both locally it seems like needless tedium. Thanks for adding it!
Expected Behavior
Is it possible to have a bit more control over the generated images via the A1111 API? At the very least to change the resolution or enable some form of upscaling as the images generated are very grainy and low resolution currently. They are good as a gimmick/a bit of flavor to sprinkle in the text, but not really worth "viewing" beyond the thumbnail.
Current Behavior
Besides prefixing the prompt taken from your chat context and changing the model used, you cannot do much else.