OpenAI Image Generation Tweaks

10up / classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence.

GNU General Public License v2.0


Is your enhancement related to a problem? Please describe.

The following tweaks to our OpenAI image generation functionality would help surface the feature a bit better in the editor as well as make the user flow a bit more akin to the existing Media Library experience within the editor.

1.) Like the Media Library tab aligns the Filter media and All dates items with the Upload files horizontal border, it would be great to do the same in the Generate images tab with the Enter a prompt..., Once images..., and Enter prompt items.

2.) Change the header text of Select or Upload Media to Select, Upload, or Generate Media:

3.) Like the Select or Upload Media modal has options for Upload files, Media Library, and Generate images, let's update the core image block to have Generate image alongside the current Upload, Media Library, Insert from URL options that would automatically deep-link into the Generate images modal tab.

4.) Update the Enter prompt text field to be a larger text area so that lengthy prompt inputs can display more (most?) of the prompt before a user clicks the Generate images button.

5.) After images are generated, maintain the prompt input in the Enter prompt text area in case a user wants to generate different images by tweaking the prompt (e.g. they don't like the options generated and want more to select from). [additionally, if there's an ability in the API to create variants from a specific image result or upscale a certain image, then we should explore that via a separate feature enhancement issue]

6.) Regardless of what size image the user has set ClassifAI to generate in the plugin settings, let's only render smaller thumbnail sizes for the results so that however many are returned from OpenAI will easily display alongside each other (with their respective action buttons/links), rather than showing very large images as we do now and having to scroll significantly to view all the options.

Designs

Screenshots of existing areas for iteration are above. If any suggestions are unclear, let me know and I can hack together samples from those screenshots to visually express the updates.

Describe alternatives you've considered

n/a

Code of Conduct

[X] I agree to follow this project's Code of Conduct

1.) Like the Media Library tab aligns the Filter media and All dates items with the Upload files horizontal border, it would be great to do the same in the Generate images tab with the Enter a prompt..., Once images..., and Enter prompt items.

This was already done, but I found a bug where the CSS we rely on was only included if the IBM Watson feature was on. I've fixed that in #441.

2.) Change the header text of Select or Upload Media to Select, Upload, or Generate Media:

I agree this would be nice, but from what I can tell there's no clean way to modify this text. It appears to come from the Gutenberg MediaUpload component itself (see https://github.com/WordPress/gutenberg/blob/trunk/packages/media-utils/src/components/media-upload/index.js#L233), and I don't see any filter we can use to change it. We could target it with JS and modify it, though the downside is a flash of text as it changes from one string to the other. Maybe someone else knows whether this component can easily be modified to change the title text.
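To make the JS workaround concrete, here is a minimal sketch of what it could look like. It is not something from #441: `retitleMediaModal` and `TitleNode` are hypothetical names, and the only values taken from this thread are the two title strings. The helper only swaps the text when the node still shows the default English title, so it would silently do nothing on translated sites.

```typescript
// Hypothetical helper, not part of ClassifAI or Gutenberg.
// Minimal shape of a DOM node we need; a real HTMLElement satisfies it.
interface TitleNode {
  textContent: string | null;
}

// Both strings come from the issue text above.
const DEFAULT_TITLE = "Select or Upload Media";
const CLASSIFAI_TITLE = "Select, Upload, or Generate Media";

// Replaces the modal title only if it still shows the default text.
// Returns true when a replacement happened, false otherwise.
function retitleMediaModal(node: TitleNode | null): boolean {
  if (!node || node.textContent === null) {
    return false;
  }
  if (node.textContent.trim() !== DEFAULT_TITLE) {
    return false;
  }
  node.textContent = CLASSIFAI_TITLE;
  return true;
}
```

In the editor this would probably be called with something like `document.querySelector(".media-frame-title h1")` once the modal opens; that selector is an assumption about the media modal markup and would need checking. Calling it as early as possible should shorten the flash of the old text but may not remove it entirely.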

3.) Like the Select or Upload Media modal has options for Upload files, Media Library, and Generate images, let's update the core image block to have Generate image alongside the current Upload, Media Library, Insert from URL options that would automatically deep-link into the Generate images modal tab.

I also really like this idea; I'm just not sure how hard it will be to achieve. I'd suggest we open this as a separate issue to investigate (a perfect opportunity for someone wanting to dive more into Gutenberg).

4.) Update the Enter prompt text field to be a larger text area so that lengthy prompt inputs can display more (most?) of the prompt before a user clicks the Generate images button.

This has been changed to a textarea in #441.

5.) After images are generated, maintain the prompt input in the Enter prompt text area in case a user wants to generate different images by tweaking the prompt (e.g. they don't like the options generated and want more to select from)

This is also taken care of in #441.

[additionally, if there's an ability in the API to create variants from a specific image result or upscale a certain image, then we should explore that via a separate feature enhancement issue]

There are both an image edit API and an image variation API, and it would be awesome to figure out the best way to integrate them here. I'd suggest handling those in a separate issue, and it would be ideal to get some design/UX feedback on how best to trigger them. As for upscaling, there's no API for that: images have to be 1024x1024, 512x512, or 256x256, and that size is currently chosen in the settings. We could look at adding an inline option that lets you upscale or downscale to one of those sizes if we think that would be useful.
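As a rough illustration of the inline size option, the sketch below picks between the size from the settings and an optional inline choice, accepting only the three sizes listed above. Everything here is hypothetical (`buildImageRequest`, `isImageSize`, and the `ImageRequest` shape are names invented for this example), and it assumes the generation request carries `prompt`, `n`, and `size` fields.

```typescript
// The three sizes mentioned above; anything else is rejected.
const ALLOWED_SIZES = ["256x256", "512x512", "1024x1024"] as const;
type ImageSize = (typeof ALLOWED_SIZES)[number];

// Assumed request shape: prompt text, number of images, and size.
interface ImageRequest {
  prompt: string;
  n: number;
  size: ImageSize;
}

// Type guard: true only for one of the allowed size strings.
function isImageSize(value: string): value is ImageSize {
  return (ALLOWED_SIZES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical builder: a valid inline size overrides the settings size,
// while a missing or invalid inline size falls back to the settings size.
function buildImageRequest(
  prompt: string,
  n: number,
  settingsSize: ImageSize,
  inlineSize?: string
): ImageRequest {
  const size =
    inlineSize !== undefined && isImageSize(inlineSize)
      ? inlineSize
      : settingsSize;
  return { prompt: prompt.trim(), n, size };
}
```

For example, `buildImageRequest("a red fox", 4, "1024x1024", "256x256")` would request 256x256 images, while passing an unsupported value such as `"2048x2048"` would quietly keep the 1024x1024 setting. Whether a quiet fallback or a visible error is the better UX is a design question for the separate issue.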

6.) Regardless of what size image the user has set ClassifAI to generate in the plugin settings, let's only render smaller thumbnail sizes for the results so that however many are returned from OpenAI will easily display alongside each other (with their respective action buttons/links), rather than showing very large images as we do now and having to scroll significantly to view all the options.

This is the same as the first point here. We had styling in place for this, but the CSS wasn't loading if IBM Watson wasn't enabled. I have tweaked the styling a bit in #441 and I think it's fairly decent now. I wanted the images to be big enough that you can easily see what each one looks like, but without them taking up the entire screen. Right now it's typically 4 images in a row, though that does depend on your screen size.
