vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0
5.31k stars, 380 forks

[Feature]: Interest Query: Info Cards #2685

Open midcoastal opened 6 months ago

midcoastal commented 6 months ago

Feature description

I am interested in refactoring Info Cards for Networks. Namely:

This is a BUNCH of ideas, just wanted to see if any of them land before I start working on them.

midcoastal commented 6 months ago

If there is any interest in these improvements, let me know which ones.

Info Card decoupling (rendering a JavaScript template from either an API call or, at the least, a separate JavaScript data structure) is the baseline (I'm doing that regardless). For the other items, I am specifically interested to hear what people think.

vladmandic commented 6 months ago

i'm all for optimizations, but a lot of those things have been done a while back, sdnext does not pull entire info into html. and things like tags are there.

for example, this is one entire card:

<div class="card" onclick="return cardClicked(&quot; <lora:Storyboard_sketch:1.0>&quot;, false)" title="sdxl/Storyboard_sketch" data-tab="txt2img" data-page="lora" data-name="sdxl/Storyboard_sketch"
    data-filename="/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors" data-tags="storyboard sketch style|storyboard sketch of" data-mtime="1699047807.4054508" data-size="228458068"
    data-search="/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors storyboard sketch style storyboard sketch of style sketch storyboarding storyboard">
    <div class="overlay">
        <div class="tags"><span class="tag">storyboard sketch style</span><span class="tag">storyboard sketch of</span></div>
        <div class="name">Storyboard sketch</div>
    </div>
    <div class="actions">
        <span class="details" title="Get details" onclick="showCardDetails(event)">🛈</span>
        <div class="additional">
            <ul></ul>
        </div>
    </div>
    <img class="preview" src="./sd_extra_networks/thumb?filename=/mnt/d/Models/Lora/sdxl/Storyboard_sketch.jpeg&amp;mtime=1699049018.2253332" style="width: 160px; height: 160px; object-fit: cover"
        loading="lazy">
</div>

now, relating to multiple images per lora - i don't want to go into carousel style displays and all that, it becomes too much. to enable that "simple" thing, it's basically a rip-and-replace of entire current implementation.

and regarding looking up details from model's parent? again, going too far. extra networks is not replacement for entire civitai.

but anything specific you have in mind, i'm more than willing to hear out.

midcoastal commented 6 months ago

i'm all for optimizations, but a lot of those things have been done a while back, sdnext does not pull entire info into html. and things like tags are there.

I did notice this: The UI pulls the description asynchronously. That was nice to find. However, it is of minimal help:

now, relating to multiple images per lora - i don't want to go into carousel style displays and all that, it becomes too much. to enable that "simple" thing, it's basically a rip-and-replace of entire current implementation.

The problem is, not all images are full representations of the capabilities of a particular model. I already collect all images of CivitAI models, and I think my pattern would be a good fit:

I should note that I am not strictly talking about LoRA/LyCORIS, but all extras that we may/could get CivitAI data for (Checkpoints, Embeddings, even Hypernetworks and VAEs).

and regarding looking up details from model's parent? again, going too far. extra networks is not replacement for entire civitai.

See my first section for the pitfalls of not getting the "parent" model information. Suppose we focus on the 'description' alone and decide: "ok, maybe it would be beneficial to get the 'parent' model info, just to have a better description, so that the work we did to integrate the CivitAI description is not wasted effort 80% of the time." Then, since we will have the parent data, we may as well put it to use, no?

So, if we have it, I wanted to put it to use. ;-)

Additionally, one cannot guarantee the life-span of models on CivitAI, or of CivitAI itself. I have several (thousand, actually) models that are no longer on CivitAI. One could argue that "if CivitAI removed it, that's probably a hint," but that isn't always the case: there have been many instances of Authors just deleting their entire catalogs. Yes, some were for "good" reasons, but there are also very many cases where the Author was making a knee-jerk decision, sometimes even out of spite. There are also MANY cases where Authors remove one file to replace it with a new version, or for some other reason. At which point we are SOL.

This is a personal collection, and we already HAVE most of the information. I mean, let's do something with it, either that, or prove a point and refactor the CivitAI Info downloader to only save the description. But that just sounds silly, yeah?

but anything specific you have in mind, i'm more than willing to hear out.

These are the specific things. There are just several specific things.

To illustrate how my brain worked through this...

  1. I have all of my models in folders, and even if I didn't, the folders list takes up too much space, let's have a way to minimize it.
  2. If we have a way to minimize it, it will probably be an icon or select of some kind. Cool.
  3. It would be helpful to also have more images in the pop up, and we already have a list of them (and I already have them), so that should be straightforward.
  4. Since we have that list, it would be cool to have those have a pop up with generation data.
  5. Oh, also, the descriptions are all empty, that sucks, let's extend that to the root description.
  6. If we have the root data, it would be nice to use the root title on the pop up, too.
  7. OH, showing the categories, license, use, etc, would also be nice!
  8. Oh hey, we have that minimizing folder list, maybe make that selector have an option to filter by those tags, categories, nsfw, poi, even rating! It's all there...
  9. Hmm, I see that some of that is in the html for the cards... But we're adding a lot, here...
  10. We're probably going to need to have all this extra data separate...
  11. Maybe we query for the individual model data, that way, all we have to store is the Model identifier ( /mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors in your example) and then we query something like info?model=/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors (to be honest, that was my first solution to the preview images issue this August: an API like preview?model=/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors that, if not already resolved, looked for the preview image, made it, and then served it, completely skipping the step in startup).
  12. Hmmm, if we do that, search would have to be server-side, and return a list of model IDs to show.
  13. Well, that would also allow searching against literally anything, even generation parameters for images (but that may be a bit too much).
  14. Oh, you know what would ALSO be cool? Grouping by that root!
  15. Hmmm, now we are back to needing more data on the front-end, which will make it HUGE... How about this, forget making and returning all that HTML. A little piece of JS that just straight up queries the search feature, with a group option. Forget the whole Folder thing, make that a grouping option.
  16. The search endpoint will return an array, the items will be objects with a name, a few other things if needed (I'll look) and if they have a children entry, when we click them we traverse in, just like folders now, heck yeah! Quick onload times, AND feature-rich.
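Steps 12 to 16 above describe a server-side search that returns plain objects, with an optional children entry when grouping is requested. A minimal sketch of that response shape follows. It is only an illustration: the function name `search`, the `group="folder"` option, and the `id`/`name`/`tags` field names are assumptions made for this example, not anything that exists in SD.Next.

```python
from pathlib import PurePosixPath


def search(models, query="", group=None):
    """Return result items; with group="folder", items carry a `children` list.

    `models` is a list of dicts with `id` (file path), `name`, and optional
    `tags`. All names here are hypothetical, not SD.Next's actual schema.
    """
    q = query.lower()
    hits = [
        m for m in models
        if q in m["name"].lower() or q in " ".join(m.get("tags", [])).lower()
    ]
    # Only the identifier and display name go to the client; details are
    # fetched per model later, as in step 11.
    items = [{"id": m["id"], "name": m["name"]} for m in hits]
    if group != "folder":
        return items
    groups = {}
    for item in items:
        folder = PurePosixPath(item["id"]).parent.name
        groups.setdefault(folder, []).append(item)
    return [{"name": name, "children": children} for name, children in sorted(groups.items())]


models = [
    {"id": "/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors",
     "name": "Storyboard sketch", "tags": ["storyboard sketch style"]},
    {"id": "/mnt/d/Models/Lora/sd15/Ink.safetensors",
     "name": "Ink", "tags": ["ink drawing"]},
]
flat = search(models, "sketch")
grouped = search(models, group="folder")
```

In this sketch the client would treat any item with a children entry as a folder-like node and re-render the view with those children when it is clicked, which is the traversal described in step 16.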

It's a rabbit-hole, for sure. But hopefully that gives an idea of where my head is at.

I should be clear, I am not asking you to do this work. I am offering. Yeah, I could write a plugin that guts the EN Pages and does what I want, but then we are in a SUPER world of pain, duplicating calculations, diverging, stepping on toes, etc. May as well do it here, yeah?

At this time, my order of attack would be:

  1. Extend the CivitAI Info downloader to "merge" data, if info file already exists (user edits, and any additional "root" model information, will be in unique fields, preserve anything that is not explicitly fetched from CivitAI, including pre-existing related image files if they exist [not agreeing on any of this at this time, just establishing and implementing a merge-strategy to preserve data]).
  2. Add HTML sanitization to the CivitAI downloader: convert HTML to markdown for security (we then have control over what kind of HTML is inserted in DOM, I would also implement this forward mechanism as well). This may require using a different key for any sanitized data, to track if it has already been sanitized (new download) or not (user downloaded prior to implementation).
  3. Update resolution for description to look for, in order: userDescription, then description.
  4. Update description save to save into userDescription.
  5. Update description editor in UI to be a pop-up. userDescription should/would contain markdown, which is rendered by JS, and the editor will display this markdown. Saving will re-render the description in the UI.
  6. Make the basic list endpoint.
  7. Refactor EN pages such that, on first display, they query that endpoint and render the preview cards as they currently do, just using a JS template.
  8. Refactor the card pop-up:
    • On trigger, query for additional information, and if present (note that this is preparation, I will not have updated the CivitAI downloader to gather additional information, yet), display:
      • {model.name}: {name}
      • model.rating
      • model.description - (in addition to description, of which both could be editable, or just description, as that is the Model Version description, undecided)
      • model.tags
      • model.trainedWords
      • model.trainingDetails.type
      • Indicators for model.nsfw, model.poi
      • Potentially indicators for model.type, baseModel, etc.
    • Query for related images (using an agreed upon filename format), and insert a very basic carousel.
    • Add link to CivitAI ModelVersion and Author
  9. Add search functionality to list endpoint and implement in UI (of note: each network page will track the signature of the data being displayed, and changing network page will query this endpoint to update its view if it is not valid for the active signature).
  10. Add sortby functionality to list endpoint, and implement in UI (as dropdown). At this time, only implement sortby for existing sorts.
  11. Add groupby functionality to list endpoint.
    • The query results for groupby just adds children property to results.
    • If children is present, clicking on the thumbnail refreshes view with the children.
    • If the children have children, and so on, with breadcrumbs.
  12. Refactor-out the folders UX to use groupby (dropdown).
  13. Implement ability for user to rate model (additional property, userRating)
  14. Implement and add to UI sortby for:
    • rating
    • createdAt
  15. Now that existing functionality has been implemented: Extend CivitAI Info downloader to additionally fetch "root" Model Info and track:
    • model.allowNoCredit
    • model.allowCommercialUse
    • model.tags.*
    • model.rating
    • model.creator.*
    • model.description
    • model.title
  16. Implement groupby for other helpful properties (conditionally included only if present in any CivitAI data), such as:
    • model.tags
    • model.nsfw
    • model.poi
    • use/license
    • baseModel
    • trainingDetails.type
    • rating (rounded whole numbers)
  17. Add filter functionality (NULL, True/False, dropdown) to list endpoint for:
    • baseModel
    • baseModelType
    • model.trainingDetails.type
    • model.nsfw
    • model.poi
    • model.type
  18. Implement filters in UI.
  19. Implement system settings for options:
    • On checkpoint-change, auto-update baseModelType field to only display compatible extra networks
    • Default filters, specifically nsfw (this setting also filters images, if the information is tracked).
  20. Add generation parameters pop-up to related images (if known).
  21. Optional Server-side improvements:
    • If files.*.type = 'VAE' exists, find and use this if VAE option is set to Automatic. Optionally (via System Setting) download (at time of Model Info, or at Model Load, via setting)
    • If files.*.type = 'Config' exists, optionally download this (at same time as Model Info, or [via system setting] on Model Load)

Yes, tracking the "root" Model Info will duplicate a lot of data on the disk, but it's bytes, and not really that big of a concern. Storing it alongside the Model Version Info is best-case (it follows the Model Version, and is transparent, no DB).
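Steps 1, 3, and 4 of the list above (merge on re-download, and resolve userDescription before description) can be sketched as follows. This is an illustration under assumptions: the helper names `merge_info` and `resolve_description` are made up for the example, and the info file is treated as a flat dict, which may not match how SD.Next actually stores it.

```python
def merge_info(existing, fetched, user_keys=("userDescription", "userRating")):
    """Merge freshly fetched CivitAI data into an existing info dict.

    Anything not present in `fetched` is preserved, and user-owned keys are
    never overwritten even if the fetched data happens to contain them.
    """
    merged = dict(existing)
    merged.update(fetched)  # fetched fields replace their older copies
    for key in user_keys:
        if key in existing:
            merged[key] = existing[key]  # user edits always win
    return merged


def resolve_description(info):
    """Return the first non-empty description: user's own, then CivitAI's."""
    for key in ("userDescription", "description"):
        value = info.get(key)
        if value:
            return value
    return ""


existing = {"description": "old text", "userDescription": "my notes", "userRating": 4}
fetched = {"description": "new text", "baseModel": "SDXL 1.0"}
merged = merge_info(existing, fetched)
```

Because user edits live in their own keys, a refresh can update the CivitAI fields freely without touching them.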

midcoastal commented 6 months ago

I mean, that's a pretty comprehensive and thought-out road-map if I've ever seen one... Even in order of complexity/impact... ver-nice...

vladmandic commented 6 months ago

this may be too detailed to respond to each item separately. lets bring it back a bit:

midcoastal commented 6 months ago

Yeah, it seems you got lost in there a bit. I have a meeting this morning, but will come back to this in a bit.

midcoastal commented 6 months ago

this may be too detailed to respond to each item separately. lets bring it back a bit:

Sure. I gathered that I was getting long-winded, and my short solution was going to be: "How about this. This is a notice that I want to do some work on EN UI. How about I submit PRs, and I can argue the merit of them each, and go from there. Maybe you'll understand my goals better that way."

But it is also probably a good idea to cover some bases first, as pleading the case for each PR could be time-consuming (see my other two outstanding PRs). And I know neither of us has time for that.

  • there is no HTML fetch from civitai or HTML parsing or anything like that.

Uh... It actually does. The "description" field, that is then saved and used for display in the WebUI, may contain (and often does) HTML. You may not have noticed this because it seems the front-end strips all HTML when rendering the text into the textbox. This is "safe," but see below.

Idea of fetching HTML and then sanitizing it to MD? Where does this come from?

Well, it would be swell if (so as to prevent accidental editing as I have done, and also make things look nicer) we rendered the description(s) with "allowed" HTML formatting.

Taking a step back: it would be nice if the description weren't editable by default, but were inserted into the page as static content and made editable on demand. If it were inserted that way, it would have to be sanitized to prevent injection.

Taking a step forward: it would be nice to render it with "allowed" styling, because styling is accepted at its inception. This means that Authors are able to use styling to structure the description, and as such, if that structure is removed, it may make the description unreadable (I have many that look like nonsense because lists and whatnot are stripped).

Yes, this is a "nice to have." Skip-able, or at least postpone-able.

And it's a no.

Ok, fine. Things don't all hang on it, so I'm fine with that. That being said, since the system currently doesn't sanitize HTML in any manner, this leads to an issue where I have a metric TON of Unicode character codes that are transmitted on page load and super-bloat my page size. FWIW, converting the incoming HTML to markdown is pretty straightforward with markdownify.
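To make the HTML-to-markdown idea concrete without adding a dependency, here is a rough standard-library stand-in for what markdownify would do. It is deliberately not markdownify: it is a small allow-list converter built on `html.parser`, and the class name, function name, and choice of supported tags are all assumptions for this sketch.

```python
from html.parser import HTMLParser


class DescriptionToMarkdown(HTMLParser):
    """Convert a small allow-list of tags to Markdown and drop the rest."""

    INLINE = {"b": "**", "strong": "**", "i": "*", "em": "*"}
    SKIPPED = ("script", "style")  # content of these is discarded entirely

    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; or &#33;
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "li":
            self.parts.append("\n- ")  # ordered lists are simplified to bullets
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag in ("br", "ul", "ol"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_markdown(html):
    parser = DescriptionToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


sample = ("<p>Use <strong>storyboard</strong> style&#33;</p>"
          "<ul><li>one</li><li>two</li></ul><script>alert(1)</script>")
markdown = html_to_markdown(sample)
```

Note that decoded text can itself contain angle brackets, so whatever renders the resulting markdown in the browser would still need to escape or disallow raw HTML.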

Considering we currently don't display the descriptions as static anyway, it's moot. Perhaps at some point I will make a PR that does both (display as static with trigger-able edit, and markdown), and we can discuss the merits for it independently.

  • Anything that is in that JSON can be used to display info somewhere else in a nicer way as well, but there is and will not be additional fetches from civitai. So look into what IS available as we can talk about search/filtering/tagging/whatever.

Sure, let's go with that. I will use what is available and go from there.

To clarify, when you say there "will not be" additional fetches to CivitAI, you are referring to a user performing an "update fetch," right? As in, I already have the data, but I want to fetch it again. Referring to the "merging" action I had mentioned. Correct?

Regardless: check for available data, and only use the detected available data. Got it.

  • multiple images per model would require display control that can nicely handle multiple images. gr.Image cannot. and i will not write (nor accept a pr) that does a full rewrite of extra networks interface just to accommodate that. and anything short of the full rewrite would just not achieve what you'd want.

That is not necessarily true... There are JS events all over the place. Accomplishing what I am considering would be hooking into a JS event, performing a query, and replacing the DOM element. I would not have to (nor would I want to) touch any Gradio elements.

The same can't entirely be said for the other end-goals mentioned. Ultimately, yes, there would be changes to the Networks UI, and in the end, it would look pretty different.

That said, I should reiterate that the "list" I made above would be incremental changes over disparate PRs. So you would not be staring down a monolithic PR with thousands of line changes. You would be looking at several PRs with incremental, manageable, documented, and explained changes.

  • imo, group by would very fast balloon to massive feature that would be very rarely used. nice? sure. if it can be done in absolutely minimal way using existing information (as stated above).

As mentioned above, as outlined, I would increment on this:

Really, I would be sated with the first two. After which, the framework is there. If someone gets creative, they may want to add something down the line. At which point they can propose it, or do it themselves. Frankly, just getting the frameworks in there is enough, as it could be an extension/plugin after that point.

  • i'd be ok with minor PRs such as option to hide folder view.

Hiding the folder view is the least of the issues...

As I consider this more and more, can I offer a bit of a middle-ground?

May I submit some PRs initially to:

A non-trivial PR I would additionally like to submit is, yes, some changes to the Networks page(s) to address (in this order, via separate PRs):

Paying very close attention so as not to change pre-existing functionality/behavior.

Also, I will be sure to try to keep my PRs as uncomplicated as possible for the case.

For now, just forget the huge list I made earlier, that's a bridge to cross another time. What do you think about the above?

vladmandic commented 6 months ago

re: HTML: correct, there is HTML embedded in json description field. it could be translated to MD, but it would come at significant cost - both translating and even more rendering. but overall, i'm not against it

re: additional fetches: i meant additional. of course user can do a refresh. i meant things like lookup info for parent model and things like that.

re: carousel images: yes, it's possible to use js to change img element of the gr.Image component - that is what live preview does - but it's anything but nice and clean. also, really not sure i want to go in that direction - is it really on sdnext to display all that? i get your point about model may go offline on civitai, etc. - but it's not on sdnext to solve that. sdnext extra networks should be quick & easy interface, not a full-blown civitai replacement.

re: group by: one thing i'm worried about is that most of such info on civitai is just bad - authors do not fill it correctly. so having group-by may just backfire and show how bad it is. but i'm not against it.

May I submit some PRs initially to:

Add the ability to create arbitrary subscribe-able events in back-end as well as front-end.

example?

Add several new subscribe-able events to back-end and front-end.

example?

A non-trivial PR I would additionally like to submit is, yes, some changes to the Networks page(s) to address (in this order, via separate PRs):

Initial List population

what do you mean?

Search

most likely that is a yes.

Folders

most likely that is a yes.

midcoastal commented 6 months ago

WRT "Arbitrary Events", I haven't considered options yet, so giving an example would be difficult. I have experience in about a dozen different design philosophies for event handlers, and I would have to sit and look over library options, or consider how to implement something bare-bones that is lightweight and simple for the project.

I am familiar with the way that the event system is currently designed, and want to impress two things:

Anything I do I would be sure to:

I understand that a big concern with anything like this is the fact that you don't know how long I will be around or helping to look at things, and in the end, you need to look out for yourself WRT the fact that maintenance may/will fall on your shoulders. I get that. I will do better at keeping that in consideration.

WRT "additional fetches" that's fine, sure. My other questions, and new considerations on the matter have addressed that idea. That is fine.

WRT Images: ignore this as well.

WRT GroupBy: Ignore this, too. lol

For the most part, you can forget about and ignore most of anything else I have said. I understand your position and concerns.

WRT initial list population: I want to move this out of the core HTML, and have it strictly as an API. UI loads, and does not fetch the EN Pages data until the page is loaded. In many respects this is no different than the existing option to not generate the pages until requested, only different in the fact that it does generate the information, it just doesn't populate the page with it, and does it automatically instead of manually.

Search: currently this is user-side, I was going to look into making it server-side. But you know what? Forget that, as well.
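As a rough picture of what "strictly as an API" could mean, the data attributes of the example card earlier in the thread can be expressed as one JSON record per card, which the browser then turns into HTML. The helper names `card_record` and `list_page`, and the idea of passing a `root` folder to derive the card name, are assumptions for this sketch, not existing SD.Next functions.

```python
import json
from pathlib import PurePosixPath


def card_record(page, filename, root, tags, mtime, size):
    """Build the data behind one card, mirroring the card's data-* attributes."""
    # Card name is the path relative to the models root, without extension.
    name = str(PurePosixPath(filename).relative_to(root).with_suffix(""))
    return {"page": page, "name": name, "filename": filename,
            "tags": tags, "mtime": mtime, "size": size}


def list_page(page, records):
    """Serialize the records of one extra-networks page as a JSON array."""
    return json.dumps([r for r in records if r["page"] == page])


record = card_record(
    "lora",
    "/mnt/d/Models/Lora/sdxl/Storyboard_sketch.safetensors",
    "/mnt/d/Models/Lora",
    ["storyboard sketch style", "storyboard sketch of"],
    1699047807.4054508,
    228458068,
)
payload = list_page("lora", [record])
```

The server still computes everything it does today; only the rendering step moves to a client-side template.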

In fact, forget everything, all I want is events. Let me get some events in there to hook in to, just a few, and you won't have to deal with a single other feature.
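For readers wondering what "arbitrary subscribe-able events" might look like at its most bare-bones, here is one possible shape. No design was chosen in this thread, so this is purely illustrative: the `EventBus` class and the event name `networks.refreshed` are invented for the example.

```python
from collections import defaultdict


class EventBus:
    """Tiny publish/subscribe registry (hypothetical, not SD.Next's API)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event, handler):
        """Register a handler for an event; return a function that removes it."""
        self._handlers[event].append(handler)

        def unsubscribe():
            if handler in self._handlers[event]:
                self._handlers[event].remove(handler)

        return unsubscribe

    def emit(self, event, **payload):
        """Call every handler subscribed to the event and return their results."""
        # Iterate over a copy so handlers may unsubscribe while being called.
        return [handler(**payload) for handler in list(self._handlers.get(event, ()))]


bus = EventBus()
seen = []
unsubscribe = bus.subscribe("networks.refreshed", lambda page: seen.append(page))
bus.emit("networks.refreshed", page="lora")
unsubscribe()
bus.emit("networks.refreshed", page="model")  # no handlers left, nothing recorded
```

An extension would only need the subscribe call; the core would only need to emit at a few well-chosen points.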

vladmandic commented 6 months ago

regarding ui population, i'd be ok with moving html generate code from server-side to browser. and events, totally fine - i was just curious if you had something specific in mind.

what's left is hide folders - you can create pr for that, no issue.

regarding metadata that does exist and is passed to client, note that there is some sanitization to remove huge objects, since i've seen metadata that easily ranges in megabytes.

            if 'modelVersions' in fullinfo: # sanitize massive objects
                fullinfo['modelVersions'] = []

this can probably be improved a lot since it basically strips everything.
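One way the sanitization could be made less aggressive is to keep a few small fields from each model version rather than emptying the list. The sketch below assumes the info is a dict shaped like CivitAI's JSON; the function name, the set of keys kept, and the version cap are illustrative choices, not the project's code.

```python
# Small per-version fields worth keeping (an assumed selection).
KEEP_VERSION_KEYS = ("id", "name", "baseModel", "trainedWords")


def sanitize_info(fullinfo, max_versions=5):
    """Return a copy of fullinfo with each model version trimmed to small fields."""
    info = dict(fullinfo)  # shallow copy so the caller's dict is not modified
    versions = info.get("modelVersions")
    if isinstance(versions, list):
        info["modelVersions"] = [
            {key: version[key] for key in KEEP_VERSION_KEYS if key in version}
            for version in versions[:max_versions]
            if isinstance(version, dict)
        ]
    return info


fullinfo = {
    "name": "Storyboard sketch",
    "modelVersions": [{
        "id": 1, "name": "v1.0", "baseModel": "SDXL 1.0",
        "trainedWords": ["storyboard sketch of"],
        "images": [{"meta": "x" * 100000}],  # the kind of bulk that bloats the payload
        "files": [{"sizeKB": 223103}],
    }],
}
clean = sanitize_info(fullinfo)
```

This keeps the payload small while still giving the client version names and trigger words to work with.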

midcoastal commented 6 months ago

I wasn't saying there was "no" sanitization. And TBH it is moot anyway. If I get events in there, I can implement the rest via a Plugin, and then none of it is on your shoulders. Easy peasy.

vladmandic commented 6 months ago

for sure. i was just pointing out that using existing endpoints will return overly-sanitized data and if you want, you can modify that to have better sanitization, but still do some.