Karonar1 / lora_viewer

MIT License

Revisit model inference #6

Closed Karonar1 closed 1 month ago

Karonar1 commented 1 month ago

Currently we distinguish 1.5 and XL LoRA UNets based on whether tensors are named down* or input*, but this is actually a characteristic of the LoRA format version, not of the base model. In particular, the check incorrectly identifies XL models trained with OneTrainer as 1.5. The two need to be distinguished by a more detailed examination of layer names and/or shapes.
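
For illustration, the name-based check described above might look roughly like the sketch below. The function name and the `lora_unet_input_blocks` / `lora_unet_down_blocks` prefixes are assumptions based on kohya-style key names, not code taken from this repository.

```python
def guess_base_by_prefix(tensor_names):
    """Naive check: infer the base model from the UNet block naming style.

    Hypothetical helper; the prefixes below are assumed, not confirmed.
    """
    for name in tensor_names:
        if name.startswith("lora_unet_input_blocks"):
            return "SDXL"
        if name.startswith("lora_unet_down_blocks"):
            return "SD 1.5"
    return "unknown"


# An XL LoRA saved with down*-style block names is reported as SD 1.5,
# because the check only looks at the naming style.
xl_keys = [
    "lora_unet_down_blocks_1_attentions_0_transformer_blocks_0_attn1_to_q.lora_down.weight",
]
print(guess_base_by_prefix(xl_keys))
```

The naming style only reflects how the file was written, so any trainer that saves an XL LoRA with down* names trips this check.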

Karonar1 commented 1 month ago

SD and SDXL UNets can be distinguished by tensor shape: SD has 320 channels in its earliest attention layers, while SDXL has 640. I see no easy way to distinguish SD 1.5 from SD 2.1 LoRAs, either for the UNet or for the text encoder. Although SD 2 was supposed to have a different encoder, the layers affected by LoRA seem to have the same shape.
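
A minimal sketch of the shape check, assuming tensor shapes are already available as a name-to-shape mapping, that self-attention query weights end in `attn1_to_q.lora_down.weight`, and that a `lora_down` weight is shaped `(rank, in_features)`. The function names and key patterns are hypothetical.

```python
def smallest_attention_width(shapes):
    """Return the smallest input width among UNet self-attention query tensors.

    `shapes` maps tensor name -> shape tuple. Returns None if nothing matches.
    """
    widths = [
        shape[1]
        for name, shape in shapes.items()
        if name.startswith("lora_unet_")
        and name.endswith("attn1_to_q.lora_down.weight")
        and len(shape) == 2
    ]
    return min(widths) if widths else None


def guess_unet_family(shapes):
    """Map the narrowest attention width to a UNet family (assumed widths)."""
    width = smallest_attention_width(shapes)
    if width == 320:
        return "SD"
    if width == 640:
        return "SDXL"
    return "unknown"
```

This relies on the LoRA actually covering the earliest attention blocks. A file that only adapts deeper blocks would show a larger minimum width and be misclassified.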

Karonar1 commented 1 month ago

Some types of adapter can be distinguished by tensor name alone, looking at the end of the name:
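
A suffix check of this kind can be sketched as follows. The suffix table is an assumption drawn from naming commonly seen in kohya-ss and LyCORIS files, and is not necessarily the set this project checks. `detect_adapter_types` is a hypothetical name.

```python
# Hypothetical suffix table (assumed conventions, not this project's list).
SUFFIX_TO_TYPE = {
    ".lora_up.weight": "LoRA",
    ".lora_down.weight": "LoRA",
    ".hada_w1_a": "LoHa",
    ".hada_w2_a": "LoHa",
    ".lokr_w1": "LoKr",
    ".lokr_w2": "LoKr",
    ".dora_scale": "DoRA",
}


def detect_adapter_types(tensor_names):
    """Return the set of adapter types whose suffixes appear in the names."""
    found = set()
    for name in tensor_names:
        for suffix, kind in SUFFIX_TO_TYPE.items():
            if name.endswith(suffix):
                found.add(kind)
    return found
```

Returning a set rather than a single label allows for files that mix adapter types, or that carry an extra tensor alongside ordinary LoRA weights.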

Karonar1 commented 1 month ago

We can now detect a number of LoRA subtypes. The distinction between SD and SDXL UNet LoRAs has been removed, because it never worked reliably and incorrect results are worse than none. The two can still be distinguished easily by the text encoder part.
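
A minimal sketch of a text-encoder-based check, assuming kohya-style key prefixes where a single encoder is stored under `lora_te_` and the two SDXL encoders under `lora_te1_` and `lora_te2_`. These prefixes and the function name are assumptions, not confirmed from this repository.

```python
def guess_family_from_text_encoder(tensor_names):
    """Guess SD vs SDXL from text encoder key prefixes (assumed naming)."""
    names = list(tensor_names)
    # "lora_te_" does not match "lora_te1_" or "lora_te2_", so the checks
    # below cannot confuse the single-encoder and dual-encoder layouts.
    if any(n.startswith(("lora_te1_", "lora_te2_")) for n in names):
        return "SDXL"
    if any(n.startswith("lora_te_") for n in names):
        return "SD"
    return "unknown"
```

A LoRA that contains only UNet tensors yields "unknown" here, which avoids reporting a guess when there is no text encoder part to inspect.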