AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

[Feature Request]: Support Zero Terminal SNR betas for full dynamic range on any model trained with ZTSNR #13052

Closed: Goldenkoron closed this issue 10 months ago

Goldenkoron commented 1 year ago

Is there an existing issue for this?

What would your feature do?

Hi, I discovered a few days ago that any model trained with zero terminal SNR (including epsilon-prediction models!) can produce the full dynamic range in its generations. I am attaching some image examples of what this allows:


See more images

![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/3c140c15-7eb2-427c-a844-48c5f9213e9e)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/cb1b5491-7a69-4884-a2a5-5db3ff4ed92a)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/f113afdc-af2a-44e7-b4b7-13cdd3802e6f)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/9e4b3120-3b06-4c9d-a148-0fdb10cdba5b)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/b0d18056-24ad-448e-ab13-45605d88479b)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/6cffd96c-356a-4bbc-8894-4f25a6ac3737)

Images generated by loading the model in diffusers format in Invoke. (Diffusers format has the trained betas in scheduler.json)

This is not like zero-frequency (offset) noise, which can make models darker. The issue with zero-frequency noise is that it forces a bias toward darker images across the whole model, and those dark images do not have much detail either. Zero terminal SNR appears to fully train both bright and dark, with no noticeable bias toward either. You can prompt for a pure white image just as well as you can prompt for a pure black image.

For this feature request, I am not smart enough to know the exact method to implement this, but there needs to be a setting that applies the zero terminal SNR betas for all 1000 timesteps at inference. I have attached a diffusers scheduler JSON with all the timesteps that would need to be used: scheduler_config.txt
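
For reference, a minimal sketch of reading those trained betas back out of a diffusers-format checkpoint (the model path is illustrative):

    from diffusers import DDIMScheduler

    # The scheduler config shipped with a diffusers-format model carries the
    # betas the model was trained with (None for stock schedules).
    scheduler = DDIMScheduler.from_pretrained("path/to/model", subfolder="scheduler")
    print(scheduler.config.trained_betas)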

One issue worth noting is that ZTSNR betas may cause problems with Karras samplers, but I believe that is a worthwhile trade-off for the quality and flexibility this can offer. If anyone can make an extension enabling these betas, or a PR for A1111, it would be greatly appreciated.

Here is an anime-themed diffusers-format ckpt with zero terminal SNR for any devs who want to test this: https://1drv.ms/u/s!AoMNcaD7FYzdsgPAeCcWaHFske-W?e=CaikZR

Proposed workflow

Read above

Additional information

No response

Hosiokaa commented 1 year ago

What about LoRAs trained with ZTSNR? Will we need to edit beta values or something too? sd-scripts currently allows you to train them. @catboxanon

catboxanon commented 1 year ago

I deleted the comment I made because there was some information I had wrong. Someone else more knowledgeable than me will need to look into this.

victorchall commented 1 year ago

The paper has a code snippet. After the beta schedule is calculated, it just needs to be run through this:

import torch

def enforce_zero_terminal_snr(betas):
    # Convert betas to alphas_bar_sqrt
    alphas = 1 - betas
    alphas_bar = alphas.cumprod(0)
    alphas_bar_sqrt = alphas_bar.sqrt()

    # Store old values.
    alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone()
    alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone()
    # Shift so the last timestep is zero.
    alphas_bar_sqrt -= alphas_bar_sqrt_T
    # Scale so the first timestep is back to its old value.
    alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T)

    # Convert alphas_bar_sqrt back to betas
    alphas_bar = alphas_bar_sqrt ** 2
    alphas = alphas_bar[1:] / alphas_bar[:-1]
    alphas = torch.cat([alphas_bar[0:1], alphas])
    betas = 1 - alphas
    return betas

From https://arxiv.org/pdf/2305.08891.pdf, page 3.
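
As a usage sketch (assuming the function above, and SD 1.x's scaled-linear beta endpoints):

    import torch

    # SD 1.x's "scaled linear" schedule: sqrt(beta) is evenly spaced
    # between sqrt(0.00085) and sqrt(0.012) over 1000 timesteps.
    betas = torch.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
    betas = enforce_zero_terminal_snr(betas)

    alphas_cumprod = (1 - betas).cumprod(0)
    print(alphas_cumprod[-1])  # tensor(0.): terminal SNR is now exactly zero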

Goldenkoron commented 1 year ago

I confirmed these betas do affect models not trained with ZTSNR, but I think the results lack detail and are possibly more fried without the special training.

Images

Normal SD 1.4
![ebc5d073-ad58-4c53-8dcd-47a0524238b8](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/77ec397f-01f8-4154-bb45-bd4680cb7dbf)
SD 1.4 with betas
![0f576988-5cb3-48a8-b725-52fd967d397e](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/e2a7f17e-8d42-4438-be19-0f78feaa052a)
Normal SD 1.4
![032e4e3f-1771-4671-9b71-407ab744cadd](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/06f1acde-ae4a-4221-aa4c-64b41cac7436)
SD 1.4 with betas
![7b87792a-162b-4321-a00c-c9d06c10052e](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/b634a945-7837-4672-ac19-c76dffcad650)

Goldenkoron commented 1 year ago

LoRAs with ZTSNR betas used on models NOT trained with ZTSNR do work!

Images

Vanilla model with normal betas on ZTSNR-false (control) lora
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/abfe6405-a908-4ac0-a0fb-dc61bbe7c368)
Vanilla model with ZTSNR betas on ZTSNR-false (control) lora (you can see how broken it is without ZTSNR training)
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/4a0f1f03-d6b8-416c-a5a9-cca3093ccd4d)
Vanilla model with normal betas on ZTSNR-true lora
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/9a6fa0d1-b0a4-45cd-b75b-f772b328cab3)
Vanilla model with ZTSNR betas on ZTSNR-true lora
![image](https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/112432266/0a1709f1-cff4-43aa-9d90-215ef04d721e)

Same seed and prompt for all images.

This means ZTSNR could benefit models fairly rapidly if lora makers finetune with it and the betas can be used in A1111.

woweenie commented 1 year ago

With my patch you can put given_betas: [] in the yaml under model.params, below the line use_ema: False, and fill it with the betas from the diffusion model. It should work like this:

    use_ema: False
    given_betas: [
    0.0008500218391418457,
    0.0009172558784484863,
    0.0009224414825439453,
    0.0009275078773498535,
    0.0009327530860900879,
    0.0009377002716064453,
    0.0009428262710571289,
    0.0009481310844421387,
    0.0009533166885375977,
    0.0009582042694091797,
    0.0009638071060180664,
    0.0009694695472717285,
    0.0009732842445373535,
    0.0009794235229492188,
    0.0009845495223999023,
    0.0009899139404296875,
    0.0009950995445251465,
    0.0010003447532653809,
    0.0010057687759399414,
    0.0010110735893249512,
    0.0010165572166442871,
    0.001021742820739746,
    0.001027226448059082,
    0.0010326504707336426,
    0.001037895679473877,
    0.0010434985160827637,
    0.0010486841201782227,
    0.0010541677474975586,
    0.0010597705841064453,
    0.0010653138160705566,
    0.0010707974433898926,
    0.0010761022567749023,
    0.0010816454887390137,
    0.0010873079299926758,
    0.0010932087898254395,
    0.0010977983474731445,
    0.0011042356491088867,
    0.0011095404624938965,
    0.0011151432991027832,
    0.0011214017868041992,
    0.001125931739807129,
    0.0011319518089294434,
    0.0011376738548278809,
    0.0011433959007263184,
    0.0011491179466247559,
    0.0011547207832336426,
    0.0011605024337768555,
    0.0011661648750305176,
    0.0011718273162841797,
    0.0011782050132751465,
    0.0011830329895019531,
    0.0011895298957824707,
    0.001195371150970459,
    0.0012010931968688965,
    0.0012068748474121094,
    0.0012128353118896484,
    0.0012180209159851074,
    0.0012247562408447266,
    0.0012300610542297363,
    0.0012364387512207031,
    0.0012422800064086914,
    0.001248002052307129,
    0.001254260540008545,
    0.0012597441673278809,
    0.0012664794921875,
    0.0012715458869934082,
    0.001278221607208252,
    0.0012841224670410156,
    0.0012905597686767578,
    0.001295924186706543,
    0.0013020038604736328,
    0.0013085007667541504,
    0.001314401626586914,
    0.00132066011428833,
    0.0013267993927001953,
    0.0013331770896911621,
    0.0013387799263000488,
    0.0013451576232910156,
    0.0013514161109924316,
    0.0013576745986938477,
    0.0013638138771057129,
    0.0013698935508728027,
    0.0013767480850219727,
    0.0013823509216308594,
    0.001388847827911377,
    0.0013956427574157715,
    0.0014013051986694336,
    0.001407921314239502,
    0.0014140009880065918,
    0.0014207959175109863,
    0.0014268159866333008,
    0.001433253288269043,
    0.001439809799194336,
    0.0014462471008300781,
    0.0014527440071105957,
    0.0014590620994567871,
    0.0014656782150268555,
    0.0014722943305969238,
    0.0014782547950744629,
    0.0014848113059997559,
    0.001491248607635498,
    0.0014981627464294434,
    0.0015046000480651855,
    0.0015107989311218262,
    0.001517653465270996,
    0.0015247464179992676,
    0.0015308856964111328,
    0.0015375018119812012,
    0.0015448331832885742,
    0.0015504956245422363,
    0.0015575885772705078,
    0.0015639662742614746,
    0.0015714168548583984,
    0.001577615737915039,
    0.0015848875045776367,
    0.0015906095504760742,
    0.0015984773635864258,
    0.0016042590141296387,
    0.0016115307807922363,
    0.0016184449195861816,
    0.0016255378723144531,
    0.00163191556930542,
    0.0016388297080993652,
    0.0016461014747619629,
    0.0016530156135559082,
    0.0016592741012573242,
    0.0016667842864990234,
    0.0016731023788452148,
    0.0016801953315734863,
    0.0016875863075256348,
    0.0016944408416748047,
    0.0017009377479553223,
    0.0017086267471313477,
    0.001715242862701416,
    0.001722574234008789,
    0.0017295479774475098,
    0.0017365217208862305,
    0.0017437338829040527,
    0.0017514824867248535,
    0.0017573237419128418,
    0.0017647147178649902,
    0.0017722249031066895,
    0.0017793774604797363,
    0.00178605318069458,
    0.001793503761291504,
    0.00180131196975708,
    0.0018086433410644531,
    0.0018146038055419922,
    0.0018224120140075684,
    0.0018298625946044922,
    0.0018370747566223145,
    0.001844644546508789,
    0.0018523931503295898,
    0.0018584728240966797,
    0.001866459846496582,
    0.001874089241027832,
    0.0018810033798217773,
    0.0018887519836425781,
    0.0018960833549499512,
    0.0019035935401916504,
    0.0019109845161437988,
    0.0019182562828063965,
    0.001926124095916748,
    0.001933276653289795,
    0.0019407868385314941,
    0.0019481778144836426,
    0.0019559860229492188,
    0.0019639134407043457,
    0.0019705891609191895,
    0.001978754997253418,
    0.0019863247871398926,
    0.001993715763092041,
    0.0020017027854919434,
    0.0020092129707336426,
    0.0020166635513305664,
    0.002024233341217041,
    0.0020323991775512695,
    0.0020402073860168457,
    0.002047717571258545,
    0.0020551681518554688,
    0.0020636916160583496,
    0.0020705461502075195,
    0.0020786523818969727,
    0.0020866990089416504,
    0.0020949840545654297,
    0.0021019577980041504,
    0.002109825611114502,
    0.002118349075317383,
    0.002125859260559082,
    0.0021338462829589844,
    0.002141714096069336,
    0.002149641513824463,
    0.00215756893157959,
    0.002165675163269043,
    0.0021734237670898438,
    0.002181410789489746,
    0.0021899938583374023,
    0.0021973848342895508,
    0.0022058486938476562,
    0.0022140145301818848,
    0.002221822738647461,
    0.0022298693656921387,
    0.0022379159927368164,
    0.002246379852294922,
    0.002254486083984375,
    0.002262711524963379,
    0.0022707581520080566,
    0.002279043197631836,
    0.0022867918014526367,
    0.0022953152656555176,
    0.0023038387298583984,
    0.0023116469383239746,
    0.0023201704025268555,
    0.0023288726806640625,
    0.002336740493774414,
    0.002345263957977295,
    0.0023533105850219727,
    0.002361774444580078,
    0.0023704171180725098,
    0.0023787617683410645,
    0.0023868680000305176,
    0.0023956298828125,
    0.0024040937423706055,
    0.002412557601928711,
    0.002421081066131592,
    0.002429187297821045,
    0.0024385452270507812,
    0.002446293830871582,
    0.002455413341522217,
    0.0024638772010803223,
    0.002472221851348877,
    0.002480149269104004,
    0.0024901628494262695,
    0.002497553825378418,
    0.0025069117546081543,
    0.002515852451324463,
    0.0025241971015930176,
    0.002532780170440674,
    0.0025412440299987793,
    0.002550184726715088,
    0.002559483051300049,
    0.0025669336318969727,
    0.00257718563079834,
    0.002585172653198242,
    0.0025940537452697754,
    0.0026030540466308594,
    0.002611994743347168,
    0.0026208162307739258,
    0.0026292800903320312,
    0.0026383399963378906,
    0.0026478171348571777,
    0.002656400203704834,
    0.002665221691131592,
    0.002674877643585205,
    0.002682983875274658,
    0.0026923418045043945,
    0.002702176570892334,
    0.0027101635932922363,
    0.002719700336456299,
    0.0027282238006591797,
    0.002738475799560547,
    0.002746284008026123,
    0.0027570724487304688,
    0.002764284610748291,
    0.002774953842163086,
    0.002782881259918213,
    0.0027927756309509277,
    0.0028020739555358887,
    0.002811431884765625,
    0.0028204917907714844,
    0.002829909324645996,
    0.0028390884399414062,
    0.0028487443923950195,
    0.0028580427169799805,
    0.002867281436920166,
    0.0028753280639648438,
    0.0028862953186035156,
    0.0028952360153198242,
    0.0029050111770629883,
    0.0029141902923583984,
    0.0029234886169433594,
    0.0029335618019104004,
    0.0029422640800476074,
    0.0029524564743041992,
    0.0029615163803100586,
    0.0029709935188293457,
    0.002980530261993408,
    0.0029913783073425293,
    0.002999544143676758,
    0.003009796142578125,
    0.0030192136764526367,
    0.0030287504196166992,
    0.003038942813873291,
    0.003047943115234375,
    0.0030582547187805176,
    0.0030676722526550293,
    0.003077208995819092,
    0.003087341785430908,
    0.003097712993621826,
    0.003106415271759033,
    0.0031170248985290527,
    0.003126680850982666,
    0.00313645601272583,
    0.003145933151245117,
    0.0031561851501464844,
    0.0031662583351135254,
    0.0031759142875671387,
    0.0031861066818237305,
    0.0031962990760803223,
    0.0032057762145996094,
    0.0032159090042114258,
    0.003226041793823242,
    0.0032364726066589355,
    0.003246307373046875,
    0.003256380558013916,
    0.0032666921615600586,
    0.003276348114013672,
    0.0032868385314941406,
    0.0032967329025268555,
    0.0033075809478759766,
    0.0033178329467773438,
    0.003326892852783203,
    0.0033375024795532227,
    0.0033484697341918945,
    0.0033582448959350586,
    0.003368675708770752,
    0.003379344940185547,
    0.0033890604972839355,
    0.003399789333343506,
    0.0034102201461791992,
    0.003420889377593994,
    0.0034308433532714844,
    0.0034407973289489746,
    0.0034522414207458496,
    0.003462553024291992,
    0.0034732818603515625,
    0.0034839510917663574,
    0.003493785858154297,
    0.003504455089569092,
    0.0035154223442077637,
    0.0035257339477539062,
    0.0035365819931030273,
    0.00354689359664917,
    0.0035581588745117188,
    0.003568410873413086,
    0.0035791993141174316,
    0.0035900473594665527,
    0.003600478172302246,
    0.003611743450164795,
    0.0036221742630004883,
    0.0036329030990600586,
    0.003644227981567383,
    0.003654301166534424,
    0.0036659836769104004,
    0.0036767125129699707,
    0.0036875009536743164,
    0.0036984682083129883,
    0.0037094950675964355,
    0.003720581531524658,
    0.003731369972229004,
    0.0037425756454467773,
    0.0037537813186645508,
    0.003764629364013672,
    0.003775477409362793,
    0.003787100315093994,
    0.0037979483604431152,
    0.003809034824371338,
    0.00382077693939209,
    0.0038314461708068848,
    0.003842592239379883,
    0.003854036331176758,
    0.003865063190460205,
    0.003876924514770508,
    0.0038881301879882812,
    0.0038990378379821777,
    0.003910362720489502,
    0.003922224044799805,
    0.003933310508728027,
    0.003944575786590576,
    0.003956496715545654,
    0.003967881202697754,
    0.003979325294494629,
    0.003989994525909424,
    0.004002988338470459,
    0.004013717174530029,
    0.004025161266326904,
    0.004037261009216309,
    0.004048764705657959,
    0.004060089588165283,
    0.004072129726409912,
    0.004083752632141113,
    0.004095613956451416,
    0.004106879234313965,
    0.0041190385818481445,
    0.004130899906158447,
    0.004142820835113525,
    0.004154026508331299,
    0.0041658878326416016,
    0.004178524017333984,
    0.004190385341644287,
    0.004201412200927734,
    0.004214227199554443,
    0.004225671291351318,
    0.004237830638885498,
    0.004249870777130127,
    0.004262387752532959,
    0.00427401065826416,
    0.004286408424377441,
    0.004298031330108643,
    0.004310488700866699,
    0.004322469234466553,
    0.004334747791290283,
    0.004347085952758789,
    0.004359126091003418,
    0.004372000694274902,
    0.004383862018585205,
    0.004396498203277588,
    0.00440824031829834,
    0.004420816898345947,
    0.004433393478393555,
    0.0044463276863098145,
    0.00445789098739624,
    0.004470348358154297,
    0.004483222961425781,
    0.0044956207275390625,
    0.004508674144744873,
    0.004520416259765625,
    0.0045334696769714355,
    0.004546165466308594,
    0.004558980464935303,
    0.004571378231048584,
    0.004584074020385742,
    0.004596889019012451,
    0.004609465599060059,
    0.004622280597686768,
    0.004635155200958252,
    0.004648387432098389,
    0.004660665988922119,
    0.004674375057220459,
    0.0046866536140441895,
    0.004699885845184326,
    0.00471264123916626,
    0.004725635051727295,
    0.004738450050354004,
    0.004751682281494141,
    0.004765212535858154,
    0.00477832555770874,
    0.004790604114532471,
    0.004804670810699463,
    0.004817306995391846,
    0.004831194877624512,
    0.004844725131988525,
    0.0048563480377197266,
    0.004870593547821045,
    0.00488436222076416,
    0.004896759986877441,
    0.004910528659820557,
    0.004924654960632324,
    0.0049370527267456055,
    0.004951179027557373,
    0.004964709281921387,
    0.004978299140930176,
    0.004991710186004639,
    0.0050054192543029785,
    0.005018711090087891,
    0.005032896995544434,
    0.0050466060638427734,
    0.005059182643890381,
    0.005074262619018555,
    0.005088448524475098,
    0.0051003098487854,
    0.005115389823913574,
    0.005129396915435791,
    0.005142867565155029,
    0.0051569342613220215,
    0.005171000957489014,
    0.005184590816497803,
    0.005199551582336426,
    0.005212724208831787,
    0.005227208137512207,
    0.0052411556243896484,
    0.005255460739135742,
    0.005268871784210205,
    0.0052841901779174805,
    0.005298614501953125,
    0.005312025547027588,
    0.005326569080352783,
    0.00534135103225708,
    0.005354821681976318,
    0.005368828773498535,
    0.005385160446166992,
    0.005398690700531006,
    0.005412280559539795,
    0.005429387092590332,
    0.00544130802154541,
    0.005456745624542236,
    0.005471646785736084,
    0.005486667156219482,
    0.005501151084899902,
    0.005515575408935547,
    0.005530834197998047,
    0.005545318126678467,
    0.005560398101806641,
    0.005574941635131836,
    0.00559002161026001,
    0.005605459213256836,
    0.0056198835372924805,
    0.0056348443031311035,
    0.0056501030921936035,
    0.0056656599044799805,
    0.0056806206703186035,
    0.005695760250091553,
    0.0057108402252197266,
    0.005725979804992676,
    0.005741417407989502,
    0.0057569146156311035,
    0.00577235221862793,
    0.005787253379821777,
    0.005802810192108154,
    0.005818724632263184,
    0.005833566188812256,
    0.005849182605743408,
    0.005864977836608887,
    0.005881130695343018,
    0.005896508693695068,
    0.005912482738494873,
    0.005927324295043945,
    0.005943775177001953,
    0.005959808826446533,
    0.005974948406219482,
    0.0059909820556640625,
    0.006007552146911621,
    0.006022989749908447,
    0.00603938102722168,
    0.006054997444152832,
    0.0060713887214660645,
    0.006087839603424072,
    0.006103694438934326,
    0.006119847297668457,
    0.00613635778427124,
    0.006152689456939697,
    0.006168782711029053,
    0.006185770034790039,
    0.006201505661010742,
    0.006218373775482178,
    0.006235182285308838,
    0.006251215934753418,
    0.006268501281738281,
    0.006284773349761963,
    0.006301164627075195,
    0.0063179731369018555,
    0.00633549690246582,
    0.0063517093658447266,
    0.0063689351081848145,
    0.006385624408721924,
    0.0064026713371276855,
    0.006419658660888672,
    0.006437480449676514,
    0.0064542293548583984,
    0.006470918655395508,
    0.006488621234893799,
    0.006505727767944336,
    0.0065225958824157715,
    0.006540477275848389,
    0.006558060646057129,
    0.006575345993041992,
    0.006593048572540283,
    0.006609857082366943,
    0.006628274917602539,
    0.0066457390785217285,
    0.006663620471954346,
    0.00668102502822876,
    0.006698668003082275,
    0.0067171454429626465,
    0.006734728813171387,
    0.0067525506019592285,
    0.006770789623260498,
    0.00678938627243042,
    0.006806790828704834,
    0.006824612617492676,
    0.0068438053131103516,
    0.006861627101898193,
    0.006880223751068115,
    0.0068985819816589355,
    0.006916463375091553,
    0.006935298442840576,
    0.006954014301300049,
    0.006972789764404297,
    0.006990969181060791,
    0.007010281085968018,
    0.0070288777351379395,
    0.007047832012176514,
    0.007066965103149414,
    0.007085263729095459,
    0.007104635238647461,
    0.0071242451667785645,
    0.007142901420593262,
    0.007162034511566162,
    0.007181107997894287,
    0.007200479507446289,
    0.007220864295959473,
    0.0072386860847473145,
    0.007259845733642578,
    0.007278621196746826,
    0.007298290729522705,
    0.007318437099456787,
    0.007337510585784912,
    0.007357895374298096,
    0.007377743721008301,
    0.007397353649139404,
    0.007417917251586914,
    0.007437765598297119,
    0.007457613945007324,
    0.007478535175323486,
    0.007498264312744141,
    0.007518768310546875,
    0.007539212703704834,
    0.007559359073638916,
    0.007580697536468506,
    0.00760120153427124,
    0.0076212286949157715,
    0.0076427459716796875,
    0.007663369178771973,
    0.007684588432312012,
    0.007705092430114746,
    0.00772625207901001,
    0.0077474117279052734,
    0.007768690586090088,
    0.007789969444274902,
    0.007811844348907471,
    0.007832884788513184,
    0.00785452127456665,
    0.007876038551330566,
    0.007897257804870605,
    0.00792008638381958,
    0.00794130563735962,
    0.007962942123413086,
    0.007985591888427734,
    0.00800710916519165,
    0.008029937744140625,
    0.008051633834838867,
    0.00807410478591919,
    0.008097052574157715,
    0.008118867874145508,
    0.008141577243804932,
    0.008164584636688232,
    0.008187174797058105,
    0.00821012258529663,
    0.008232593536376953,
    0.008256018161773682,
    0.008279025554656982,
    0.008302688598632812,
    0.00832509994506836,
    0.00834888219833374,
    0.008373141288757324,
    0.008396506309509277,
    0.008419632911682129,
    0.008443653583526611,
    0.00846719741821289,
    0.008491933345794678,
    0.00851505994796753,
    0.008539736270904541,
    0.008563697338104248,
    0.00858837366104126,
    0.00861281156539917,
    0.008637487888336182,
    0.008662581443786621,
    0.00868690013885498,
    0.008710741996765137,
    0.008737146854400635,
    0.008761823177337646,
    0.00878608226776123,
    0.008812248706817627,
    0.008837699890136719,
    0.008862912654876709,
    0.008888483047485352,
    0.008915126323699951,
    0.00894021987915039,
    0.008966386318206787,
    0.008992254734039307,
    0.009018003940582275,
    0.009044826030731201,
    0.00907135009765625,
    0.009098708629608154,
    0.009124755859375,
    0.009151577949523926,
    0.0091782808303833,
    0.009204745292663574,
    0.009232401847839355,
    0.009259521961212158,
    0.009286999702453613,
    0.009315252304077148,
    0.009342372417449951,
    0.009370207786560059,
    0.009398221969604492,
    0.009427011013031006,
    0.009454667568206787,
    0.009483277797698975,
    0.009511351585388184,
    0.009540379047393799,
    0.009568989276885986,
    0.009597718715667725,
    0.00962686538696289,
    0.009656369686126709,
    0.009685277938842773,
    0.009715139865875244,
    0.009745001792907715,
    0.009774446487426758,
    0.009804785251617432,
    0.00983428955078125,
    0.009865403175354004,
    0.009894788265228271,
    0.009926080703735352,
    0.009956538677215576,
    0.009987831115722656,
    0.010018587112426758,
    0.010049521923065186,
    0.010081470012664795,
    0.010113120079040527,
    0.010144650936126709,
    0.01017671823501587,
    0.010208725929260254,
    0.010240912437438965,
    0.010273337364196777,
    0.010305941104888916,
    0.010338902473449707,
    0.01037222146987915,
    0.010405123233795166,
    0.010438680648803711,
    0.010471940040588379,
    0.010505259037017822,
    0.010539770126342773,
    0.010574400424957275,
    0.010608792304992676,
    0.010642647743225098,
    0.010677158832550049,
    0.01071244478225708,
    0.010748088359832764,
    0.010783255100250244,
    0.010819315910339355,
    0.01085442304611206,
    0.01089024543762207,
    0.010927796363830566,
    0.010962724685668945,
    0.011000216007232666,
    0.011038064956665039,
    0.011074244976043701,
    0.011111319065093994,
    0.01114964485168457,
    0.011187493801116943,
    0.011225223541259766,
    0.011263728141784668,
    0.011302947998046875,
    0.011341273784637451,
    0.011380553245544434,
    0.011419475078582764,
    0.011459946632385254,
    0.011499464511871338,
    0.0115395188331604,
    0.011580109596252441,
    0.01162099838256836,
    0.011662125587463379,
    0.011703431606292725,
    0.011745035648345947,
    0.01178652048110962,
    0.011829257011413574,
    0.011871755123138428,
    0.011914312839508057,
    0.011956989765167236,
    0.012000203132629395,
    0.01204448938369751,
    0.012088418006896973,
    0.012132704257965088,
    0.012177467346191406,
    0.01222217082977295,
    0.012267529964447021,
    0.012313604354858398,
    0.012359797954559326,
    0.012406349182128906,
    0.01245194673538208,
    0.012499570846557617,
    0.012546956539154053,
    0.012594819068908691,
    0.01264333724975586,
    0.012691140174865723,
    0.01274120807647705,
    0.012790083885192871,
    0.012840092182159424,
    0.012889444828033447,
    0.012940168380737305,
    0.012990474700927734,
    0.013043224811553955,
    0.013093650341033936,
    0.013147056102752686,
    0.013199448585510254,
    0.013252317905426025,
    0.013305604457855225,
    0.01336050033569336,
    0.013413965702056885,
    0.0134696364402771,
    0.013524651527404785,
    0.013580501079559326,
    0.013637065887451172,
    0.01369398832321167,
    0.013750135898590088,
    0.01381009817123413,
    0.013867855072021484,
    0.013925552368164062,
    0.01398611068725586,
    0.01404726505279541,
    0.014106512069702148,
    0.01416844129562378,
    0.014230132102966309,
    0.014291882514953613,
    0.014355003833770752,
    0.014418423175811768,
    0.014482855796813965,
    0.014547407627105713,
    0.014613032341003418,
    0.014678895473480225,
    0.014745771884918213,
    0.014813363552093506,
    0.0148811936378479,
    0.014949977397918701,
    0.015019059181213379,
    0.01509004831314087,
    0.015159666538238525,
    0.015232086181640625,
    0.015304505825042725,
    0.015378057956695557,
    0.015451312065124512,
    0.015526175498962402,
    0.015601634979248047,
    0.01567775011062622,
    0.01575523614883423,
    0.01583343744277954,
    0.015912294387817383,
    0.015991806983947754,
    0.016072750091552734,
    0.01615375280380249,
    0.016236424446105957,
    0.016320526599884033,
    0.01640462875366211,
    0.016490697860717773,
    0.01657634973526001,
    0.016664445400238037,
    0.016752302646636963,
    0.016841769218444824,
    0.016933083534240723,
    0.017024636268615723,
    0.017117559909820557,
    0.017211318016052246,
    0.01730632781982422,
    0.01740407943725586,
    0.017500758171081543,
    0.01759999990463257,
    0.017700016498565674,
    0.01780170202255249,
    0.01790463924407959,
    0.018008649349212646,
    0.01811450719833374,
    0.018221616744995117,
    0.018330156803131104,
    0.018439829349517822,
    0.018551111221313477,
    0.01866447925567627,
    0.018779873847961426,
    0.018895328044891357,
    0.01901310682296753,
    0.019132792949676514,
    0.01925438642501831,
    0.01937699317932129,
    0.019502639770507812,
    0.019628465175628662,
    0.019758403301239014,
    0.019888579845428467,
    0.020021378993988037,
    0.020156264305114746,
    0.020292818546295166,
    0.02043241262435913,
    0.020572364330291748,
    0.020717382431030273,
    0.02086162567138672,
    0.02101057767868042,
    0.02116149663925171,
    0.021313369274139404,
    0.021470367908477783,
    0.021628201007843018,
    0.02178889513015747,
    0.021952688694000244,
    0.022119104862213135,
    0.022289156913757324,
    0.02246195077896118,
    0.022637546062469482,
    0.022816240787506104,
    0.022998690605163574,
    0.02318429946899414,
    0.023372888565063477,
    0.023565948009490967,
    0.023761868476867676,
    0.023962557315826416,
    0.0241660475730896,
    0.02437424659729004,
    0.024585843086242676,
    0.0248032808303833,
    0.02502375841140747,
    0.02524888515472412,
    0.02547931671142578,
    0.025714099407196045,
    0.025953948497772217,
    0.026199281215667725,
    0.026450157165527344,
    0.02670520544052124,
    0.026968002319335938,
    0.027235746383666992,
    0.027510106563568115,
    0.02778911590576172,
    0.0280764102935791,
    0.02837008237838745,
    0.02867037057876587,
    0.028979241847991943,
    0.02929389476776123,
    0.029617905616760254,
    0.029949843883514404,
    0.03029024600982666,
    0.030638813972473145,
    0.03099769353866577,
    0.03136587142944336,
    0.03174322843551636,
    0.03213316202163696,
    0.03253120183944702,
    0.03294253349304199,
    0.03336459398269653,
    0.03379952907562256,
    0.03424781560897827,
    0.03470879793167114,
    0.03518456220626831,
    0.035674870014190674,
    0.036179959774017334,
    0.036703407764434814,
    0.037241995334625244,
    0.03779888153076172,
    0.03837549686431885,
    0.03897124528884888,
    0.03958803415298462,
    0.040227532386779785,
    0.04088914394378662,
    0.04157602787017822,
    0.042289137840270996,
    0.043029189109802246,
    0.04379844665527344,
    0.044599175453186035,
    0.045431435108184814,
    0.046300411224365234,
    0.04720449447631836,
    0.04814964532852173,
    0.04913598299026489,
    0.05016881227493286,
    0.05124872922897339,
    0.05238139629364014,
    0.053569138050079346,
    0.0548173189163208,
    0.05612897872924805,
    0.05751073360443115,
    0.058969080448150635,
    0.06050795316696167,
    0.06213653087615967,
    0.06386172771453857,
    0.06569379568099976,
    0.06763899326324463,
    0.06971514225006104,
    0.07192808389663696,
    0.074299156665802,
    0.07683998346328735,
    0.07957100868225098,
    0.08251869678497314,
    0.08570599555969238,
    0.08916068077087402,
    0.09292548894882202,
    0.09703713655471802,
    0.1015504002571106,
    0.10652261972427368,
    0.11203312873840332,
    0.11817121505737305,
    0.1250494122505188,
    0.13281410932540894,
    0.14164000749588013,
    0.15177381038665771,
    0.16351544857025146,
    0.17729026079177856,
    0.19366109371185303,
    0.21344870328903198,
    0.23784351348876953,
    0.2686365246772766,
    0.30870169401168823,
    0.3629082441329956,
    0.4400535225868225,
    0.5575681924819946,
    0.7511318325996399,
    1.0
  ]
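
For anyone generating such a list instead of pasting it by hand, a sketch (assuming the enforce_zero_terminal_snr function from victorchall's comment above):

    import torch

    # Rescale SD's stock scaled-linear schedule and print it in given_betas form.
    betas = torch.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
    betas = enforce_zero_terminal_snr(betas)
    print("given_betas: [\n" + ",\n".join("    " + str(b.item()) for b in betas) + "\n  ]")
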
catboxanon commented 1 year ago

Reformatted your comment. Though your PR is fine, I'm not sure it resolves this entirely. I implemented something similar locally a few weeks back, and though it did affect the output, prompting for something like solid black color or solid white color still generated something middle grey, when I would expect the opposite if it were actually working as intended.

Goldenkoron commented 1 year ago

> Reformatted your comment. Though your PR is fine, I'm not sure it resolves this entirely. I implemented something similar locally a few weeks back, and prompting for something like solid black color or solid white color still generated something middle grey, when I would expect the opposite if it were actually working as intended.

I think it might have some issues working. When I tested it, I first needed to add an import because ldm was not defined, and I also got this error when loading any models:

AttributeError: 'Options' object has no attribute 'sd_checkpoints_limit'

changing setting sd_model_checkpoint to divineanimemix_1.safetensors [ea848e19a9]: AttributeError
Traceback (most recent call last):
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\modules\shared.py", line 633, in set
    self.data_labels[key].onchange()
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\modules\call_queue.py", line 14, in f
    res = func(*args, **kwargs)
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\webui.py", line 238, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: modules.sd_models.reload_model_weights()), call=False)
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\modules\sd_models.py", line 746, in reload_model_weights
    sd_model = reuse_model_from_already_loaded(sd_model, checkpoint_info, timer)
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\modules\sd_models.py", line 711, in reuse_model_from_already_loaded
    elif shared.opts.sd_checkpoints_limit > 1 and len(model_data.loaded_sd_models) < shared.opts.sd_checkpoints_limit:
  File "C:\Users\User\Desktop\SD\Automatic\stable-diffusion-webui\modules\shared.py", line 617, in __getattr__
    return super(Options, self).__getattribute__(item)
AttributeError: 'Options' object has no attribute 'sd_checkpoints_limit'
catboxanon commented 1 year ago

> LoRAs with ZTSNR betas used on models NOT trained with ZTSNR do work!

@Goldenkoron There is an oversight you're likely making with this. It's been shown that if you have more than one LoRA in use that was trained with offset noise, the offset-noise effect ends up being "stacked", making the image much darker/lighter than it should be. I would assume a LoRA trained with ZTSNR has the same issue, so unless that's proven not to be the case, I would really suggest not doing so.

catboxanon commented 1 year ago

> AttributeError: 'Options' object has no attribute 'sd_checkpoints_limit'

That error is unrelated to the PR. Double check you've saved your settings first. I believe this was fixed on the dev branch too.

Goldenkoron commented 1 year ago

> LoRAs with ZTSNR betas used on models NOT trained with ZTSNR do work!

> @Goldenkoron There is also an oversight you're likely making with this. It's been shown that if you have more than one LoRA in use that was trained with offset noise, the offset-noise effect ends up being "stacked", making the image much darker/lighter than it should be. I would assume a LoRA trained with ZTSNR has the same issue, so unless that's proven wrong I would really suggest not doing so.

I hardly ever use loras, so I'm sure there could be unforeseen issues with them.

Goldenkoron commented 1 year ago

> AttributeError: 'Options' object has no attribute 'sd_checkpoints_limit'

> That error is unrelated to the PR. Double check you've saved your settings first. I believe this was fixed on the dev branch too.

Okay, I fixed that. Using this yaml (converted to .txt for uploading here), the model just produces NaN images. I think I have it formatted right, but maybe I am mistaken.

MFB_14_SNR_v001-ep13-gs229294.txt

woweenie commented 1 year ago

> Reformatted your comment. Though your PR is fine, I'm not sure it resolves this entirely. I implemented something similar locally a few weeks back, and though it did affect the output, prompting for something like solid black color or solid white color still generated something middle grey, when I would expect the opposite if it were actually working as intended.

Yes, you will need to use rescale_cfg to get the full effect.
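
For those following along, the rescale described in the same paper looks roughly like this (a sketch in torch; pos/neg are the conditional and unconditional model outputs, and 0.7 is the rescale factor the paper suggests):

    def rescale_cfg(pos, neg, guidance_scale, phi=0.7):
        # Standard classifier-free guidance.
        cfg = neg + guidance_scale * (pos - neg)
        # Rescale the guided prediction back toward the conditional
        # prediction's per-sample std, then blend to avoid over-flattening.
        std_pos = pos.std(dim=(1, 2, 3), keepdim=True)
        std_cfg = cfg.std(dim=(1, 2, 3), keepdim=True)
        rescaled = cfg * (std_pos / std_cfg)
        return phi * rescaled + (1 - phi) * cfg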

woweenie commented 1 year ago

> > AttributeError: 'Options' object has no attribute 'sd_checkpoints_limit'
> >
> > That error is unrelated to the PR. Double check you've saved your settings first. I believe this was fixed on the dev branch too.
>
> Okay, I fixed that. Using this yaml (converted to .txt for uploading here), the model just produces NaN images. I think I have it formatted right, but maybe I am mistaken.
>
> MFB_14_SNR_v001-ep13-gs229294.txt

Did you try different samplers? I think Karras may not work.

catboxanon commented 1 year ago

> Yes, you will need to use rescale_cfg to get the full effect.

Ah, right. For those following this thread, both of these extensions implement that; either one can be used.
https://github.com/ljleb/sd-webui-neutral-prompt
https://github.com/Seshelle/CFG_Rescale_webui

catboxanon commented 1 year ago

I get the same issue, with generations instantly failing with NaNs as well. The sampler doesn't matter. Note this only happens with the special .yaml file; the one the webui defaults to is fine. There are some warnings in the console while loading the model too.

Q:\AI\git\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py:164: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recip_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod)))
Q:\AI\git\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py:165: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recipm1_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod - 1)))

My .yaml file looks like this: last-MIA-mema+45-ep10-gs14616.txt

It would be helpful if you could share your .yaml file and upload an example image with embedded metadata/infotext so we can verify if results can be replicated. @woweenie

catboxanon commented 1 year ago

I realized something else that would be useful: uploading the diffusers model shared in the OP as the CompVis version I converted, so it can be loaded into the webui. I'll edit this comment when it finishes uploading. Those who would like to convert it themselves can use this script (the same one I used): https://gist.github.com/jachiam/8a5c0b607e38fcc585168b90c686eb05

Edit: https://pixeldrain.com/u/iH1JcRKZ

woweenie commented 1 year ago

I only tried DDIM, and that doesn't produce the NaNs at inference. I debugged the other samplers, and there's a problem with k-diffusion: it always tries to get max_sigma, but that is inf if the last beta is 1. If you change the last beta to 0.999999 instead of 1, it doesn't give the NaN warnings or crash.
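
A quick sketch of why: k-diffusion derives its sigmas from alphas_cumprod, and a terminal alpha-bar of zero maps to an infinite sigma:

    import torch

    # sigma = sqrt((1 - alpha_bar) / alpha_bar); illustrative values,
    # with the last entry zero as under a ZTSNR schedule.
    alphas_cumprod = torch.tensor([0.9, 0.5, 0.1, 0.0])
    print(((1 - alphas_cumprod) / alphas_cumprod) ** 0.5)
    # tensor([0.3333, 1.0000, 3.0000, inf]) -- sigma_max is inf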

My yaml is the same as yours, catboxanon. I'm downloading your model and will try it when it's done.

Goldenkoron commented 1 year ago

> I only tried DDIM, and that doesn't produce the NaNs at inference. I debugged the other samplers, and there's a problem with k-diffusion: it always tries to get max_sigma, but that is inf if the last beta is 1. If you change the last beta to 0.999999 instead of 1, it doesn't give the NaN warnings or crash.
>
> My yaml is the same as yours, catboxanon. I'm downloading your model and will try it when it's done.

Using catbox's yaml, I still get NaNs on every sampler I try, and completely white NaN images with DDIM.

EDIT: It seems messy. On DDIM, my more-trained model (about 3.6 million images seen in training) is always pure white, but my less-trained ZTSNR model sometimes produces something, though still with a bright white background. Currently nothing like how inference looks in diffusers.

Non-DDIM samplers won't do anything at all.

woweenie commented 1 year ago

I reset my code and tried it again with your model, @catboxanon, and now it does not work for me either. I think there are a lot of places where the code has optimizations that only work if there is never a 1 or a 0 in the alphas or betas.

drhead commented 1 year ago

I figured out a solution to the k-diffusion sampler issue, based on how ComfyUI handles the same problem.

First, you need to set the last timestep close to zero but not exactly zero, by putting this wherever your beta rescale function is (so far I don't know where to do this other than by editing the LDM repo directly):

alphas_bar[-1] = 4.8973451890853435e-08
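
For context, one way that line can slot into the tail of the enforce_zero_terminal_snr function posted earlier (a sketch; the constant is the one quoted above):

    # ...after shifting and scaling alphas_bar_sqrt:
    alphas_bar = alphas_bar_sqrt ** 2
    alphas_bar[-1] = 4.8973451890853435e-08  # near-zero instead of exactly zero
    alphas = alphas_bar[1:] / alphas_bar[:-1]
    alphas = torch.cat([alphas_bar[0:1], alphas])
    betas = 1 - alphas
    return betas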

Second, you need to stop the webui from downcasting alphas_cumprod along with the rest of the model. There's no reason it should ever have been downcast to fp16 in the first place, because it immediately gets cast back to fp32 when a denoiser is created from it. Based on my testing, this will change seeds even if you're not using the ZSNR schedule, but the current behavior of downcasting alphas_cumprod only to upcast it when it is actually used is clearly incorrect. I used the same method that stops VAE downcasting.

I applied this change in modules/sd_models.py:load_model_weights():

        # Detach alphas_cumprod so model.half() skips it, then restore it in fp32.
        alphas_cumprod = model.alphas_cumprod
        model.alphas_cumprod = None

        model.half()
        model.alphas_cumprod = alphas_cumprod

I'm not sure where this is called, and it made no difference in my testing, but I also made this change in load_model() in the same file, just to be safe:

    if shared.cmd_opts.no_half:
        weight_dtype_conversion = None
    else:
        weight_dtype_conversion = {
            'first_stage_model': None,   # keep the VAE in fp32
            'alphas_cumprod': None,      # keep alphas_cumprod in fp32
            '': torch.float16,           # downcast everything else
        }

After this, all samplers will work fine on the zero SNR schedule.

Implementing this cleanly and optionally may be a challenge without requiring a command-line flag. Most likely the alphas_cumprod values should be prevented from downcasting and stored, then changed right before sampling (either downcast and upcast again for compatibility mode, or rescaled for ZSNR mode).

If anyone needs a clean model for testing zero terminal SNR, I have one: https://huggingface.co/drhead/ZeroDiffusion

catboxanon commented 1 year ago

@drhead I'd suggest making a PR for the changes you propose, because it's likely not to be looked at otherwise. Auto does not monitor issues. Even if it's in a draft stage, it's better than nothing.