Interpause / auto-sd-krita

AUTOMATIC1111 webUI + Krita Plugin with superb Inpainting
MIT License
88 stars, 4 forks

Krita crashes when triggering second img2img generation #33

Closed ahjulstad closed 2 years ago

ahjulstad commented 2 years ago

Describe the bug
Krita crashes when triggering second img2img generation

To Reproduce
Steps to reproduce the behavior:
1. Load image in Krita, do first img2img generation
2. Start second img2img generation
3. Krita dies. No output in Krita console or server console

Expected behavior
New layer added.

Desktop (please complete the following information):

Additional context
I think it worked OK using latest commit as of October 13th.

ahjulstad commented 2 years ago

Just tested with commit 0192d440881d4d330e401b03c1491ad74c113e0b, works OK.

Interpause commented 2 years ago

I am unable to reproduce this. Did you notice if this issue occurs only once in a while, or consistently occurs?

It is very likely related to https://github.com/Interpause/auto-sd-krita/issues/31. I notice it will freeze even when the backend is local, though it seems quite rare.

ahjulstad commented 2 years ago

With the mentioned hash it happens consistently on the second invocation of img2img in the Krita session. I have not observed any errors in the server, and do not have to restart the server.



Interpause commented 2 years ago

Okay to clarify a few things:

Could I also have a screenshot of the console output from launching the backend to point of crash?

In the meantime, see if pressing "Restore Defaults" under the Config tab helps.

ahjulstad commented 2 years ago

I have prepared a video reproducing the issue. In the process of doing so I discovered that the error occurred with a wide selection area (768x512 pixels).

So, latest revision works fine with 512x512, crashes with 768x512. The 'old' revision works fine with 768x512 pixels.

(The krita plugin directory is symlinked to the git working directory)

After making the video I experimented a bit, and I have now also seen the latest commit work fine for more than one img2img. Having something selected (and the Krita plugin making a mask layer) may have something to do with it. I note that when the mask layer is created, it becomes the selected layer. If I reselect the layer with the generated image before rerunning img2img, everything seems to work fine, but if I keep the mask selected it crashes. For some reason this is not necessarily on the second run; sometimes it happens on the third or fourth, but it crashes eventually.

The webui is unaffected.

Interpause commented 2 years ago

Can you see if you can reproduce the error on the latest commit? I made all network requests and such threaded.

Well, I have reproduced it on the latest commit, but I suspect it's an unrelated issue (#31). For me, it takes much more than 2 rounds of img2img at 768x512 pixels to crash, but the exact number seems random. I also noted that as the number of rounds increased, the initial lag spike on pressing the button seemed to increase. This points to a memory leak.

When I have time, I will try and pinpoint exactly where the lag spike comes from.
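One way to check the memory-leak hypothesis is to compare Python-level allocations after each round. The following is a minimal sketch using the standard library's `tracemalloc`; `run_round` is a hypothetical stand-in for one img2img round (it is not the plugin's code) and leaks on purpose to show what growth looks like. `tracemalloc` only sees allocations made through Python's allocators, so a leak on the Krita/Qt C++ side would not show up here.

```python
import tracemalloc

_leaked = []  # simulated leak: buffers kept alive across rounds


def run_round(payload_bytes=100_000):
    """Hypothetical stand-in for one img2img round that never frees its buffer."""
    _leaked.append(bytearray(payload_bytes))


def measure_growth(rounds=5):
    """Return the traced Python memory (in bytes) after each round."""
    tracemalloc.start()
    try:
        usage = []
        for _ in range(rounds):
            run_round()
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
        return usage
    finally:
        tracemalloc.stop()


usage = measure_growth()
print(usage)  # values that keep climbing round after round suggest a leak
```

If the numbers stay flat in a real session while the lag still grows, that would point away from Python objects and toward something held on the native side.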

Interpause commented 2 years ago

Under the config tab, press "Restore Defaults" and ensure both "save debug images" and "create transparency mask" are off. Then try a larger canvas size and img2img region than 768x512 and see if it can be reproduced.

New suspicion:

Interpause commented 2 years ago

Confirmed that using a larger selection region (1024x1024) makes the crash occur sooner. However, turning off the mask did not help, so this isn't due to the mask crash. The initial lag spike is confirmed to increase as the number of layers increases, but why?

Turning on the transparency mask addition again when there are many layers triggers the crash, but this is after doubling the wait time and calling a function that blocks till the document is ready.

So the contribution of the transparency mask to the sudden crash is likely to be related to the memory leak rather than the layer not existing yet.

Using 2048x2048 for txt2img caused Krita to crash outright. Backend doesn't crash as image is downscaled first before being upscaled.

It is possible there are multiple causes of the crash. Maybe, besides the memory leak, the image being sent/received is too large?
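To get a feel for how large these payloads could be, here is a rough back-of-the-envelope sketch. It assumes uncompressed 8-bit RGBA pixels (4 bytes each), which is an assumption for illustration only; if the images travel as compressed PNG the real numbers would be smaller. The base64 figure uses the standard 4-output-bytes-per-3-input-bytes expansion.

```python
def payload_sizes(width, height, bytes_per_pixel=4):
    """Return (raw_bytes, base64_bytes) for an uncompressed image buffer."""
    raw = width * height * bytes_per_pixel
    # base64 emits 4 characters for every 3 input bytes, padded up to a full group
    b64 = 4 * ((raw + 2) // 3)
    return raw, b64


for side in (512, 1024, 2048):
    raw, b64 = payload_sizes(side, side)
    print(f"{side}x{side}: raw {raw / 2**20:.1f} MiB, base64 {b64 / 2**20:.1f} MiB")
```

Under these assumptions a 2048x2048 image is 16 MiB raw and roughly a third larger once base64-encoded, so each step up in canvas size quadruples what has to be held in memory at once.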

I don't have time right now but when I do I am either figuring out how to hook a line-by-line debugger into Krita, or putting print statements between every line like a caveman.
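As a lighter-weight alternative to print statements, Python's standard `faulthandler` module can dump the Python traceback of every thread when the process receives a fatal signal such as a segfault. This is only a sketch: whether Krita's embedded interpreter allows it has not been tested in this thread, and the log file name below is made up for illustration.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "plugin_crash_trace.log")

# The file object must stay open for as long as the handler is installed,
# so keep a module-level reference to it.
crash_log = open(log_path, "w")

# On a fatal error, write the Python stack of all threads to the log file.
faulthandler.enable(file=crash_log, all_threads=True)

print(faulthandler.is_enabled())
```

If the crash happens while Python code is running, the log should show which Python line each thread was executing at that moment.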

The main thing to test, though, is whether it can be reproduced on Linux. If it can't, it's a problem with Krita on Windows, meaning I will have to come up with some creative workarounds...

EDIT: I should test the commit right before merging in the rewritten API. It might actually be a memory leak in the way I'm encoding/decoding base64, due to my unfamiliarity with PyQt.

ahjulstad commented 2 years ago

Thank you for looking into this.

Testing with commit 85643dc, it now takes several (~8?) runs of img2img for the crash to occur when testing with img2img of an entire 768x512 image.

I also tested with a larger Krita image but a similarly sized selection, and it now appears to crash immediately after the first generation. (Previous crashes occurred when generation was triggered, before the server completed the task.)

txt2img of 2048x2048 image crashed on first generation once, second generation once.

ahjulstad commented 2 years ago

I found kritacrash.log, installed debug symbols (as per the Krita instructions for the portable version), and got the crash log here:

https://gist.github.com/ahjulstad/1370a670b178c91c10af95923c8f1aba

This was after three or four txt2img with a 2048x2048 canvas

ahjulstad commented 2 years ago

Just for kicks, I built Krita from the latest development version on Ubuntu under WSL, and so far I am unable to reproduce. (The server is running natively on Windows.)



ethanfel commented 2 years ago

yeah same here. Crash with my flatpak Krita but works with the appimage and the previous version of the plugin.

TRGLN commented 2 years ago

Yes, I have the same problem. Krita crashes when the canvas is larger than 512x512; at the same time, the server continues to work and accepts commands after restarting Krita. There was no such problem in previous versions. (RTX 3090, local installation)

Interpause commented 2 years ago

Okay, I have reproduced the crash on Linux and figured out where it may be occurring.

https://github.com/Interpause/auto-sd-krita/blob/730f4044da7b20eba49e5659ac51185c2b65e89c/krita_plugin/krita_diff/script.py#L144-L154

I am sure it has to do with the base64 decoding because:

This also implies it's not a network issue, since Krita/Python is perfectly fine with receiving the JSON response containing the large image.

Notably, sending a large image (in img2img mode, sending the whole 4096x4096 region), doesn't result in a crash (at least on Linux, maybe it does on Windows). But it crashes as soon as it tries to decode the base64 image.
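To narrow down which step fails, the pure-Python decode can be separated from the image construction. The sketch below is not the plugin's code: `decode_png_payload` is a hypothetical helper, and it assumes the response carries a base64-encoded PNG. It decodes strictly and checks the PNG signature, so a bad payload raises an ordinary exception before anything is handed to Qt.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_png_payload(b64_text):
    """Decode base64 text and confirm the result starts with the PNG signature."""
    try:
        raw = base64.b64decode(b64_text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG")
    return raw


# Tiny fake payload: a PNG signature followed by placeholder bytes.
sample = base64.b64encode(PNG_SIGNATURE + b"rest of file").decode("ascii")
raw = decode_png_payload(sample)
print(len(raw))
```

If this step succeeds on a large image and the crash only appears once the bytes are turned into an image or layer, the problem is more likely in the handoff to the GUI side than in base64 itself.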

Interpause commented 2 years ago

> I found kritacrash.log, installed debug symbols (as pr krita instructions with the portable version), and got crash log as here:
>
> https://gist.github.com/ahjulstad/1370a670b178c91c10af95923c8f1aba
>
> This was after three or four txt2img with a 2048x2048 canvas

My experience with reading stacktraces is about the level of someone who has gone for a few high school CTFs. But thanks anyways

Interpause commented 2 years ago

Try https://github.com/Interpause/auto-sd-krita/commit/a31fe44bb27fa75fe5a32135744804b207d0f72d. Based on the stacktrace & experimentation, it is likely one of the crashes is due to threads being killed early for who knows why. I also have no idea why the crash occurs at this exact line instead of at a random line:

https://github.com/Interpause/auto-sd-krita/blob/a31fe44bb27fa75fe5a32135744804b207d0f72d/krita_plugin/krita_diff/utils.py#L115

I also don't know why removing the image insertion somehow prevents the crash from happening at all. Is it some sort of race condition that is exacerbated by having a large canvas or selection region?

If it still crashes, set this to False:

https://github.com/Interpause/auto-sd-krita/blob/a31fe44bb27fa75fe5a32135744804b207d0f72d/krita_plugin/krita_diff/defaults.py#L26

And if it still crashes, it means there is more than one crash occurring here.
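For readers following along, a common way to avoid this class of race is to let worker threads do only the blocking I/O and hand results back through a queue, so that only the main thread touches the document. The sketch below uses the standard library's `threading` and `queue` rather than Qt's threading classes, and `fetch_image` is a hypothetical stand-in for the backend request; it illustrates the pattern, not what the commit above does.

```python
import queue
import threading

results = queue.Queue()  # thread-safe handoff from workers to the main thread


def fetch_image(request_id):
    """Hypothetical stand-in for the blocking backend request."""
    return f"image-for-{request_id}".encode()


def worker(request_id):
    # Runs off the main thread: does I/O only, never touches GUI or document objects.
    results.put((request_id, fetch_image(request_id)))


# Keep references to the threads so they are not discarded while still running.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Back on the main thread: this is where layers would be inserted.
applied = {}
while not results.empty():
    request_id, data = results.get()
    applied[request_id] = data

print(sorted(applied))  # [0, 1, 2]
```

In a real GUI the main thread would poll the queue from a timer instead of calling `join()`, since joining blocks the interface until the request finishes.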

ahjulstad commented 2 years ago

> Try a31fe44.

With this commit I have not been able to reproduce any errors. (Krita 5.1.1 and Windows 11) Everything seems to work fine, no crashes.

I might be observing a sporadic delay between clicking the button and the server starting that may or may not be related to issue #31 , but it always gets going after a few seconds.

Interpause commented 2 years ago

txt2img shouldn't have the delay, only img2img, inpaint & upscale. There is a small delay in capturing part of the canvas & then encoding it as base64, which I did not move into a separate thread so it is expected. It is also short enough that I probably won't fix it.

Also, it means most likely the crash you originally experienced was fixed at some point when I was revamping parts of the logic. However, it was replaced by a separate crash when I tried to implement threading.

I am closing the issue for now but re-open if there are any more crashes likely to be related to this.