Open Ulf3000 opened 9 months ago
Using the larger SAM model should be quite easy because the architecture is the same. I was doing comparisons at the time and if I remember correctly model files could just be exchanged. In the end the full SAM is far too big to deploy and not responsive enough IMO. With the point-and-click approach I really expect to get feedback within 0.5 seconds or so, anything longer feels bad.
DIS is different since it doesn't take points or boxes as input, although it could be combined to run after SAM and refine the result. I looked into a lot of alpha-matte solutions but ultimately there was not a lot of interest and personally the coarse SAM results are good enough for what I wanted to do.
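A minimal sketch of the "run after SAM and refine" idea, for illustration only: take the coarse mask, find its bounding box, add some padding, and hand only that crop to a second model. The helper name `padded_bounds` and the padding value are hypothetical, not part of the plugin, and the mask is a plain list of rows to keep the example self-contained.

```python
def padded_bounds(mask, padding=16):
    """Bounding box (left, top, right, bottom) of the non-zero mask pixels,
    grown by `padding` and clamped to the mask size. Returns None if empty."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # nothing selected, nothing to refine
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    return (
        max(cols[0] - padding, 0),
        max(rows[0] - padding, 0),
        min(cols[-1] + 1 + padding, width),
        min(rows[-1] + 1 + padding, height),
    )


# Hypothetical coarse mask with a single selected pixel in the centre.
coarse = [[0] * 5 for _ in range(5)]
coarse[2][2] = 1
print(padded_bounds(coarse, padding=1))
```

The crop returned here would be what gets passed to the slower refinement model, so the expensive step only sees the region the click already identified.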
I'm happy to support additions though, most of the work was all the glue and boilerplate to get things integrated into Krita, supporting additional models is comparatively easy if they have some kind of ONNX support already.
I tried to replace the models before writing the feature request but didn't have any luck: I got a "number of channels doesn't match" error. I compiled the ONNX models with a single mask, but that didn't work either.
In general I think fidelity matters more than speed in this case. Masking something by hand takes several minutes, so the difference between 0.5 s and several seconds is not a big deal IMO. These things don't add any time to an illustration process in the end. You would just carefully look at and analyze the illustration while the mask is created, something you would do anyway a few minutes later.
Subject selection in Photoshop is also a bit slower (it takes around 5 seconds on my PC with a high-res image and 2 to 3 seconds on lower-res images), but the mask is really good in the end.
The MobileSAM model works for most things, but sometimes it's visible that it just approximates, for example when it creates a rounded mask on sharp corners.
Edit: another cool feature would be to return the alpha mask as a Krita layer or mask instead of converting it to a selection.
I think PS has some feedback though when it analyses the image, and selections are more or less instant? Don't have it around to try atm. I just think click with nothing happening for several seconds is bad. But a slower refinement step with better quality could work, or something like that.
I don't think the bigger SAM model improves mask precision. It has the rounded corner issue too, and at higher resolutions the fixed resolution of the model is a problem. It only helps in some cases with coherency (like when an object is obstructed by something else).
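To make the fixed-resolution point concrete, here is a rough back-of-the-envelope sketch. It assumes the model resizes the longest image side to 1024 pixels and predicts the mask on a 256x256 grid before upsampling, as the original SAM release does; the values for the MobileSAM export used by the plugin may differ, and the function name is made up for this example.

```python
def source_pixels_per_mask_cell(width, height, model_side=1024, mask_side=256):
    """Estimate how many source pixels (per axis) one low-res mask cell covers.

    Assumptions: longest side is scaled down to `model_side` (small images are
    not upscaled in this estimate), and the mask grid is `mask_side` wide.
    """
    longest = max(width, height)
    scale = min(1.0, model_side / longest)      # downscale factor applied to the image
    cells_per_input_px = mask_side / model_side  # mask grid is coarser than the model input
    return 1.0 / (scale * cells_per_input_px)


for size in [(1024, 768), (2048, 2048), (4096, 3072)]:
    print(size, source_pixels_per_mask_cell(*size))
```

Under these assumptions a 4096 px wide canvas ends up with one mask value per 16 source pixels along each axis, which would explain why sharp corners come out rounded regardless of which SAM variant is used.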
To improve the precision some ideas would be:
It requires some experimentation and can be time consuming. Or find & copy some existing solution that works.
another cool feature would also be to return the alpha mask as a krita layer or mask instead of converting to selection
You can convert between selection <-> mask any time. Select > Show Global Selection Mask is a nice feature, it creates a temporary mask layer for the current selection, and you can save it as permanent layer with 1 click.
Interested in seeing how KritAI (see what I did there?) could be used in a workflow to make changes to diagrams, patent drawings (such as adding numbers and references to comments) and other applications where someone needs to update an existing image. Think of how many companies don't have the original Photoshop / GIMP image file and just need to change the date, because 2024 happened. Currently the commercial tools don't seem to be as dynamic or accessible as Krita.
Ahh OK, I didn't know it needs to be downscaled. Yeah, I checked other segmentation models through ControlNet in ComfyUI but they all fail on sharp edges. I still have to try DIS though.
I also found https://github.com/SysCV/sam-hq?tab=readme-ov-file .. it doesn't even seem to be much slower.
Wow, that looks really good!
The further link is also interesting https://github.com/lkeab/gaussian-grouping
I don't have a C++ dev environment, otherwise I would try to fork your addon. (But maybe I'm going to set it up one of these days.) Here's also the ONNX version of DIS: https://github.com/arashpirelly/isnet2onnx?tab=readme-ov-file
Note that it also fixes resolution to 1024x1024, so tiling or similar might still be needed.
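One way the tiling could look, as a sketch rather than anything the plugin actually does: split the canvas into overlapping 1024x1024 boxes, run the model on each, and blend the results in the overlap. The function name `tile_boxes` and the 128 px overlap are assumptions chosen for illustration.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes covering the image with
    overlapping tiles of at most `tile` x `tile` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")

    def starts(length):
        if length <= tile:
            return [0]  # a single tile already covers this axis
        step = tile - overlap
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the image edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


for box in tile_boxes(2048, 1024):
    print(box)
```

Placing the last tile flush with the edge keeps every tile at full model resolution instead of leaving a thin remainder strip, at the cost of a larger overlap on the final tile.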
Very recent alternative to DIS: https://huggingface.co/briaai/RMBG-1.4
Looks great and very much looking forward to it
Any plans for an upgrade?
Photoshop native for 2 decades. Just converted to Krita, and I'm sure it's just the learning curve, but I arrived here seeking tips on better selections for the reasons listed above. I'm 100% willing to run testing, just let me know.
As for the solutions presented, I'd love to have a refine selection button for that second, slower process. Because you're right, sometimes it's not needed. But for complex selections, like the dress on a character, it often selects portions of the surrounding background area as well. This leads to issues when generating with krita-ai-diffusion.
I have yet to try that selection to mask idea you mentioned though, as that might allow the refinement I need.
If you want to do it manually, just use the marquee selection.
Agreed. These images are taken from the official MobileSAM repo. Its accuracy is slightly behind the original SAM, and worse if the image structure is more complex. As Ulf3000 said, I wouldn't mind trading a few seconds of computation time for several minutes less manual work. I think its usefulness depends on the context; both speed and accuracy have their niches to shine.
Just to be sure I followed this correctly: I am able to exchange the models to get better segmentation? Sorry for being a bit behind the curve.
Hey man, you made something amazing. The reason I still mainly use Photoshop is partly because of the object selection tool. This is an important addition to Krita's functionality.
But it would be nice if we could choose better models and/or methods, like the bigger SAM models of course, but maybe also https://github.com/xuebinqin/DIS
I'm asking here because your addon is already really well made, with the additional tool buttons and tool options. Really deliberate. No need for yet another addon.