Extraltodeus / depthmap2mask

Create masks out of depthmaps in img2img

Feature request: Mask depth control #3

Open AugmentedRealityCat opened 1 year ago

AugmentedRealityCat commented 1 year ago
  1. What should the feature do?

With this new feature it should be possible to determine at which depth the image should be masked, and at which depth it should be visible.

  2. How should that feature be controlled?

We will call the portion of the image we want to keep visible a slice. The user should have control over where to cut the scene to obtain the slice they want to keep.

  3. What are the parameters we need to apply that control as a user?

The first parameter would be the near cut distance. This would determine how far from the camera the image slice begins to be visible. Everything closer to the camera than the near cut distance would be cut out. The second parameter would be the far cut distance. This would determine at which distance the visible slice of our image should be cut again; everything further away simply becomes invisible.

  4. How should those parameters be presented to the user?

Since we already have a depthmap to deal with, it is logical to use it as a reference. It uses an 8-bit channel (values of 0 to 255) to determine distance. A distance of 0 should be the closest to the camera. A distance of 255 should be infinity - basically everything further away than the furthest distance measured. This means that for both the near cut and the far cut the user must provide a value between 0 and 255. Logically, the far cut value should always be higher than the near cut.

  5. How should those parameters be used by the system to obtain the desired results?

We will transform the near cut and far cut parameters into control parameters for a color adjustment procedure. First we will invert the depthmap colors; this is just to make it easier to understand, with black=0=close and white=255=far. Then we will make all the pixels that are darker than the near cut distance value completely black. This cuts the near part. Then we will make all the pixels that are brighter than the far cut distance value completely black as well. This cuts the far part. Finally, we will make all the pixels that are brighter than the near cut value but darker than the far cut value completely white. This makes the slice we have just cut out completely opaque.
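A minimal numpy sketch of that procedure, assuming the depthmap arrives as an 8-bit array where brighter = closer (so it gets inverted first, as described above); the function name and array convention are assumptions for illustration:

```python
import numpy as np

def slice_mask(depth: np.ndarray, near_cut: int, far_cut: int) -> np.ndarray:
    """Turn an 8-bit depthmap into a binary slice mask.

    depth: uint8 array, brighter = closer to the camera.
    After inversion (black=0=close, white=255=far), pixels between
    near_cut and far_cut become white (255); everything else black (0).
    """
    inverted = 255 - depth.astype(np.int16)          # black=close, white=far
    in_slice = (inverted >= near_cut) & (inverted <= far_cut)
    return np.where(in_slice, 255, 0).astype(np.uint8)
```

With `near_cut=50` and `far_cut=200`, only the middle band of the scene survives as white; both the foreground and the far background go black.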

  6. How can we make this even better?

- Feathered distance: this would soften the cut. It would act like a blurred mask, but blurred in z-space, all by adjusting the mask's colors. The parameter would be controlled as a 0-255 distance, but would be limited to a certain maximal value that makes sense.
- Keep semi-transparent: this checkbox would allow the user to keep the sliced portion in greyscale instead of forcing it to be opaque.
- Semi-transparency normalization: this checkbox sub-parameter would only be used if Keep semi-transparent is selected. It would normalize the greyscale gradient that remains visible after the slicing procedure by making the darkest grey almost black, the lightest grey almost white, and spreading all the other greyscale values evenly in between.
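The feathering and normalization ideas can be sketched the same way; these helpers are illustrative (names and the linear-ramp feather are assumptions, not the extension's code):

```python
import numpy as np

def feathered_slice(depth, near_cut, far_cut, feather=16):
    """Soft slice mask: instead of a hard edge, ramp from 0 to 255
    over `feather` depth units at both the near and the far cut."""
    d = (255 - depth).astype(np.float32)                 # black=close, white=far
    rise = np.clip((d - near_cut) / feather, 0.0, 1.0)   # fade in past the near cut
    fall = np.clip((far_cut - d) / feather, 0.0, 1.0)    # fade out toward the far cut
    return (np.minimum(rise, fall) * 255).astype(np.uint8)

def normalize_grey(mask):
    """Stretch the surviving grey values so the darkest becomes black,
    the lightest white, and the rest spreads evenly in between."""
    m = mask.astype(np.float32)
    lo, hi = m.min(), m.max()
    if hi <= lo:
        return mask                                      # flat mask: nothing to stretch
    return ((m - lo) / (hi - lo) * 255).astype(np.uint8)
```

A `feather` of 0 would reduce to the hard cut; larger values widen the grey transition band on both sides of the slice.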

Let me know if you'd like examples based on pictures - I'm doing this process manually right now so I am very familiar with it.

Extraltodeus commented 1 year ago

Can't you already do a big part of that with the "Contrasts cut level" slider? Or at least most of it? The "Turn the depthmap into absolute black/white" checkbox also makes it possible to get a cleaner depth cut.

For your near cut idea, I guess you mean that any value brighter than X should be pure white while the rest is rescaled from 0 to 255?

AugmentedRealityCat commented 1 year ago

The Contrast Cut Level + Absolute B&W is similar, I agree, but it's not exactly the same as what I'm looking for as a user. I see there is potential in having z-depth derived gradients in the mask itself - you demonstrated that very well in your examples. And this feature should be kept. And even expanded! But it's different from what most compositing software users will be looking for.

Normally, much like when you draw a custom mask shape in mask mode in the WebUI, the mask you get is completely white in the areas you want to keep/change, completely black for the opposite, and the shades of grey are only used to feather and anti-alias the mask. This feathering on the edges is similar to the mask blur function of the WebUI, by the way.

A precisely controlled black and white mask with feathered edges is what I'd like to do directly in the WebUI, instead of having to go to photoshop to adjust the levels to get the mask I need.

Imagine you have a person in front of a car, with a tree and hills in the background.

What I want to do is create a mask just for the car, which is in the middle of the scene. Right now it is only possible to have either just the person (the front) or the person AND the car (the front + the middle), but not just the car (the middle but NOT the front).

My proposition is to do that with two parameters: the near cut and the far cut. This defines where the object I want in my mask begins and where it ends, according to its distance from the camera. That distance happens to be encoded as color, so we can use simple color adjustment processes to achieve the results we are looking for.

The far cut is in fact potentially the same thing as your Contrast Cut Level. It works on the dark side of the brightness curve. The near cut would work in a similar fashion, but for the bright parts of the image, cutting them to black as well. And everything in between would be 100% white. With, eventually I hope, the possibility to feather the edges.

I believe you are using OpenCV for Python, since I got an error message mentioning it (something about a bad argument for cv2.cvtColor - I'll send a bug report if I can manage to reproduce it). If that is the case, then I'm convinced that library must have the functions we need to adjust the depthmap colors to get what we need.

When the depthmap is generated, is the information stored as 8 bits per channel (like in standard jpg images), or is it 16 or even 32 bit? Is there some specification data about it? I'm asking because this could make a big difference in the quality of the mask if we keep that more precise data format for our color adjustments.

phills11 commented 1 year ago

I have a slightly different solution that also addresses the problem, see https://github.com/Extraltodeus/depthmap2mask/issues/26

I want better control of the mask, but entering a threshold value manually is very clunky. It's difficult to know what number is right. Is 100 too high? Is 150 too high? You have to use trial and error. To make it worse, you have to rerun the model every time, so it's not fast.

I think the better way to control the mask is to take advantage of the user mask. It can both be used to clip the mask area, and to automatically set the threshold value.
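That idea can be sketched in a few lines of numpy (the function name is hypothetical, and `depth`/`user_mask` are assumed to be same-shape uint8 arrays): sample the depth values under the user's rough blob and use their min and max as the near/far thresholds.

```python
import numpy as np

def thresholds_from_user_mask(depth, user_mask):
    """Derive near/far cut values from a rough user-drawn blob:
    take the min and max depth found under the marked pixels."""
    selected = depth[user_mask > 0]        # depth values under the blob
    if selected.size == 0:
        return 0, 255                      # no blob drawn: keep everything
    return int(selected.min()), int(selected.max())
```

This way a messy scribble over the car would clip the mask to the car's depth range automatically, with no manual threshold entry or trial-and-error reruns.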

Ideally there'd be some way to mouse over the mask image and get the values directly, but even if you had that, entering the numbers manually is more work than just drawing a messy blob via the user mask pencil.