WASasquatch / PPF_Noise_ComfyUI

Perlin Power Fractal Noisey Latents for ComfyUI
MIT License

The exponent isn't used by default...? #3

Open tommyettinger opened 1 year ago

tommyettinger commented 1 year ago

Am I reading this right? The amplitude starts at 1.0, and is repeatedly multiplied by persistence. The default persistence is 1.0, which means amplitude stays at 1.0 through all those multiplications. When noise is generated per-channel with

                    noise_value_r = noise(nx + X, ny + Y, nz + Z, p) * amplitude ** exponent

The operator precedence in Python raises amplitude to the power of exponent first, and only then is that result multiplied by the noise() call. Because amplitude stays at 1.0 by default, and 1.0 raised to any non-NaN power is 1.0, the exponent is simply ignored unless persistence has been set to something other than 1.0.
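To make the precedence point concrete, here is a minimal sketch with made-up stand-in values (noise_value and the "possible intended" grouping are hypothetical, not taken from the node's actual code):

```python
# Stand-in values for one channel of one octave; purely illustrative.
amplitude, exponent = 1.0, 4.0
noise_value = 0.37  # pretend this is noise(nx + X, ny + Y, nz + Z, p)

as_written = noise_value * amplitude ** exponent    # parsed as noise_value * (amplitude ** exponent)
one_intent = (noise_value * amplitude) ** exponent  # one possible intended grouping

print(as_written)  # 0.37    -- exponent does nothing while amplitude == 1.0
print(one_intent)  # ~0.0187 -- exponent actually shapes the value
```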

I am guessing this is not the intended behavior. Without exponentiation involved, I would think this isn't actually different from Perlin noise with Fractal Brownian Motion (the most common way I've seen Perlin noise), and I'd suggest changing the default persistence so this does something more-or-less novel by default.

I'd also strongly suggest clarifying the range of values calls to noise() can return, since there's no single standard across noise implementations, and exponentiation would behave very differently if it applies to both negative and positive numbers rather than just positive ones. It looks like your noise() can return values between -1 and 1, which I believe is how Ken Perlin originally wrote it, but that seems a little curious to use for RGBA channels. In the context of this generator, does it make sense to have a negative value for red, green, or blue? Should it clamp values, should the noise be scaled and shifted to the 0-1 range, or should the noise be left alone?

Typically, with 8 octaves of Perlin noise, the results are much more frequent near the center of the range (for noise between -1 and 1, that would be around 0) than at the most extreme values. Clamping such values would produce 0 or a low value most often by far, which could be optimal in this case because high outliers would be rare. Scaling and shifting the noise to the 0 to 1 range would put most results near 0.5.

Perlin and, more recently, Schlick have published good "bias" and "gain" functions that can be used to emphasize or de-emphasize extreme values. Barron published a convenient micro-paper, https://arxiv.org/abs/2010.09714 , for a way to control the emphasis even further using two parameters; it applies to values in the 0 to 1 range and produces results in that range as well. If you use any of these parameterized emphasis controls, you can expose the parameters to users as well! That is something I feel isn't present enough when noise is provided by libraries, and it is very powerful when someone needs really tight control.
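If it helps, here is a minimal sketch of the scale-and-shift step plus one common bias/gain construction (Schlick's rational bias with a Perlin-style gain built on top of it). Parameter conventions vary between references, so treat the exact formulas as an assumption rather than a drop-in for the node:

```python
import numpy as np

def to_unit(x):
    """Scale and shift noise from the [-1, 1] range into [0, 1]."""
    return (x + 1.0) * 0.5

def schlick_bias(t, a):
    """Schlick's rational bias on [0, 1]: maps 0.5 to a, keeps the endpoints fixed."""
    return t / ((1.0 / a - 2.0) * (1.0 - t) + 1.0)

def gain(t, g):
    """Perlin-style gain built from the bias above: g = 0.5 is the identity,
    g > 0.5 emphasizes the extremes, g < 0.5 flattens values toward 0.5."""
    return np.where(t < 0.5,
                    schlick_bias(2.0 * t, 1.0 - g) * 0.5,
                    1.0 - schlick_bias(2.0 - 2.0 * t, 1.0 - g) * 0.5)

# Example: push a [-1, 1] noise field toward its extremes before using it as a channel.
field = np.random.uniform(-1.0, 1.0, size=(64, 64))  # stand-in for a Perlin noise slice
channel = gain(to_unit(field), 0.75)
```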

Thanks for thinking about ways to improve AI art! It's great to see how code has become so influential to this new field. It would be interesting if someone could figure out just how and why using Perlin (power fractal) noise is more effective than whatever Comfy uses without this. I'm guessing that since the distribution of Perlin noise is quite different from, say, white noise, if Comfy used white noise it would have more random outliers relative to any form of multi-octave Perlin (or Simplex) noise.

sober54 commented 1 year ago

When it has an alpha channel, it can generate images like this:

[attached images: 2 1-rMada Merge - SD 2 1 768_v7 0 safetensors_00144, ComfyUI_00261]

WASasquatch commented 1 year ago

Regarding the first bit, isn't that just down to your settings? You could have persistence be 1.0, or not; whether you'd want that influence is up to you. I believe other implementations of Perlin power fractals, like Terragen's, do the same, according to the advice I got.

The range of parameters is defined by INPUT_TYPES, is it not?

If you have improvements, I am more than open to PRs.

To be totally honest, I have zero math experience. I have dyscalculia and can't even retain my times tables (I can't manipulate numbers; it's all visual in my head). This is all entirely out of my scope, an endeavor done entirely at a programming level with little understanding of the math involved. 🥺

tommyettinger commented 1 year ago

Hey, no problem, my best friend in middle school also had/has dyscalculia. Indeed, settings do affect this a lot, and I'm wondering if it would be better to default to higher or lower than 1.0 to show the "power" part of "Perlin power fractal" out-of-the-box. There's sorta a standard system for how fractal noise is typically implemented, but it is not really completely standardized, especially regarding the terms used. The term "lacunarity" hardly ever shows up outside the context of noise; as far as I understand it, "lacuna" is a Latin root for "crescent shape" and it refers to the shape of a graph of the different octaves' frequencies. That is, if you view it like a crescent moon that's getting brighter (waxing), the brightest it can get would be graphed as a straight diagonal line (where each octave has equal frequency) and the most crescent-shaped would be like a sliver of light at the edge of the moon (where the earlier octaves have nearly no change in frequency and the last one suddenly curves up sharply to a higher frequency).

The difference here seems to be that you have lacunarity but not "gain" in the way I'm used to seeing it, and I'm not entirely sure how that affects the results. Gain affects the amount each octave contributes to the result, and also changes the scaling at the end. To make matters more confusing, I've seen the term "persistence" used in various places but not in conjunction with "gain," so they might be synonyms. I think part of my puzzlement has to do with how I'm used to seeing gain (I use 0.5 gain by default) and lacunarity (I use 2.0 by default); I suspect if I used 1.0 for either it would change my results quite a bit. I generally use gain == 1.0 / lacunarity, but there are good reasons to change it too.
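For reference, here is a bare-bones sketch of the fractal accumulation I'm describing, just to show where lacunarity and gain/persistence enter. noise2d is a placeholder for any 2D noise function returning values in [-1, 1], not this repo's implementation:

```python
def fbm(noise2d, x, y, octaves=8, lacunarity=2.0, gain=0.5):
    """Generic fractal Brownian motion over a base noise function.

    Lacunarity multiplies the frequency each octave; gain (often called
    persistence) multiplies the amplitude. Dividing by the summed amplitudes
    keeps the result in the base noise's [-1, 1] range.
    """
    frequency, amplitude = 1.0, 1.0
    total, total_amplitude = 0.0, 0.0
    for _ in range(octaves):
        total += noise2d(x * frequency, y * frequency) * amplitude
        total_amplitude += amplitude
        frequency *= lacunarity
        amplitude *= gain
    return total / total_amplitude
```

With gain = 1.0, every octave contributes equally, so after normalization the high-frequency octaves swamp the large-scale structure, which is the "dissolving" effect in the second map below.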

It would probably help if I made some visual examples, since this is all... pretty abstract. I'll edit this once I have some more world maps made with noise.

EDIT: so here are some world maps made with not-too-different parameters that look quite different.

This is my default that I use, with lacunarity=2.0 and gain=0.5 (which I think may be related to persistence here):

[image: world map, lacunarity=2.0, gain=0.5]

The noise dissolves almost completely if I raise gain to 1.0, keeping lacunarity at 2.0:

[image: world map, lacunarity=2.0, gain=1.0]

And if both lacunarity and gain are 1.0, all fine detail just disappears (look at the polar ice caps to see how they're very simple and basic):

[image: world map, lacunarity=1.0, gain=1.0]

I really have no idea whatsoever which of these is optimal for noise used in AI content generation! I think you mentioned high-frequency Perlin noise at one point; there is a definite difference between that and ordinary white noise, and that could explain why this is an improvement. The difference is simply that multi-octave Perlin noise doesn't produce really high or really low values very often, and in an AI context this could mean that "weird looking sections" aren't as common, if those sections are caused by high or low noise values.

However, if you wanted very chaotic, "creatively put together" AI art, you could also use any kind of noise and emphasize the highs and lows, such as by taking the cube root of each noise value or applying one of many types of "bias function" or "gain function" (even to white noise). I don't have any idea what that would produce, other than working in the opposite direction from what this noise project does.

If this turns out to work just by changing the distribution from uniform white noise to centrally-favoring Perlin noise, then there's a chance it could be sped up a lot by using some redistribution technique on white noise, since you can match the distribution of Perlin noise pretty closely with the right distribution; and if the noise is very high frequency, the property of Perlin noise where nearby results are similar won't matter if it isn't present.
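As a concrete (and purely hypothetical) illustration of that redistribution idea: averaging a few independent white-noise draws already gives a centrally-concentrated, roughly bell-shaped distribution, with no spatial correlation at all. This is only a sketch of the shortcut I'm speculating about, not anything this node does:

```python
import torch

def centered_white_noise(shape, draws=4, generator=None):
    """White noise reshaped toward a centrally-favoring distribution.

    Averaging several independent uniform [-1, 1] samples (a Bates-style
    distribution) clusters values near 0, loosely imitating the value
    distribution of multi-octave Perlin noise without its spatial smoothness.
    """
    samples = torch.rand((draws, *shape), generator=generator) * 2.0 - 1.0
    return samples.mean(dim=0)

# Example: a latent-shaped tensor of centrally-clustered noise.
latent_like = centered_white_noise((1, 4, 64, 64), draws=6)
```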

sober54 commented 1 year ago

I tend to believe that this kind of noise can bring about positive changes, and it seems that by controlling the noise we can also control those changes. When I first started using noise as a latent input, I experimented with various sources and nodes of noise, and indeed, adjusting the parameters resulted in some changes in color and detail. I still have these pictures saved, and they should be reproducible in ComfyUI using this node:

[attached example images generated with 动漫-CoMix_v2 0 超美2 5D_v2 0.safetensors and 动漫-2 5-AWPainting_v1 1.safetensors]

Afterward, I started experimenting with blending two types of noise as inputs. Sometimes, by adjusting the parameters, they can help fix bad images

[attached example images generated with 动漫-2 5D动漫V1 0_V1 0.safetensors]

Afterward, I shared this idea with the gentleman, and upon seeing your response, I started to doubt my findings. Just now, I also reached out to a developer in my country to see if this kind of control is feasible. He is the creator of LyCORIS, but he hasn't replied to me yet.

tommyettinger commented 1 year ago

I don't doubt what you've found; I'm just not sure whether there might be a significantly faster shortcut to do the same thing! Normally very-high-frequency noise loses the main property of continuous noise, that nearby "pixels" (or whatever cells the noise is queried with) will have nearby or similar values more often than not. But it retains the potentially-very-useful property that it is distributed differently from uniform white noise, and it seems to me like the distribution could explain why high-frequency Perlin noise works well. I don't have the background in signals, statistics, or information theory that might help in exactly replicating the distribution of this specific type of Perlin noise... but thankfully this sort of thing can just as often be eyeballed and still get close! It also seems to me that even roughly similar distributions might do almost as well, or even better. Or maybe it would work to use only one octave of Perlin noise for speed, but change its distribution to more closely imitate the 8-or-9-octave version's distribution.

The simplest implementation of a function that redistributes toward the center (that I can think of) is logit, which is pretty much log(p / (1 - p)) where p is a random value to redistribute, starting between 0 and 1 exclusive and ending with any number. Logit can technically produce any number, but it's much more likely to produce results centered on and close to 0. Logit isn't amazingly fast or anything, but it's a one-liner so it's useful when just seeing if something works. Multiplying the result of the log() call by a larger number produces a wider range, and dividing it by a larger number produces a range more closely centered on 0.
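A minimal sketch of that one-liner, assuming NumPy and a clamp to keep p away from the ends of the range (the names here are mine, not from any existing node):

```python
import numpy as np

def logit_redistribute(p, scale=1.0, eps=1e-6):
    """Map uniform values in (0, 1) through logit so results cluster near 0.

    p is clipped away from 0 and 1 to avoid infinities; a larger scale widens
    the spread of the output, a smaller one keeps it tight around 0.
    """
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p / (1.0 - p)) * scale

# Example: turn uniform white noise into centrally-clustered values.
u = np.random.random((64, 64))
centered = logit_redistribute(u, scale=0.25)
```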

sober54 commented 1 year ago

Sir, I arrived at my conclusions based on your noise theory combined with what I observed. This made me reevaluate the images I generated and start to "doubt my findings", since they might only be changes in color. So I began to feel uncertain about my discoveries, because they were generated by blending with the default noise. I previously thought that the two types of noise had differences in "density" (which is why I always adjusted some parameters in an extreme manner). However, based on your mathematical theory, it seems that this concept doesn't exist. That's why I tried to explain my findings to you and Mr. Was, as he has invested a lot of effort and time in this matter. If it doesn't bring about significant changes, I will feel very guilty.

WASasquatch commented 1 year ago

[Quoting tommyettinger's previous comment on lacunarity, gain, and the world-map examples in full.]

Where is "gain" in any Perlin Power Fractals Noise? I use it in Terragen, Blender, and 3ds max, and have never ever seen a gain. It's just a PF. You can make it soft, or hard. That is the nature of a PF. It doesn't need to be rough to be "power", as it still contains several octaves and is not Perlin, single octave, scales etc.

Additionally, no power fractal should really blow out to nothing like your examples. If it gives you a blank result, or dissolves into nothing, it's honestly broken, like mine can do with the wrong set of settings too. Regardless of settings there should be noise; otherwise the ranges are wrong, or the settings correspond to a blown-out range in the accumulation.

High frequency is in regard to raster, not wave-graph noise. High-frequency detail refers to the noisiness of details. For example, for grainy details in an image you could use a high pass to accentuate high-frequency details by shadowing their boundaries and brightening their mass. With high octaves and low scales, you get what is called high-frequency detail, or essentially noise. Even raster white noise, and all power noises, are used for high-frequency modulation, because it's super-high-frequency (per pixel) "contrasting differences". Low-frequency noise would be larger blobs taking more pixels to define a texture, with less noise.


I am honestly not sure what you're entirely on about. The Perlin power fractal is a basic algorithm that can then be implemented in various ways, with various parameters exposed, or not. What they influence is still a PPF. It doesn't suddenly stop being one because of the settings used.

And you went from talking about the exponent, something that, again, can be flavored any way, to lacunarity, with no real cohesion in your point. Examples of random noise really don't mean much, and Perlin as-is is implemented differently in different systems. Some in fact don't even have a PPF and let you manually layer and accumulate into a PPF, and it's still a PPF, as all that is is octaves used for scaling the noise, whether rough or smooth. What makes it a "power fractal" is simply octaves and scale, really. Same for other similar power-type algorithms that are raised to a power.

If you have tangible fixes to the formula, I'm interested to hear them, but not semantics comparing different programs, where other DCCs already implement it differently from program to program.


[Quoting sober54's previous comment in full.]

One thing you do is mix in base noise; you need to amplify this noise. Inject it with Blenderneko's injection node, or the blend node that comes with this pack. Try a strength of 25. You will see total changes in your generations, whether mixing with base Comfy noise or running with add_noise disabled.

WASasquatch commented 1 year ago

[Quoting sober54's previous comment in full.]

Here is an example of properly utilizing the noise for influence. Both images use PPF; the bottom one injects more of it.

[attached images: ComfyUI_03758, ComfyUI_03774]

[attached images: ComfyUI_03776 (base brightness), ComfyUI_03784 (darkening the noise)]

Same seeds for sampling, different latent input, with better strength. This is why Blenderneko has the get-sigma node to get the strength from the model, but I'm not sure that strength works for this noise, so I've been doing it manually and finding a good strength.

This very much does work and gives completely different gens and looks than base ComfyUI noise, and it is noticeably sharper and more detailed. It also listens to your prompt better, where base ComfyUI noise with SDXL has a real problem with listening to prompts.

WASasquatch commented 1 year ago

Here is an example of injecting the noise by increasing the strength in Blend Latents (PPF Noise) by 0.1 each prompt run. It's reversed, so the bottom image is 0.0 strength. You can definitely get major influence over gens with add_noise disabled on samplers that inject noise after the first step, or with add_noise enabled on any sampler.

[attached image: strength sweep grid]
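For anyone following along, the sweep above is conceptually just a linear blend between the base latent and the PPF noise at increasing strengths. The sketch below is a hypothetical stand-in; the actual Blend Latents (PPF Noise) node may normalize or combine differently:

```python
import torch

def blend_latents(base, ppf_noise, strength=0.1):
    """Hypothetical linear blend: strength = 0.0 returns the base latent,
    strength = 1.0 returns pure PPF noise."""
    return base * (1.0 - strength) + ppf_noise * strength

# Example sweep: step strength by 0.1 per run, as described above.
base = torch.randn(1, 4, 64, 64)
ppf_noise = torch.randn(1, 4, 64, 64)  # stand-in for the PPF noise latent
for step in range(11):
    blended = blend_latents(base, ppf_noise, strength=step * 0.1)
```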

sober54 commented 1 year ago

[Quoting the previous exchange between sober54 and WASasquatch in full.]

Regardless of whether it differs from the default noise, I will use this type of noise due to its rich and controllable parameters. After trying everything, it is the only solution that allows me to generate high-quality images with just 20 iterations. So, if I haven't caused you to waste time on meaningless things, please continue developing it. I will be the first to download and use it extensively.

sober54 commented 1 year ago

And once again, thank you, cyber Buddha