Closed cs-mshah closed 6 months ago
Can you give more information on how to pass the edges? For example, is it an adjacency list, or simply a list of connections?
Can you share it for this example:
export P="a DSLR photo of a Tiger writing a Letter on a Table"
export P1="a DSLR photo of a Tiger"
export P2="a DSLR photo of a Letter Scroll"
export P3="a DSLR photo of a Table"
export N23="a Letter Scroll on a Table"
export N13="a Tiger writing on a Table"
export N12="a Tiger writing a Letter Scroll"
export CD=0.
# Use different tags to avoid overwriting:
export TG="tiger_table"
python launch.py --config configs/gd-if.yaml --train --gpu 0 exp_root_dir="examples" use_timestamp=false tag=$TG system.loss.lambda_entropy=0. system.geometry.num_objects=3 system.prompt_processor.prompt="$P" system.prompt_processor.negative_prompt="ugly, bad anatomy, blurry, pixelated obscure, unnatural colors, poor lighting, dull, and unclear, cropped, lowres, low quality, artifacts, duplicate, morbid, mutilated, poorly drawn face, deformed, dehydrated, bad proportions" system.prompt_obj=[["$P1"],["$P2"],["$P3"]] system.prompt_obj_neg=[["$N23"],["$N13"],["$N12"]] system.geometry.sdf_center_dispersion=$CD system.guidance.guidance_scale=[50.,20.] system.guidance.guidance_scale_milestones=[2000,] system.optimizer.params.geometry.lr=0.01
export RP=$P", 4K high-resolution high-quality"
export RP1=$P1", 4K high-resolution high-quality"
export RP2=$P2", 4K high-resolution high-quality"
export RP3=$P3", 4K high-resolution high-quality"
python launch.py --config configs/gd-sd-refine.yaml --train --gpu 0 exp_root_dir="examples" use_timestamp=false tag=$TG system.loss.lambda_entropy=0. system.geometry.num_objects=3 system.prompt_processor.prompt="$RP" system.prompt_obj=[["$RP1"],["$RP2"],["$RP3"]] system.prompt_obj_neg=[["$N23"],["$N13"],["$N12"]] system.geometry.sdf_center_dispersion=$CD data.fovy_range=[70,90] data.eval_fovy_deg=90 resume=examples/gd-if/$TG/ckpts/last.ckpt
# Adjust data.fovy_range to avoid OOM.
On running, we get an error asking for edges, but there's also a LOG suggesting that edges don't need to be provided in a scene with two/three objects. This scene has 3 objects. Maybe change that LOG statement as well?
Thanks for the question. We will update this explanation in the usage instructions.
The system.edge_list is an ordered list corresponding to the edge-wise prompt list system.prompt_global. It simply describes which two objects, say $o_i$ and $o_j$ (i.e., edge [(i-1),(j-1)]), should be rendered out together (i.e., edge rendering) when optimizing object $o_i$ as a pairwise-relationship constraint to $o_j$. Therefore, the length of system.edge_list and system.prompt_global should equal the number of objects.
For example, in the provided scene "tiger_table" (and all other three-object scenes as well), system.edge_list is by default set as a cyclic list [[0,1],[1,2],[0,2]]. If there is no strong relationship between $o_i$ and $o_j$, just use "and" in the corresponding Pij.
The error you get in the 3 object scene is because system.edge_list is set by default but system.prompt_global is not provided. Just add system.prompt_global=[["$N12"],["$N23"],["$N13"]] (for the coarse stage) and it should be fine.
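As a rough sketch of the ordering described above, the snippet below builds the edge list and the matching edge-wise prompt list for the tiger scene. The helper `edge` and the variable names `EDGE_LIST` and `PROMPT_GLOBAL` are made up for illustration. Passing system.edge_list explicitly on the command line is an assumption; the reply above only says it is set by default for three-object scenes.

```shell
# Edge-wise prompts, named after the (1-based) object pair they describe.
N12="a Tiger writing a Letter Scroll"
N23="a Letter Scroll on a Table"
N13="a Tiger writing on a Table"

# Hypothetical helper: objects o_i and o_j map to edge [(i-1),(j-1)].
edge() { echo "[$(( $1 - 1 )),$(( $2 - 1 ))]"; }

# Default order for three-object scenes, as stated in the reply above.
EDGE_LIST="[$(edge 1 2),$(edge 2 3),$(edge 1 3)]"

# One prompt per edge, in the same order as EDGE_LIST.
PROMPT_GLOBAL=[["$N12"],["$N23"],["$N13"]]

echo "system.edge_list=$EDGE_LIST"
echo "system.prompt_global=$PROMPT_GLOBAL"
```

The k-th prompt in system.prompt_global describes the k-th pair in system.edge_list, so reordering one list means reordering the other. When appending the printed values to the launch command, keep each override in double quotes so the shell passes it as a single argument.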
Here is the tiger example which I gave and the outputs:
export P="a DSLR photo of a Tiger writing a Letter on a Table"
export P1="a DSLR photo of a Tiger"
export P2="a DSLR photo of a Letter Scroll"
export P3="a DSLR photo of a Table"
export N23="a Letter Scroll on a Table"
export N13="a Tiger writing on a Table"
export N12="a Tiger writing a Letter Scroll"
export CD=0.
# Use different tags to avoid overwriting:
export TG="tiger_table"
# python launch.py --config configs/gd-if.yaml --train --gpu 0 exp_root_dir="examples" use_timestamp=false tag=$TG system.loss.lambda_entropy=0. system.geometry.num_objects=3 system.prompt_processor.prompt="$P" system.prompt_processor.negative_prompt="ugly, bad anatomy, blurry, pixelated obscure, unnatural colors, poor lighting, dull, and unclear, cropped, lowres, low quality, artifacts, duplicate, morbid, mutilated, poorly drawn face, deformed, dehydrated, bad proportions" system.prompt_obj=[["$P1"],["$P2"],["$P3"]] system.prompt_obj_neg=[["$N23"],["$N13"],["$N12"]] system.prompt_global=[["$N12"],["$N23"],["$N13"]] system.geometry.sdf_center_dispersion=$CD system.guidance.guidance_scale=[50.,20.] system.guidance.guidance_scale_milestones=[2000,] system.optimizer.params.geometry.lr=0.01
export RP=$P", 4K high-resolution high-quality"
export RP1=$P1", 4K high-resolution high-quality"
export RP2=$P2", 4K high-resolution high-quality"
export RP3=$P3", 4K high-resolution high-quality"
export RP12=$N12", 4K high-resolution high-quality"
export RP23=$N23", 4K high-resolution high-quality"
export RP13=$N13", 4K high-resolution high-quality"
python launch.py --config configs/gd-sd-refine.yaml --train --gpu 0 exp_root_dir="examples" use_timestamp=false tag=$TG system.loss.lambda_entropy=0. system.geometry.num_objects=3 system.prompt_processor.prompt="$RP" system.prompt_obj=[["$RP1"],["$RP2"],["$RP3"]] system.prompt_obj_neg=[["$N23"],["$N13"],["$N12"]] system.prompt_global=[["$RP12"],["$RP23"],["$RP13"]] system.geometry.sdf_center_dispersion=$CD data.fovy_range=[70,90] data.eval_fovy_deg=90 resume=examples/gd-if/$TG/ckpts/last.ckpt
# Adjust data.fovy_range to avoid OOM.
For the coarse stage:
https://github.com/GGGHSL/GraphDreamer/assets/56499208/bbe9264e-a239-4862-b57c-b23b38b92ece
For the fine stage:
https://github.com/GGGHSL/GraphDreamer/assets/56499208/aab06f93-0ca9-42af-b5f0-12e551a45088
It seems that the table didn't get generated, and the letter is gone as well. Is there some issue in the script, such as the negative prompts or the edges? Can you give examples or documentation on how to use the negative prompts and how to pass edges? Also, since creating these scripts is tedious, is there a GPT-4 prompt that can generate them? I believe that is one of the contributions. Is it only for decomposing the text into a graph, or for making these scripts as well?
It would be great if you could provide examples for the following:
It would also be great to have more information on the centres: how to select them and how to give them as input.
Thanks.
Thanks for your feedback. For generating scenes with >= 3 objects, it is better not to set system.geometry.sdf_center_dispersion to 0. (in your script, export CD=0.). At the beginning, objects are initialized as randomly centered SDF spheres, and the dispersion of the centers is scaled by (multiplied with) the hyperparameter system.geometry.sdf_center_dispersion. Therefore, setting CD=0. means the objects completely overlap, which is not a good starting point for optimization (especially as the number of objects increases). By default, system.geometry.sdf_center_dispersion=0.2. Here are the results we generated (the coarse stage) for the example:
export P="a Tiger writing a Letter Scroll on a Table"  # Added a word 'Scroll'
export P1="a Tiger"
export P2="a Letter Scroll"
export P3="a Table"
export P12="a Tiger writing a Letter Scroll"
export P13="a Tiger writing on a Table"
export P23="a Letter Scroll on a Table"
Coarse stage:
https://github.com/GGGHSL/GraphDreamer/assets/39009560/4e5caa78-6b0c-487e-8e95-d1f35d69cb17
https://github.com/GGGHSL/GraphDreamer/assets/39009560/b94cb575-f702-4194-8c9e-8b7be655b625
https://github.com/GGGHSL/GraphDreamer/assets/39009560/5b1811ac-a500-440a-b1e0-0fe355f074cd
The second object "a Letter Scroll" failed to appear as well, potentially because it is too thin and thus hard to distinguish from the "Table" based on the predicted SDFs. Setting system.loss.lambda_entropy > 0. (the default) may help in such cases, as it discourages empty objects. Also, not sure if you have noticed this commit, which fixed a bug in view-dependent prompting for objects.
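To make the two suggestions above concrete, here is a minimal sketch of the overrides that would change relative to the script in the question. The tag `tiger_table_cd02` and the variable `OVERRIDES` are hypothetical names chosen for illustration. The value 0.2 is the default quoted above, not a tuned setting.

```shell
# Default dispersion quoted above; 0. makes all initial spheres overlap.
CD=0.2

# Hypothetical tag, chosen to avoid overwriting earlier runs.
TG="tiger_table_cd02"

# Overrides that differ from the script in the question. Note that
# system.loss.lambda_entropy=0. is intentionally left out, so that the
# config default (> 0.) applies.
OVERRIDES="tag=$TG system.geometry.sdf_center_dispersion=$CD"

echo "$OVERRIDES"
```

These overrides would take the place of the tag=$TG, system.geometry.sdf_center_dispersion=$CD and system.loss.lambda_entropy=0. arguments in the coarse-stage command; the remaining arguments stay as they are.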
I am trying to generate a scene based on the following prompt: "a Person sitting on a Chair, holding a Magic Wand in his right hand, positioned in front of a Fireplace, cartoon, blender". I modified the wizard_study.sh script. Here is the script:
However, there are some edge list issues. Have I missed something? I believe I've added the relevant edges and negative prompts where needed.