Open xirtus opened 2 years ago
Think your pasted code got mangled by the Markdown parser, maybe just link to where the code may be found?
Or if it is your own, use e.g. https://gist.github.com/
Thank you, posted here, https://gist.github.com/xirtus/de9b76a79ecdab9ad2328cc865277cfa
Also a few interesting projects for Stable Diffusion animation workflows working right now, most interestingly the first one, Deforum, which is well curated and appropriate:
- https://github.com/HelixNGC7293/DeforumStableDiffusionLocal
- https://github.com/amotile/stable-diffusion-studio
- https://github.com/thomsan/Deforum_Stable_Diffusion/blob/main/Deforum_Stable_Diffusion.ipynb
just realized my gist wasn't public so I made it public
The limit of 14 creations at a time is too few; the limit should be closer to ±10,000 images. Animation is well documented at discodiffusion.com. Below is the basic concept of 3D and 2D video warping with RAFT. With the code below one can create a sequence of animations from the following sequence prompt (from frame 0 to 48):
```python
text_prompts = {
    0: [
        "a painting by of the exorcist girl on a bed by a priest in 1950s interior decor,",
        "A Kelley Jones art, Gil Kane art, Barry Windsor-Smith art, Steve Ditko, Joe Kubert art, Neal Adams, jack kirby art, Unreal Engine 3d",
    ],
    48: [
        "a painting of a teenage witch in a hood on a spiral staircase,",
        "Sam Keith art, Kelley Jones art, Gil Kane art, Barry Windsor-Smith art, Steve Ditko, Joe Kubert art, Neal Adams, jack kirby art,",
    ],
}
```
etc.
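To make the keyframed-prompt idea concrete, here is a hypothetical helper (not part of the notebook itself) showing how such a schedule resolves: the prompt in effect at any frame is the one attached to the largest keyframe at or before it.

```python
# Hypothetical sketch, not the notebook's own code: resolve which prompt
# list from a keyframed schedule is in effect at a given frame.
def prompt_at_frame(text_prompts, frame):
    # The active prompt is the one with the largest keyframe <= frame.
    keys = [k for k in sorted(text_prompts) if k <= frame]
    return text_prompts[keys[-1]] if keys else None

schedule = {0: ["first prompt"], 48: ["second prompt"]}
prompt_at_frame(schedule, 47)   # ["first prompt"]
prompt_at_frame(schedule, 48)   # ["second prompt"]
```

So with the schedule above, frames 0–47 render the first prompt pair and frame 48 onward renders the second.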
```python
#@markdown ####Animation Mode:
animation_mode = '3D' #@param ['None', '2D', '3D', 'Video Input'] {type:'string'}
#@markdown For animation, you probably want to turn `cutn_batches` to 1 to make it quicker.
#@markdown ---
```
```python
#@markdown ####Video Input Settings:
if is_colab:
    video_init_path = "/content/drive/MyDrive/init.mp4" #@param {type: 'string'}
else:
    video_init_path = "init.mp4" #@param {type: 'string'}
extract_nth_frame = 2 #@param {type: 'number'}
persistent_frame_output_in_batch_folder = True #@param {type: 'boolean'}
video_init_seed_continuity = False #@param {type: 'boolean'}
#@markdown #####Video Optical Flow Settings:
video_init_flow_warp = True #@param {type: 'boolean'}
# Call optical flow from video frames and warp prev frame with flow
video_init_flow_blend = 0.999 #@param {type: 'number'} # 0 - take next frame, 1 - take prev warped frame
video_init_check_consistency = False # Insert param here when ready
video_init_blend_mode = "optical flow" #@param ['None', 'linear', 'optical flow']

if animation_mode == "Video Input":
    if persistent_frame_output_in_batch_folder or (not is_colab): # suggested by Chris the Wizard#8082 at discord
        videoFramesFolder = f'{batchFolder}/videoFrames'
    else:
        videoFramesFolder = f'/content/videoFrames'
    createPath(videoFramesFolder)
    print(f"Exporting Video Frames (1 every {extract_nth_frame})...")
    try:
        for f in pathlib.Path(f'{videoFramesFolder}').glob('*.jpg'):
            f.unlink()
    except:
        print('')
    vf = f'select=not(mod(n\,{extract_nth_frame}))'
    if os.path.exists(video_init_path):
        subprocess.run(['ffmpeg', '-i', f'{video_init_path}', '-vf', f'{vf}',
                        '-vsync', 'vfr', '-q:v', '2', '-loglevel', 'error', '-stats',
                        f'{videoFramesFolder}/%04d.jpg'],
                       stdout=subprocess.PIPE).stdout.decode('utf-8')
    else:
        print(f'\nWARNING!\n\nVideo not found: {video_init_path}.\nPlease check your video path.\n')
    # !ffmpeg -i {video_init_path} -vf {vf} -vsync vfr -q:v 2 -loglevel error -stats {videoFramesFolder}/%04d.jpg
```
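As a sanity check on `extract_nth_frame`: the ffmpeg `select=not(mod(n,N))` filter keeps every frame whose index is divisible by N. A small illustration-only sketch of the kept source-frame indices:

```python
# Illustration only: which source frame indices survive
# ffmpeg's select=not(mod(n,N)) filter for a given extract_nth_frame.
def kept_frame_indices(total_frames, extract_nth_frame):
    return [n for n in range(total_frames) if n % extract_nth_frame == 0]

kept_frame_indices(7, 2)  # [0, 2, 4, 6]
```

So `extract_nth_frame = 2` roughly halves the frame count of the init video.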
```python
#@markdown ---
#@markdown ####2D Animation Settings:
#@markdown `zoom` is a multiplier of dimensions, 1 is no zoom.
#@markdown All rotations are provided in degrees.
key_frames = True #@param {type:"boolean"}
max_frames = 10000 #@param {type:"number"}

if animation_mode == "Video Input":
    max_frames = len(glob(f'{videoFramesFolder}/*.jpg'))

interp_spline = 'Linear' # Do not change, currently will not look good. param ['Linear','Quadratic','Cubic'] {type:"string"}
angle = "0:(0)" #@param {type:"string"}
zoom = "0: (1), 10: (1.05)" #@param {type:"string"}
translation_x = "0:(0),70:(0),71:(3.5),106:(0),141:(0),142:(-3.5),179:(0),213:(0),214:(3.5),250:(0),285:(0),286:(-3.5),322:(0),1666:(0),1667:(6),1675:(3),1676:(0),1809:(0),1810:(-6),1818:(-3),1819:(0),1953:(0),1954:(6),1962:(3),1963:(0),3949:(0),3950:(-6),3956:(-3),3957:(0),4091:(0),4092:(6),4099:(3),4100:(0),4234:(0),4235:(-6),4242:(-3),4243:(0)" #@param {type:"string"}
translation_y = "0:(0),36:(0),37:(3.5),71:(0),105:(0),106:(-3.5),142:(0),178:(0),179:(3.5),214:(0),249:(0),250:(-3.5),286:(0),1523:(0),1524:(4),1533:(8),1534:(0),1882:(0),1882:(-4),1888:(-8),1889:(0),3806:(0),3807:(8),3814:(4),3815:(0),4161:(0),4162:(-8),4170:(-4),4171:(0)" #@param {type:"string"}
translation_z = "0:(0),18:(0),37:(2.5),257:(1),322:(3.5),1425:(4),1444:(5),1453:(-6.5),1462:(5),2033:(5),2130:(4),2569:(3.5),2602:(0),2674:(-2),2889:(-2.2),3137:(-2.2),3173:(-2),3602:(-2.5),3709:(-3.5),3741:(-5),3885:(-5),3886:(8),3895:(0),3904:(-5),4313:(-4),4351:(0)" #@param {type:"string"}
rotation_3d_x = "0:(0),321:(0),322:(0.007),892:(0.007),926:(-0.01),962:(0.01),998:(-0.01),1034:(0.01),1069:(-0.01),1105:(0.01),1140:(-0.01),1176:(0.01),1212:(-0.01),1247:(0.01),1282:(-0.01),1319:(0.01),1354:(-0.01),1389:(0.01),1425:(-0.01),1461:(0)" #@param {type:"string"}
rotation_3d_y = "0:(0),1461:(0),1462:(0.01),1532:(0.01),1533:(-0.01),1604:(-0.01),1605:(0.01),1675:(0.01),1676:(-0.01),1747:(-0.01),1748:(0.01),1818:(0.01),1819:(-0.01),1889:(-0.01),1889:(0.01),1960:(0.01),1961:(-0.01),2032:(-0.01),2033:(0)" #@param {type:"string"}
rotation_3d_z = "0:(0),1603:(0),1604:(0.03),1622:(0.03),1623:(0),3743:(0),3744:(-0.01),3814:(-0.01),3815:(0.01),3885:(0.01),3886:(0),3903:(0),3904:(-0.01),3957:(-0.01),3958:(0.01),4028:(0.01),4029:(-0.01),4098:(-0.01),4099:(0.01),4170:(0.01),4171:(-0.01),4241:(-0.01),4242:(0.01),4312:(0.01),4313:(0)" #@param {type:"string"}
midas_depth_model = "dpt_large" #@param {type:"string"}
midas_weight = 0.3 #@param {type:"number"}
near_plane = 200 #@param {type:"number"}
far_plane = 10000 #@param {type:"number"}
fov = 40 #@param {type:"number"}
padding_mode = 'border' #@param {type:"string"}
sampling_mode = 'bicubic' #@param {type:"string"}
```
```python
#======= TURBO MODE
#@markdown ---
#@markdown ####Turbo Mode (3D anim only):
#@markdown (Starts after frame 10,) skips diffusion steps and just uses depth map to warp images for skipped frames.
#@markdown Speeds up rendering by 2x-4x, and may improve image coherence between frames.
#@markdown For different settings tuned for Turbo Mode, refer to the original Disco-Turbo Github: https://github.com/zippy731/disco-diffusion-turbo
turbo_mode = True #@param {type:"boolean"}
turbo_steps = "3" #@param ["2","3","4","5","6"] {type:"string"}
turbo_preroll = 10 # frames

# insist turbo be used only w 3d anim.
if turbo_mode and animation_mode != '3D':
    print('=====')
    print('Turbo mode only available with 3D animations. Disabling Turbo.')
    print('=====')
    turbo_mode = False
```
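A hedged sketch of the turbo schedule described above (assumed behavior inferred from the description, not the notebook's exact code): the first `turbo_preroll` frames are fully diffused; after that only every `turbo_steps`-th frame is diffused, and the frames in between are warped from the depth map.

```python
# Assumed turbo schedule: full diffusion during the preroll, then only
# every turbo_steps-th frame; the rest are depth-warped from neighbors.
def is_diffused_frame(frame, turbo_preroll=10, turbo_steps=3):
    if frame < turbo_preroll:
        return True  # preroll frames are always fully diffused
    return (frame - turbo_preroll) % turbo_steps == 0

[f for f in range(16) if is_diffused_frame(f)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13]
```

With `turbo_steps = 3`, roughly two out of every three post-preroll frames skip diffusion entirely, which is where the 2x-4x speedup comes from.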
```python
#@markdown ---
#@markdown ####Coherency Settings:
#@markdown `frames_scale` tries to guide the new frame to look like the old one. A good default is 1500.
frames_scale = 1500 #@param{type: 'integer'}
#@markdown `frames_skip_steps` will blur the previous frame - higher values will flicker less but struggle to add enough new detail to zoom into.
frames_skip_steps = '60%' #@param ['40%', '50%', '60%', '70%', '80%'] {type: 'string'}
#@markdown ####Video Init Coherency Settings:
#@markdown `frames_scale` tries to guide the new frame to look like the old one. A good default is 1500.
video_init_frames_scale = 15000 #@param{type: 'integer'}
#@markdown `frames_skip_steps` will blur the previous frame - higher values will flicker less but struggle to add enough new detail to zoom into.
video_init_frames_skip_steps = '70%' #@param ['40%', '50%', '60%', '70%', '80%'] {type: 'string'}
```
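To illustrate what the percentage means (assuming it is applied to the total diffusion step count, which is how skip-steps schedules typically work): `frames_skip_steps = '60%'` starts diffusion partway through the schedule, so only the remaining steps add new detail on top of the previous frame.

```python
# Assumption: the skip percentage is applied to the total step count,
# and only the remaining steps run diffusion on each animation frame.
def steps_after_skip(total_steps, frames_skip_steps):
    skipped = int(total_steps * float(frames_skip_steps.rstrip('%')) / 100)
    return total_steps - skipped

steps_after_skip(250, '60%')  # 100
```

Fewer remaining steps means the frame stays closer to the (blurred) previous frame, hence less flicker but less new detail.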
```python
#======= VR MODE
#@markdown ---
#@markdown ####VR Mode (3D anim only):
#@markdown Enables stereo rendering of left/right eye views (supporting Turbo) which use a different (fish-eye) camera projection matrix.
#@markdown Note the images you're prompting will work better if they have some inherent wide-angle aspect.
#@markdown The generated images will need to be combined into left/right videos. These can then be stitched into the VR180 format.
#@markdown Google made the VR180 Creator tool but subsequently stopped supporting it. It's available for download in a few places including https://www.patrickgrunwald.de/vr180-creator-download
#@markdown The tool is not only good for stitching (videos and photos) but also for adding the correct metadata into existing videos, which is needed for services like YouTube to identify the format correctly.
#@markdown Watching YouTube VR videos isn't necessarily the easiest depending on your headset. For instance, Oculus has a dedicated media studio and store which makes the files easier to access on a Quest: https://creator.oculus.com/manage/mediastudio/
#@markdown
#@markdown The command to get ffmpeg to concat your frames for each eye is of the form:
#@markdown `ffmpeg -framerate 15 -i frame_%4d_l.png l.mp4` (repeat for r)
vr_mode = False #@param {type:"boolean"}
#@markdown `vr_eye_angle` is the y-axis rotation of the eyes towards the center
vr_eye_angle = 0.5 #@param{type:"number"}
#@markdown interpupillary distance (between the eyes)
vr_ipd = 5.0 #@param{type:"number"}

# insist VR be used only w 3d anim.
if vr_mode and animation_mode != '3D':
    print('=====')
    print('VR mode only available with 3D animations. Disabling VR.')
    print('=====')
    vr_mode = False
```
```python
def parse_key_frames(string, prompt_parser=None):
    """Given a string representing frame numbers paired with parameter values
    at that frame, return a dictionary with the frame numbers as keys and the
    parameter values as the values."""

def get_inbetweens(key_frames, integer=False):
    """Given a dict with frame numbers as keys and a parameter value as values,
    return a pandas Series containing the value of the parameter at every frame
    from 0 to max_frames. Any values not provided in the input dict are
    calculated by linear interpolation between the values of the previous and
    next provided frames. If there is no previous provided frame, then the
    value is equal to the value of the next provided frame, or if there is no
    next provided frame, then the value is equal to the value of the previous
    provided frame. If no frames are provided, all frame values are NaN."""

def split_prompts(prompts):
    prompt_series = pd.Series([np.nan for a in range(max_frames)])
    for i, prompt in prompts.items():
        prompt_series[i] = prompt
    prompt_series = prompt_series.astype(str)
    return prompt_series

if key_frames:
    try:
        angle_series = get_inbetweens(parse_key_frames(angle))
    except RuntimeError as e:
        print(
            "WARNING: You have selected to use key frames, but you have not "
            "formatted `angle` correctly for key frames.\n"
            "Attempting to interpret `angle` as "
            f'"0: ({angle})"\n'
            "Please read the instructions to find out how to use key frames "
            "correctly.\n"
        )
        angle = f"0: ({angle})"
        angle_series = get_inbetweens(parse_key_frames(angle))
else:
    angle = float(angle)
    zoom = float(zoom)
    translation_x = float(translation_x)
    translation_y = float(translation_y)
    translation_z = float(translation_z)
    rotation_3d_x = float(rotation_3d_x)
    rotation_3d_y = float(rotation_3d_y)
    rotation_3d_z = float(rotation_3d_z)
```
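Here is a minimal, self-contained stand-in for the two helpers whose docstrings appear above (a plain-Python sketch; the notebook itself uses pandas and a more permissive regex parser): parse a schedule string like `"0: (1), 10: (1.05)"` into a dict, then linearly interpolate between keyframes, holding the first/last value at the edges.

```python
import re

# Sketch of parse_key_frames: "0: (1), 10: (1.05)" -> {0: 1.0, 10: 1.05}
def parse_key_frames_sketch(string):
    return {int(m.group(1)): float(m.group(2))
            for m in re.finditer(r'(\d+)\s*:\s*\(([^)]+)\)', string)}

# Sketch of get_inbetweens: one value per frame, linearly interpolated
# between keyframes, clamped to the first/last keyframe at the edges.
def get_inbetweens_sketch(key_frames, max_frames):
    keys = sorted(key_frames)
    values = []
    for frame in range(max_frames):
        if frame <= keys[0]:
            values.append(key_frames[keys[0]])   # before first keyframe
        elif frame >= keys[-1]:
            values.append(key_frames[keys[-1]])  # after last keyframe
        else:
            prev = max(k for k in keys if k <= frame)
            nxt = min(k for k in keys if k > frame)
            t = (frame - prev) / (nxt - prev)
            values.append(key_frames[prev] + t * (key_frames[nxt] - key_frames[prev]))
    return values

vals = get_inbetweens_sketch(parse_key_frames_sketch("0: (1), 10: (1.05)"), 12)
# vals[5] is halfway between 1 and 1.05; vals[11] holds the last value 1.05
```

This is exactly how the long `translation_x`/`rotation_3d_*` strings above turn into smooth per-frame motion.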
Then flow consistency:
```python
#@title Generate optical flow and consistency maps
#@markdown Run once per init video
if animation_mode == "Video Input":
    import gc
```