About GPU info - Githubissues

johndpope / MegaPortrait-hack

Using Claude Opus to reverse engineer code from MegaPortraits: One-shot Megapixel Neural Head Avatars

https://arxiv.org/abs/2207.07621

42 stars 7 forks

About GPU info #29

Closed samsara-ku closed 3 weeks ago

samsara-ku commented 3 weeks ago

Hi, I'm motivated by your nice work and am trying to start my own project based on it.

I recently finished building the G_base code and am trying to train it on VoxCeleb2.

However, I cannot train my network even with small batches (e.g. 4).

So I'd like to ask two things: 1) did you train your own network, and 2) could you give me some brief info about your GPU?

I'm using 4 V100 32GB GPUs, but I hit an OOM problem even with batch size 4.

johndpope commented 3 weeks ago

Use 512x512 - it will work. There's an avg pool you can adjust/remove (can't remember, check my commits) to get the network to cycle. The reason I put everything into one single class (alongside the txt document in references and even diagrams) is that I can just throw it at ChatGPT / Claude Opus and ask it questions.

johndpope commented 3 weeks ago

You can use my "make it fast" PR - it removes the warp / code operation. That was taking 70 seconds.

johndpope commented 3 weeks ago

I have a PR here to overhaul training - it needs some sanity testing: https://github.com/johndpope/MegaPortrait-hack/pull/33

I also have a problem with preprocessing data: https://github.com/johndpope/MegaPortrait-hack/issues/34

johndpope commented 3 weeks ago

Did you kill any lingering Python training processes?

Add to .zshrc:

function kp() { ~/killer.sh python; }

killer.sh:


#!/usr/bin/env bash

# Author: Oleh Pshenychnyi
# Date: 13.02.2021
#
# Kill all processes matching a provided pattern.
#
# Usage:
#
# >> bash killer.sh celery
#
# or better to alias this script in your .bashrc/.zshrc
# so you can use it like:
#
# >> killer npm
# >> killer celery
# >> killer fuckingJava

victim_name=${1}

if [ "$victim_name" == "" ]
then
    echo "Nope! Gimme a victim name."
    exit 1
fi

output="$(ps ax | grep -- "${victim_name}" | awk '{print $1,$3}')"
# at this point output looks like this:
# 254214 S
# 254215 S
# 254216 S
# 259206 S+
# 259207 S+

# we change internal field separator to use newline as a separator
_IFS=$IFS
IFS=$'\n'
pid_state_array=($output)
IFS=$_IFS

# pids to be killed
victim_pids=()

for pid_state in "${pid_state_array[@]}"; do
    pid_state=($pid_state)
    # we ignore the current process and its child
    if [ "${pid_state[0]}" != "$$" ] && [ "${pid_state[1]}" != "S+" ]
    then
        victim_pids+=("${pid_state[0]}")
    fi
done

if [ "${#victim_pids[@]}" == 0 ]
then
    echo "Nothing found for '${victim_name}'."
    exit
fi

echo "Got them: ${victim_pids[*]}"
kill -9 "${victim_pids[@]}" >/dev/null 2>&1
echo ".. and smashed!"
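
If pgrep (from procps) is installed, the matching step can be handed to it instead of parsing `ps ax | grep` output. This is only a sketch of that alternative, not part of the original killer.sh: the function name `kill_matching` is made up for illustration. It matches against the full command line and skips the shell that calls it.

```shell
#!/usr/bin/env bash
# Sketch only: kill_matching is a hypothetical helper, not part of killer.sh.
# Assumes pgrep (procps) is installed.

kill_matching() {
    local pattern="$1"
    local pid
    local pids=()

    if [ -z "$pattern" ]; then
        echo "usage: kill_matching <pattern>" >&2
        return 2
    fi

    # pgrep -f matches the pattern against each full command line and
    # prints one PID per line; skip the shell running this function.
    while read -r pid; do
        if [ "$pid" != "$$" ]; then
            pids+=("$pid")
        fi
    done < <(pgrep -f -- "$pattern")

    if [ "${#pids[@]}" -eq 0 ]; then
        echo "Nothing found for '${pattern}'."
        return 1
    fi

    echo "Killing: ${pids[*]}"
    kill -9 "${pids[@]}"
}
```

Sourced from .zshrc or .bashrc, `kill_matching python` would do roughly what `kp` does above. Unlike killer.sh, it does not skip foreground (`S+`) processes, so it will also kill a matching job that is running in another terminal.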

johndpope commented 3 weeks ago

I merged the PR for consistency loss - looks like it's working. There's some preprocessing of videos - it takes maybe > 10 mins for the 4 from the junk folder. (I know you deleted these - there's a 40 GB torrent in the README with 30,000 512x512 videos.) I'm saving out some numpy arrays for faster subsequent loading.

[Image: Screenshot from 2024-06-04 22-30-25]

[Image: pred_frame_281]