I spent yesterday going through infer.py and va.py.
I'm confused why nothing actually calls the model's forward (def forward -> G_forward) on the va.py model.
Did someone else write this inference code? It seems overcomplicated...
These are the interactions with the model from infer.py.
It seems like G_forward_old was an attempt to consolidate this logic.
The other thing I'm not certain about is the MegaPortraits implementation. The paper says:
"These losses are calculated using only foreground regions in both predictions and the ground truth."
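To make sure I'm reading that sentence right, here is a minimal sketch of what a foreground-only loss could look like. This is not the repo's or the paper's code: `masked_l1` is a hypothetical helper, plain L1 stands in for whatever losses the paper actually uses, and the mask is assumed to be a binary (1, H, W) foreground matte such as a matting network would produce.

```python
import numpy as np


def masked_l1(pred, target, mask, eps=1e-8):
    """Mean absolute error over foreground pixels only.

    pred, target: float arrays of shape (C, H, W).
    mask: array of shape (1, H, W), 1 = foreground, 0 = background
          (assumed to come from a matting/segmentation model).
    """
    # Broadcast the single-channel mask across all colour channels.
    mask = np.broadcast_to(mask, pred.shape).astype(pred.dtype)
    # Zero out the background error before summing.
    diff = np.abs(pred - target) * mask
    # Normalise by the number of foreground elements, not the full image size.
    return float(diff.sum() / (mask.sum() + eps))


pred = np.zeros((3, 2, 2))
target = np.full((3, 2, 2), 2.0)
mask = np.zeros((1, 2, 2))
mask[0, 0, 0] = 1.0  # a single foreground pixel

loss = masked_l1(pred, target, mask)
```

If this reading is correct, the mask is only needed while computing the training losses, so it would not have to run on every frame at inference time.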
I'm trying to reach a high FPS in order to recreate the VASA paper.
infer.py seems to hit around 14 fps.
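For reference, this is roughly how I'd measure it in isolation. It's a rough sketch, not what infer.py does: `measure_fps` and `step` are hypothetical names, where `step` would wrap one frame of inference. On a GPU you'd want to pass something like `torch.cuda.synchronize` as `sync`, since CUDA kernels run asynchronously and the timer would otherwise stop too early.

```python
import time


def measure_fps(step, n_frames=100, warmup=10, sync=None):
    """Return frames per second for a callable that processes one frame.

    step: zero-argument callable, e.g. a wrapper around one inference call.
    sync: optional zero-argument callable that blocks until pending work
          is finished (assumption: torch.cuda.synchronize on a CUDA device).
    """
    # Warm-up iterations are excluded from timing (lazy init, caches, etc.).
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(n_frames):
        step()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start

    return n_frames / elapsed


calls = []


def fake_step():
    # Stand-in for one frame of inference: sleeps for about 1 ms.
    calls.append(1)
    time.sleep(0.001)


fps = measure_fps(fake_step, n_frames=20, warmup=5)
```

Timing the mask extraction / face parsing and the generator separately with the same helper should show which part is eating the frame budget.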
Is gbase supposed to have ModNet baked in so that it's always extracting the masks?
Did emo add the face parsing? Could it be slowing things down a lot?
UPDATE: I did find ModNet in the paper:
https://github.com/johndpope/MegaPortrait-hack/issues/59
Was there ever any MegaPortraits FPS benchmarking? I thought it could do inference in real time, or maybe that's just VASA.
Do these have alpha channels?