frankier / skelshop

📺 📰 🧑‍💼 Toolkit for skeleton & face analysis of talking heads (e.g. news) videos 🧑‍💼 📰 📺
https://frankier.github.io/skelshop
MIT License
5 stars 1 forks source link

ffprobe-wrapper, testing? #27

Closed cstenkamp closed 3 years ago

cstenkamp commented 3 years ago

I want to introduce some tests, but I am completely unsure how to. First of all, I want to simply run the code on a very short video to see if everything simply runs without errors. For that, I have (so far) hosted the respective video on my website at http://cstenkamp.de/test_data/. I assume that is awful technique, but I don't know how to do that better - any ideas?

Other things

frankier commented 3 years ago

Could you separate these changes into different PRs to make them easier to review? Some bits are a fairly easy merge (e.g. tqdm - frame count for image dirs) whereas others are a bit more difficult and require a bit more discussion to get right, e.g. tests.

Can you keep a clean commit history with clear commit messages, please?

Did you figure out when ffprobe is needed yet? Do you know when we end up with an OpenCV not linked against FFMPEG? We should probably protect against this/insist upon linking against FFMPEG since it could cause other problems. If there is no problem with OpenCV when it is linked against ffmpeg, I think we should ditch ffprobe since we are pulling in more and more things to deal with video. I am actually in the process of pulling in another video library https://github.com/dmlc/decord (the reason is that hopefully we can avoid roundtripping video to the CPU most of the time) so it would be good to try and rationalize this around a minimal set of libraries to do the things we need. (i.e. we might eventually not need OpenCV.)

Those recommended threshold values are useful: (the link has just moved to https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/demo_not_quick_start.md ). There is some stuff for thresholding skeletons based on confidence values. e.g. https://github.com/frankier/skelshop/blob/8f6b972f2e5be647d7be59b15a45d51b137785d2/skelshop/face/pipe.py#L104 In fact the same thing is done in different ways all around the toolkit. Since this is something we are always wanting to do would it be reasonable to make this some kind of general purpose utility which is specifiable/usable uniformly across the whole toolkit? Do we really want to just threshold individual keypoints and not whole regions? How does that fit in with our overall use case (we might have different needs from OpenPose)?

I don't think it's a good idea to introduce an extra network dependency into tests, no. Are you sure we need tests yet though? Tests aren't universally good, and can introduce maintenance burden. They should be targeted toward protecting fragile/easy to break/really important/infrequently run code paths. An ex-colleague of mine recently wrote a blog post about this, which I generally subscribe to. Also end-to-end tests can be really fragile. Unit tests of really tricky code with lots of odd interacting parameters are example of a high value situation for tests (payoff versus effort).

Well I see that this is just a smoke test essentially which could be a nice thing. In that case I recommend we get some public domain footage (i.e. not Ellen DeGeneres) and take about < 30 frames and maybe make it around a shot change so that we test that too. Hopefully we can make it so all the assets are < 100kb and then we can just commit it in a fixtures directory. From my perspective I am happy to include this, but I do think we might benefit more from one of the "help wanted" tasks.

In general the playsticks command is mainly just for debugging, at least at the moment, so even though it's fun to watch the stick videos, we should try and avoid getting drawn into working too much on it. (I know myself that interactive/quick feedback programming is more fun but let's just build it up as much as we need to support other things.)

By the way, if you have Peter's Ellen zips you can just convert them, you don't need to re-run skeleton dumping. See: https://frankier.github.io/skelshop/cli/#conv . They are in the single-zip format.

frankier commented 3 years ago

Click includes a progress bar function: https://click.palletsprojects.com/en/7.x/api/#click.progressbar -- do we need to pull in tqdm? We can do if it's better somehow -- otherwise let's stick with click.

cstenkamp commented 3 years ago

Hm, yes I will re-do everything I did here. Actually I just wanted to dump what I added while i was (still) trying to understand your code, but you are right, this is many different things at once and not verbose and all.

I'll answer to your other things separately, but first of all regarding tests:

I totally agree with the post, but the gist of the post is definitely "make as few tests as necessary" - and I would disagree with a statement like "rather no tests than too many tests". Two things that got into my mind while reading:

So yes, maybe not hundreds of tests, but I think a few tests are immensely helpful.

And yes, what I wanted to have was simply along the lines of a smoke-detector, simply a sanity-check to know that I haven't destroyed anything when changing stuff, something that can definitely not be replaced with linters or the like.