Open · Jean-Baptiste-Lasselle opened this issue 2 years ago
I will try to propose a Dockerfile in a PR as soon as possible. @scherroman, do you have any plans to bring up an OCI container image definition?
Hey @Jean-Baptiste-Lasselle, I love the idea of a standard Dockerfile for the project if it makes running mugen simpler and more flexible, hadn't thought of this! I'm no stranger to Dockerfiles, so I'll play around with this next week. I'd be happy to review one if you propose it as well.
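Off the top of my head, something like the sketch below is roughly what I'd imagine. To be clear, it's just a starting point and makes a few assumptions: that ffmpeg and tesseract are the only system dependencies needed, that the package installs cleanly with pip from the repository root, and that `mugen` is the CLI entrypoint. A conda-based image might end up being the closer match to the current setup.

```dockerfile
# Sketch only: the system dependencies and entrypoint name are assumptions,
# not a final recipe.
FROM python:3.9-slim

# System dependencies for video processing and text detection
RUN apt-get update && apt-get install -y --no-install-recommends \
        ffmpeg \
        tesseract-ocr \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .

# Entrypoint name is a guess at the console script; adjust to the actual CLI
ENTRYPOINT ["mugen"]
```

Usage would then be something like `docker build -t mugen .` followed by `docker run --rm -v "$PWD:/data" mugen ...`, mounting your sources and output directory into the container.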
I've been thinking a lot about how to speed up the creation process. Right now it's a kind of run-it-and-forget-it-for-an-hour-or-two type thing, but it doesn't necessarily have to be that way. In particular, the analysis of each randomly selected clip for scene cuts and text is quite time-consuming and inefficient, and a fair number of perfectly good clips are thrown out as false positives. Switching from tesserocr to pytesseract likely slowed mugen down slightly, as text detection now involves disk writes/reads, but tesserocr was not working properly cross-platform, while pytesseract does and is better maintained.
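For context, the disk round trip comes from pytesseract writing each image to a temporary file and shelling out to the tesseract binary, rather than talking to the library in memory the way tesserocr does. The per-frame check involved looks roughly like this (an illustration only, not mugen's actual code; the `min_characters` threshold is made up):

```python
# Illustration only, not mugen's actual implementation.
from PIL import Image
import pytesseract


def frame_has_text(frame, min_characters=3):
    """Return True if tesseract finds more than a trivial amount of text.

    `frame` is an HxWx3 numpy array, e.g. from moviepy's clip.get_frame().
    pytesseract saves the image to a temporary file and invokes the
    tesseract binary, which is where the extra disk I/O comes from.
    """
    image = Image.fromarray(frame)
    text = pytesseract.image_to_string(image)
    return len(text.strip()) >= min_characters
```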
This past week I spent some time writing tests for the detection functions, tweaking them to get the best results, and researching/testing alternatives. For scene cut detection, TransNetV2 was quite impressive but slower, had some weaknesses with cuts at the very beginning or end of short clips, and its installation wasn't the smoothest. A Dockerfile could potentially help with more complex setup steps like that, so that's something to think about (they actually provide a way to run the program via Docker). PySceneDetect was not great. I also found that a large number of false positives can be eliminated by combining moviepy's scene detection (what we currently use) with ffprobe's libav scene detection at a low threshold, which is very fast but inaccurate on its own. Of course that means running a second scene detection function whenever a clip with a cut is detected, which again slows down the process slightly.
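Roughly what I mean by that cross-check, as a sketch rather than what's in mugen today (the exact ffprobe JSON field names vary between ffmpeg versions, and paths with special characters would need escaping):

```python
# Sketch of the cross-check idea: when moviepy flags a clip as containing a
# cut, confirm it with ffprobe's libav scene score at a low threshold before
# throwing the clip away.
import json
import subprocess


def ffprobe_scene_changes(video_path, start, end, threshold=0.1):
    """Return timestamps of frames whose lavfi scene score exceeds `threshold`."""
    filtergraph = (
        f"movie={video_path},"
        f"trim=start={start}:end={end},"
        f"select=gt(scene\\,{threshold})"
    )
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_frames", "-f", "lavfi", filtergraph],
        capture_output=True, text=True, check=True,
    )
    frames = json.loads(result.stdout).get("frames", [])
    timestamps = []
    for frame in frames:
        # Newer ffmpeg versions report "pts_time", older ones "pkt_pts_time"
        time = frame.get("pts_time") or frame.get("pkt_pts_time")
        if time is not None:
            timestamps.append(float(time))
    return timestamps


# Only discard a clip if the fast libav check agrees a cut is there, e.g.:
# if moviepy_detects_cut(clip) and ffprobe_scene_changes(path, clip.start, clip.end):
#     discard(clip)
```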
So there's a balance here: I want to ensure mugen is fast, but first and foremost I want to ensure that it's easy to install and use, maintainable, working as expected, and not throwing out good scenes. In the immediate term, to speed things up I'll be looking to:
Enable performing any necessary analysis once per video, up front, which will take some time initially but make creating music videos from the same sources blazingly fast afterwards. There has also been the suggestion in #27 of allowing a GPU or multiple cores to be used for the selection and analysis, something I'll be keeping in mind in relation to this.
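To give a sense of what I mean, the idea is basically to cache the expensive per-video analysis on disk, keyed on the video file, so repeat runs skip straight to selection. A rough sketch (the function and cache names here are made up, not an actual mugen API):

```python
# Hypothetical sketch of up-front, cached analysis; analyze_video and the
# cache layout are illustrative names, not mugen's API.
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path.home() / ".mugen_cache"


def video_cache_key(video_path):
    """Key the cache on path, size, and mtime so edited files are re-analyzed."""
    stat = video_path.stat()
    raw = f"{video_path.resolve()}:{stat.st_size}:{stat.st_mtime_ns}"
    return hashlib.sha256(raw.encode()).hexdigest()


def cached_analysis(video_path, analyze_video):
    """Run the expensive analysis once per video and reuse the result afterwards."""
    CACHE_DIR.mkdir(exist_ok=True)
    cache_file = CACHE_DIR / f"{video_cache_key(video_path)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    analysis = analyze_video(video_path)  # e.g. scene cuts, detected text spans
    cache_file.write_text(json.dumps(analysis))
    return analysis
```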
Provide an easy way for users to manually specify exclusion zones for their videos and groups of videos to exclude opening/ending sequences and credits. This would take some extra manual effort up front, but it would improve results and speed up creation overall by letting us remove the need for a text detection filter by default. Tesseract's text detection works decently, but there are too many distorted or hand-drawn credit sequences that aren't detected, and too many perfectly good scenes where it falsely detects text, causing us to throw them out. Nor does it help in excluding credit-less opening and ending sequences in series and movies. I've thought about training my own credits detection model to help with this, but that would be a little too far down the rabbit hole for my liking at this point in time.
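As for what specifying exclusion zones could look like, something as simple as per-video time ranges that clip selection skips entirely might be enough. A strawman, not a settled design:

```python
# Strawman only: the format and names are illustrative, not a settled design.
# Time ranges are (start, end) in seconds to exclude from clip selection.
EXCLUSION_ZONES = {
    "episode_01.mkv": [(0, 90), (1320, 1410)],  # opening and ending sequences
    "episode_02.mkv": [(0, 90), (1325, 1415)],
}


def overlaps_exclusion_zone(video_name, start, end):
    """Reject any candidate clip that overlaps an excluded time range."""
    for zone_start, zone_end in EXCLUSION_ZONES.get(video_name, []):
        if start < zone_end and end > zone_start:
            return True
    return False
```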
Description
I would like to run mugen in a container.
Rationale
Because it would be so much faster and simpler to run, and because it could open up thinking about how to scale the service up (e.g. scale a Kubernetes deployment up to 20 pods, have each of the 20 pods process 2 seconds of the videos, and in the end put everything back together and return it to the request issuer).
Alternatives
Running conda in a virtual machine.
Additional context