ManimCommunity / manim-voiceover

Manim plugin for all things voiceover
https://voiceover.manim.community/en/stable
MIT License

Manim crashes when using voiceover with option -s (==save_last_frame) #68

Closed mkuehne-git closed 1 year ago

mkuehne-git commented 1 year ago

Description of bug / unexpected behavior

I have a scene file, manim-voiceover-issue.py, that uses voiceover. When I render it with the -s option (save_last_frame), an exception is thrown (see the Logs section).

Root Cause

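The configuration item config.output_file is not yet set (it is still None) when write_subcaption_file is invoked, so Path(config.output_file) raises the TypeError shown in the Logs section. With -s, config.output_file is only set at the very end of the processing chain. If you render a plain Scene instead of a VoiceoverScene, self.subcaptions is falsy (no subcaptions were added), so write_subcaption_file is never called and the issue does not occur.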
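
The failure can be reproduced without Manim. The following is a minimal sketch, assuming only what the traceback shows: the subcaption path is built with Path(config.output_file).with_suffix(".srt"). The function names subcaption_path and safe_subcaption_path are hypothetical and are not part of Manim; the guarded variant is just one possible way to avoid the crash, not the fix that was merged.

```python
from pathlib import Path
from typing import Optional


def subcaption_path(output_file: Optional[str]) -> Path:
    """Mirror the failing expression: Path(config.output_file).with_suffix(".srt")."""
    return Path(output_file).with_suffix(".srt")


def safe_subcaption_path(output_file: Optional[str]) -> Optional[Path]:
    """Hypothetical guard: return None instead of raising when no output file is set."""
    if output_file is None:
        return None
    return Path(output_file).with_suffix(".srt")


# With -s, the output file is still None at this point, so Path() raises TypeError.
try:
    subcaption_path(None)
except TypeError as exc:
    print(f"TypeError: {exc}")

# In a normal video render the output file is already set, so the .srt path
# can be derived from it.
print(subcaption_path("media/videos/VoiceScene.mp4"))

# The guarded variant simply skips the subcaption file when nothing is set.
print(safe_subcaption_path(None))
```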

BTW: I am wondering how config.output_file can safely be configured by the user, since it is simply overwritten by the code.

Solution Proposal

I have changed the VoiceoverScene as follows, and this change seems to fix the issue for me.

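from typing import Optional

from manim import Scene, config
from manim_voiceover.services.base import SpeechService
from manim_voiceover.tracker import VoiceoverTracker


class VoiceoverScene(Scene):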
    """A scene class that can be used to add voiceover to a scene."""

    speech_service: SpeechService
    current_tracker: Optional[VoiceoverTracker]
    create_subcaption: bool
    create_script: bool

    def set_speech_service(
        self,
        speech_service: SpeechService,
        create_subcaption: bool = True,
    ) -> None:
        """Sets the speech service to be used for the voiceover. This method
        should be called before adding any voiceover to the scene.

        Args:
            speech_service (SpeechService): The speech service to be used.
            create_subcaption (bool, optional): Whether to create subcaptions for the
                scene. Defaults to True. If `config.save_last_frame` is True, the
                argument is ignored and no subcaptions will be created.
        """
        self.speech_service = speech_service
        self.current_tracker = None
        if config.save_last_frame:
            self.create_subcaption = False
        else:
            self.create_subcaption = create_subcaption
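
The decision made in set_speech_service above can be checked in isolation. This is a minimal sketch under stated assumptions: resolve_create_subcaption is a hypothetical helper that does not exist in manim-voiceover, and the SimpleNamespace objects are stand-ins for Manim's global config that carry only the save_last_frame attribute.

```python
from types import SimpleNamespace


def resolve_create_subcaption(config, requested: bool = True) -> bool:
    """Return False whenever only the last frame is saved, else the caller's choice."""
    if config.save_last_frame:
        return False
    return requested


# Stand-ins for manim's config in the two render modes (assumption: only
# the save_last_frame attribute matters for this decision).
video_render = SimpleNamespace(save_last_frame=False)
last_frame_render = SimpleNamespace(save_last_frame=True)

print(resolve_create_subcaption(video_render))                   # caller's default is kept
print(resolve_create_subcaption(video_render, requested=False))  # caller opted out
print(resolve_create_subcaption(last_frame_render))              # -s overrides the request
```

Keeping the override inside set_speech_service means scene authors do not have to change their own code when they render with -s.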

Expected behavior

No exception should be thrown, and the final image should be rendered.

It is even questionable if the mp3 should be created.

(.venv-3.9) [mk@archlinux media]$ tree
.
|-- images
|   `-- manim-voiceover-issue
`-- voiceovers
    |-- cache.json
    `-- this-circle-is-drawn-as-i-speak-73a82f70.mp3

4 directories, 2 files

How to reproduce the issue

Run the following scene with manim -pql manim-voiceover-issue.py VoiceScene -s -v DEBUG.

Code for reproducing the problem

```py
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.gtts import GTTSService

config.disable_caching = True


class VoiceScene(VoiceoverScene):
    def construct(self):
        self.set_speech_service(GTTSService())
        with self.voiceover(text="This circle is drawn as I speak."):
            self.play(Create(Circle()))
        self.wait()


class StandardScene(Scene):
    def construct(self):
        config.save_last_frame = True
        self.play(Create(Circle()))
        self.wait()
```

Additional media files

Images/GIFs

Logs

Terminal output

Render with `-s`, exception occurs:

```
(.venv-3.9) [mk@archlinux gist]$ manim -pql manim-voiceover-issue.py VoiceScene -s -v DEBUG
Manim Community v0.17.3

[09/04/23 15:43:41] DEBUG    Skipping animation 0    cairo_renderer.py:63
                    DEBUG    List of the first few animation hashes of the scene: [None]    cairo_renderer.py:87
                    DEBUG    Animation with empty mobject    animation.py:174
                    DEBUG    Skipping animation 1    cairo_renderer.py:63
                    DEBUG    List of the first few animation hashes of the scene: [None, None]    cairo_renderer.py:87
                    DEBUG    Animation with empty mobject    animation.py:174
                    DEBUG    Skipping animation 2    cairo_renderer.py:63
                    DEBUG    List of the first few animation hashes of the scene: [None, None, None]    cairo_renderer.py:87
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/mk/dev/manim/.venv-3.9/lib/python3.9/site-packages/manim/cli/render/commands.py:115 in render
│
│   112             try:
│   113                 with tempconfig({}):
│   114                     scene = SceneClass()
│ ❱ 115                     scene.render()
│   116             except Exception:
│   117                 error_console.print_exception()
│   118                 sys.exit(1)
│
│ /home/mk/dev/manim/.venv-3.9/lib/python3.9/site-packages/manim/scene/scene.py:233 in render
│
│   230             return True
│   231         self.tear_down()
│   232         # We have to reset these settings in case of multiple renders.
│ ❱ 233         self.renderer.scene_finished(self)
│   234
│   235         # Show info only if animations are rendered or to get image
│   236         if (
│
│ /home/mk/dev/manim/.venv-3.9/lib/python3.9/site-packages/manim/renderer/cairo_renderer.py:259 in scene_finished
│
│   256     def scene_finished(self, scene):
│   257         # If no animations in scene, render an image instead
│   258         if self.num_plays:
│ ❱ 259             self.file_writer.finish()
│   260         elif config.write_to_movie:
│   261             config.save_last_frame = True
│   262             config.write_to_movie = False
│
│ /home/mk/dev/manim/.venv-3.9/lib/python3.9/site-packages/manim/scene/scene_file_writer.py:468 in finish
│
│   465             target_dir = self.image_file_path.parent / self.image_file_path.stem
│   466             logger.info("\n%i images ready at %s\n", self.frame_count, str(target_dir))
│   467         if self.subcaptions:
│ ❱ 468             self.write_subcaption_file()
│   469
│   470     def open_movie_pipe(self, file_path=None):
│   471         """
│
│ /home/mk/dev/manim/.venv-3.9/lib/python3.9/site-packages/manim/scene/scene_file_writer.py:729 in write_subcaption_file
│
│   726
│   727     def write_subcaption_file(self):
│   728         """Writes the subcaption file."""
│ ❱ 729         subcaption_file = Path(config.output_file).with_suffix(".srt")
│   730         subcaption_file.write_text(srt.compose(self.subcaptions), encoding="utf-8")
│   731         logger.info(f"Subcaption file has been written as {subcaption_file}")
│   732
│
│ /usr/lib/python3.9/pathlib.py:1082 in __new__
│
│   1079     def __new__(cls, *args, **kwargs):
│   1080         if cls is Path:
│   1081             cls = WindowsPath if os.name == 'nt' else PosixPath
│ ❱ 1082         self = cls._from_parts(args, init=False)
│   1083         if not self._flavour.is_supported:
│   1084             raise NotImplementedError("cannot instantiate %r on your system"
│   1085                                       % (cls.__name__,))
│
│ /usr/lib/python3.9/pathlib.py:707 in _from_parts
│
│   704         # We need to call _parse_args on the instance, so as to get the
│   705         # right flavour.
│   706         self = object.__new__(cls)
│ ❱ 707         drv, root, parts = self._parse_args(args)
│   708         self._drv = drv
│   709         self._root = root
│   710         self._parts = parts
│
│ /usr/lib/python3.9/pathlib.py:691 in _parse_args
│
│   688             if isinstance(a, PurePath):
│   689                 parts += a._parts
│   690             else:
│ ❱ 691                 a = os.fspath(a)
│   692                 if isinstance(a, str):
│   693                     # Force-cast str subclasses to str (issue #21127)
│   694                     parts.append(str(a))
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: expected str, bytes or os.PathLike object, not NoneType
(.venv-3.9) [mk@archlinux gist]$
```

Render without `-s`, no exception, as expected:

```
(.venv-3.9) [mk@archlinux gist]$ manim -pql manim-voiceover-issue.py VoiceScene -v DEBUG
Manim Community v0.17.3

[09/04/23 15:46:26] INFO     Log file will be saved in /home/mk/dev/manim/ecc/gist/media/logs/manim-voiceover-issue_VoiceScene.log    logger_utils.py:170
                    INFO     Caching disabled.    cairo_renderer.py:68
                    DEBUG    List of the first few animation hashes of the scene: ['uncached_00000']    cairo_renderer.py:87
[09/04/23 15:46:27] INFO     Animation 0 : Partial movie file written in '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00000.mp4'    scene_file_writer.py:527
                    DEBUG    Animation with empty mobject    animation.py:174
                    INFO     Caching disabled.    cairo_renderer.py:68
                    DEBUG    List of the first few animation hashes of the scene: ['uncached_00000', 'uncached_00001']    cairo_renderer.py:87
                    INFO     Animation 1 : Partial movie file written in '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00001.mp4'    scene_file_writer.py:527
                    DEBUG    Animation with empty mobject    animation.py:174
                    INFO     Caching disabled.    cairo_renderer.py:68
                    DEBUG    List of the first few animation hashes of the scene: ['uncached_00000', 'uncached_00001', 'uncached_00002']    cairo_renderer.py:87
                    INFO     Animation 2 : Partial movie file written in '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00002.mp4'    scene_file_writer.py:527
                    INFO     Combining to Movie file.    scene_file_writer.py:617
                    DEBUG    Partial movie files to combine (3 files):    scene_file_writer.py:561
                             ['/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00000.mp4',
                             '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00001.mp4',
                             '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/partial_movie_files/VoiceScene/uncached_00002.mp4']
                    DEBUG    Setting config.output_file: '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/VoiceScene.mp4'    utils.py:336
                    INFO     File ready at '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/VoiceScene.mp4'    scene_file_writer.py:736
                    INFO     Subcaption file has been written as /home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/VoiceScene.srt    scene_file_writer.py:731
                    INFO     Rendered VoiceScene    scene.py:241
                             Played 3 animations
                    INFO     Previewed File at: '/home/mk/dev/manim/ecc/gist/media/videos/manim-voiceover-issue/480p15/VoiceScene.mp4'    file_ops.py:227
kf.service.services: KApplicationTrader: mimeType "x-scheme-handler/file" not found
(.venv-3.9) [mk@archlinux gist]$ VLC media player 3.0.18 Vetinari (revision 3.0.13-8-g41878ff4f2)
```

System specifications

System Details

- OS: Arch Linux, x86_64 Linux 6.4.12-arch1-1
- RAM: 32GB
- Python version: Python 3.9.18
- Installed modules (provide output from `pip list`):

```
Package Version
------------------------------ ----------
absl-py 1.4.0
accelerate 0.22.0
aiohttp 3.8.5
aiosignal 1.3.1
anyascii 0.3.2
appdirs 1.4.4
async-timeout 4.0.3
attrs 23.1.0
audioread 3.0.0
azure-cognitiveservices-speech 1.31.0
Babel 2.12.1
bangla 0.0.2
blinker 1.6.2
bnnumerizer 0.0.2
bnunicodenormalizer 0.1.1
boltons 23.0.0
cachetools 5.3.1
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 3.2.0
clean-fid 0.1.35
click 8.1.7
click-default-group 1.2.4
clip-anytorch 2.5.2
cloup 0.13.1
cmake 3.27.2
colour 0.1.5
contourpy 1.1.0
coqpit 0.0.17
cycler 0.11.0
Cython 0.29.30
dateparser 1.1.8
decorator 5.1.1
deepl 1.15.0
docker-pycreds 0.4.0
docopt 0.6.2
einops 0.6.1
encodec 0.1.1
evdev 1.6.1
ffmpeg-python 0.2.0
filelock 3.12.3
Flask 2.3.3
fonttools 4.42.1
frozenlist 1.4.0
fsspec 2023.9.0
ftfy 6.1.1
future 0.18.3
g2pkk 0.1.2
gitdb 4.0.10
GitPython 3.1.34
glcontext 2.4.0
google-auth 2.22.0
google-auth-oauthlib 1.0.0
grpcio 1.57.0
gruut 2.2.3
gruut-ipa 0.13.0
gruut-lang-de 2.0.0
gruut-lang-en 2.0.0
gruut-lang-es 2.0.0
gruut-lang-fr 2.0.2
gTTS 2.3.2
huggingface-hub 0.16.4
idna 3.4
imageio 2.31.3
importlib-metadata 6.8.0
importlib-resources 6.0.1
inflect 5.6.0
isosurfaces 0.1.0
itsdangerous 2.1.2
jamo 0.4.1
jieba 0.42.1
Jinja2 3.1.2
joblib 1.3.2
jsonlines 1.2.0
jsonmerge 1.9.2
jsonschema 4.19.0
jsonschema-specifications 2023.7.1
k-diffusion 0.0.16
kiwisolver 1.4.5
kornia 0.7.0
lazy_loader 0.3
librosa 0.10.0
lit 16.0.6
llvmlite 0.40.1
manim 0.17.3
manim-voiceover 0.3.4
ManimPango 0.4.3
mapbox-earcut 1.0.1
Markdown 3.4.4
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.2
mdurl 0.1.2
moderngl 5.8.2
moderngl-window 2.4.4
more-itertools 10.1.0
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multipledispatch 1.0.0
mutagen 1.47.0
networkx 2.8.8
nltk 3.8.1
num2words 0.5.12
numba 0.57.0
numpy 1.22.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
oauthlib 3.2.2
openai-whisper 20230314
packaging 23.1
pandas 2.0.3
pathtools 0.1.2
Pillow 9.5.0
pip 23.2.1
platformdirs 3.10.0
pooch 1.7.0
protobuf 4.24.2
psutil 5.9.5
pyasn1 0.5.0
pyasn1-modules 0.3.0
PyAudio 0.2.13
pycairo 1.24.0
pycparser 2.21
pydub 0.25.1
pyglet 2.0.9
Pygments 2.16.1
pynndescent 0.5.10
pynput 1.7.6
pyparsing 3.0.9
pypinyin 0.49.0
pyrr 0.10.3
pysbd 0.3.4
python-crfsuite 0.9.9
python-dateutil 2.8.2
python-dotenv 0.21.1
python-slugify 8.0.1
python-xlib 0.33
pyttsx3 2.90
pytz 2023.3
PyWavelets 1.4.1
PyYAML 6.0.1
referencing 0.30.2
regex 2023.8.8
requests 2.31.0
requests-oauthlib 1.3.1
resize-right 0.0.2
rich 13.5.2
rpds-py 0.10.0
rsa 4.9
safetensors 0.3.3
scikit-image 0.21.0
scikit-learn 1.3.0
scipy 1.11.2
screeninfo 0.8.1
sentry-sdk 1.30.0
setproctitle 1.3.2
setuptools 59.8.0
six 1.16.0
skia-pathops 0.7.4
smmap 5.0.0
soundfile 0.12.1
sox 1.4.1
soxr 0.3.6
srt 3.5.3
stable-ts 2.9.0
svgelements 1.9.6
sympy 1.12
tensorboard 2.14.0
tensorboard-data-server 0.7.1
text-unidecode 1.3
threadpoolctl 3.2.0
tifffile 2023.8.30
tiktoken 0.3.1
tokenizers 0.13.3
torch 2.0.1
torchaudio 2.0.2
torchdiffeq 0.2.3
torchsde 0.2.5
torchvision 0.15.2
tqdm 4.66.1
trainer 0.0.31
trampoline 0.1.2
transformers 4.32.1
triton 2.0.0
TTS 0.16.5
typing_extensions 4.7.1
tzdata 2023.3
tzlocal 5.0.1
umap-learn 0.5.1
urllib3 1.26.16
wandb 0.15.9
watchdog 2.3.1
wcwidth 0.2.6
Werkzeug 2.3.7
wheel 0.41.0
yarl 1.9.2
zipp 3.16.2
```

LaTeX details

+ LaTeX distribution (e.g. TeX Live 2020):
+ Installed LaTeX packages:

FFMPEG

Output of `ffmpeg -version`:

```
built with gcc 13.1.1 (GCC) 20230429
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-version3 --enable-vulkan
libavutil      58.  2.100 / 58.  2.100
libavcodec     60.  3.100 / 60.  3.100
libavformat    60.  3.100 / 60.  3.100
libavdevice    60.  1.100 / 60.  1.100
libavfilter     9.  3.100 /  9.  3.100
libswscale      7.  1.100 /  7.  1.100
libswresample   4. 10.100 /  4. 10.100
libpostproc    57.  1.100 / 57.  1.100
```

Additional comments

osolmaz commented 1 year ago

I merged #69.

> BTW: I am wondering how config.output_file can safely be configured by the user, since it is simply overwritten by the code.

When would it need to be configured? I'm trying to understand the use case here.