ManimCommunity / manim-voiceover

Manim plugin for all things voiceover
https://voiceover.manim.community/en/stable
MIT License
171 stars, 23 forks

I can't see "start stream" after pressing the 'r' key and can't record my voice #81

Open semikernel opened 8 months ago

semikernel commented 8 months ago

Description of bug / unexpected behavior

I am trying to use the RecorderService from manim-voiceover on a Huawei computer running Ubuntu 22.04. After installing it, I tested it with the example code from the tutorial, but recording did not work.

Expected behavior

I watched the demonstration video and expected to get the same output, but I did not. The command I ran: `manim -pql recording.py --disable_caching`

My output looks like:

Manim Community v0.18.0

/home/semikernel/anaconda3/envs/manim/lib/python3.11/site-packages/whisper/timing.py:57: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib confmisc.c:160:(snd_config_get_card) Invalid field card
ALSA lib pcm_usb_stream.c:482:(_snd_pcm_usb_stream_open) Invalid card 'card'
ALSA lib confmisc.c:160:(snd_config_get_card) Invalid field card
ALSA lib pcm_usb_stream.c:482:(_snd_pcm_usb_stream_open) Invalid card 'card'
-------------------------device list-------------------------
Input Device id  0  -  sof-hda-dsp: - (hw:0,0)
Input Device id  4  -  sof-hda-dsp: - (hw:0,6)
Input Device id  5  -  sof-hda-dsp: - (hw:0,7)
Input Device id  6  -  sysdefault
Input Device id  7  -  samplerate
Input Device id  8  -  speexrate
Input Device id  9  -  pulse
Input Device id  10  -  upmix
Input Device id  11  -  vdownmix
Input Device id  13  -  default
-------------------------------------------------------------
Please select an input device id to record from:
5
Selected device: sof-hda-dsp: - (hw:0,7)
╔══════════════════════════════════╗
║ Voiceover:                       ║
║                                  ║
║ This circle is drawn as I speak. ║
╚══════════════════════════════════╝
Press and hold the 'r' key to begin recording
Wait for 1 second, then start speaking.
Wait for at least 1 second after you finish speaking.
This is to eliminate any sounds that may come from your keyboard.
The silence at the beginning and end will be trimmed automatically.
You can adjust this setting using the `trim_silence_threshold` argument.
These instructions are only shown once.
Release the 'r' key to end recording
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr(I kept pressing 'r')
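
The device list printed above follows a fixed `Input Device id  N  -  name` pattern. When comparing devices, for instance to retry with `pulse` or `default` instead of a raw `hw:` device (only a guess at something worth trying, not a confirmed fix), a small parser for those lines can help. This is a minimal sketch: `parse_input_devices` is a hypothetical helper, not part of manim-voiceover, and the sample text is copied from the log above.

```python
import re

# Matches lines such as "Input Device id  5  -  sof-hda-dsp: - (hw:0,7)"
DEVICE_LINE = re.compile(r"^Input Device id\s+(\d+)\s+-\s+(.+?)\s*$")


def parse_input_devices(output: str) -> dict[int, str]:
    """Map device id -> device name for every device line in the output."""
    devices = {}
    for line in output.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            devices[int(match.group(1))] = match.group(2)
    return devices


# A few lines copied from the device list in the terminal output.
SAMPLE = """\
Input Device id  0  -  sof-hda-dsp: - (hw:0,0)
Input Device id  5  -  sof-hda-dsp: - (hw:0,7)
Input Device id  9  -  pulse
Input Device id  13  -  default
"""

devices = parse_input_devices(SAMPLE)

# Assumption: these names usually go through the sound server
# rather than addressing an ALSA hardware device directly.
software_devices = {i: n for i, n in devices.items() if n in ("pulse", "default")}
print(software_devices)
```

Selecting one of the ids in `software_devices` at the "Please select an input device id" prompt would be one way to check whether the choice of device matters here.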

How to reproduce the issue

my testing code:

Code for reproducing the problem

```py
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.recorder import RecorderService


# Simply inherit from VoiceoverScene instead of Scene to get all the
# voiceover functionality.
class RecorderExample(VoiceoverScene):
    def construct(self):
        # You can choose from a multitude of TTS services,
        # or in this example, record your own voice:
        self.set_speech_service(RecorderService())

        circle = Circle()

        # Surround animation sections with with-statements:
        with self.voiceover(text="This circle is drawn as I speak.") as tracker:
            self.play(Create(circle), run_time=tracker.duration)
            # The duration of the animation is received from the audio file
            # and passed to the tracker automatically.

        # This part will not start playing until the previous voiceover is finished.
        with self.voiceover(text="Let's shift it to the left 2 units.") as tracker:
            self.play(circle.animate.shift(2 * LEFT), run_time=tracker.duration)
```

System specifications

System Details

- OS: Ubuntu 22.04.3 LTS
- RAM: 16GB
- Python version: Python 3.11.7
- Installed modules (provide output from `pip list`):

```
Package                        Version
------------------------------ -----------
azure-cognitiveservices-speech 1.34.1
Brotli                         1.1.0
build                          1.0.3
CacheControl                   0.13.1
certifi                        2023.11.17
cffi                           1.16.0
charset-normalizer             3.3.2
cleo                           2.1.0
click                          8.1.7
click-default-group            1.2.4
cloup                          2.1.2
cmake                          3.28.1
colorama                       0.4.6
crashtest                      0.4.1
cryptography                   42.0.1
decorator                      5.1.1
deepl                          1.16.1
distlib                        0.3.8
dulwich                        0.21.7
evdev                          1.6.1
fastjsonschema                 2.19.1
ffmpeg-python                  0.2.0
filelock                       3.13.1
fsspec                         2023.12.2
future                         0.18.3
glcontext                      2.5.0
gTTS                           2.5.0
huggingface-hub                0.20.3
idna                           3.6
importlib-metadata             7.0.1
installer                      0.7.0
isosurfaces                    0.1.0
jaraco.classes                 3.3.0
jeepney                        0.8.0
Jinja2                         3.1.3
keyring                        24.3.0
lit                            17.0.6
llvmlite                       0.41.1
manim                          0.18.0
manim-voiceover                0.3.4.post1
ManimPango                     0.5.0
mapbox-earcut                  1.0.1
markdown-it-py                 3.0.0
MarkupSafe                     2.1.4
mdurl                          0.1.2
moderngl                       5.9.0
moderngl-window                2.4.1
more-itertools                 10.2.0
mpmath                         1.3.0
msgpack                        1.0.7
multipledispatch               0.6.0
mutagen                        1.47.0
networkx                       3.2.1
numba                          0.58.1
numpy                          1.26.3
nvidia-cublas-cu11             11.10.3.66
nvidia-cuda-cupti-cu11         11.7.101
nvidia-cuda-nvrtc-cu11         11.7.99
nvidia-cuda-runtime-cu11       11.7.99
nvidia-cudnn-cu11              8.5.0.96
nvidia-cufft-cu11              10.9.0.58
nvidia-curand-cu11             10.2.10.91
nvidia-cusolver-cu11           11.4.0.1
nvidia-cusparse-cu11           11.7.4.91
nvidia-nccl-cu11               2.14.3
nvidia-nvtx-cu11               11.7.91
openai-whisper                 20230314
packaging                      23.2
pexpect                        4.9.0
Pillow                         9.5.0
pip                            23.3.2
pkginfo                        1.9.6
platformdirs                   3.11.0
poetry                         1.7.1
poetry-core                    1.8.1
poetry-plugin-export           1.6.0
ptyprocess                     0.7.0
PyAudio                        0.2.14
pycairo                        1.25.1
pycparser                      2.21
pydub                          0.25.1
pyglet                         1.5.27
Pygments                       2.17.2
pynput                         1.7.6
pyproject_hooks                1.0.0
pyrr                           0.10.3
PySocks                        1.7.1
python-dotenv                  0.21.1
python-slugify                 8.0.2
python-xlib                    0.33
pyttsx3                        2.90
PyYAML                         6.0.1
rapidfuzz                      3.6.1
regex                          2023.12.25
requests                       2.31.0
requests-toolbelt              1.0.0
rich                           13.7.0
safetensors                    0.4.2
scipy                          1.12.0
screeninfo                     0.8.1
SecretStorage                  3.3.3
setuptools                     69.0.3
shellingham                    1.5.4
six                            1.16.0
skia-pathops                   0.8.0.post1
sox                            1.4.1
srt                            3.5.3
stable-ts                      2.11.1
svgelements                    1.9.6
sympy                          1.12
text-unidecode                 1.3
tiktoken                       0.3.1
tokenizers                     0.15.1
tomli                          2.0.1
tomlkit                        0.12.3
torch                          2.0.1
torchaudio                     2.0.2
tqdm                           4.66.1
transformers                   4.37.1
triton                         2.0.0
trove-classifiers              2024.1.8
typing_extensions              4.9.0
urllib3                        2.1.0
virtualenv                     20.25.0
watchdog                       2.3.1
wheel                          0.42.0
zipp                           3.17.0
```
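
One thing that may be worth ruling out, as an assumption rather than a confirmed cause: the package list includes pynput and python-xlib, and pynput's documentation notes that under Wayland it only receives input events from applications running through Xwayland. If the recorder relies on pynput to detect the 'r' key (not verified here), a Wayland session on Ubuntu 22.04 could explain why holding 'r' in the terminal has no effect. The sketch below only reports the session type using the standard library; `describe_session` is a hypothetical helper name.

```python
import os


def describe_session(env=None) -> str:
    """Return a short note on the display session found in the environment."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session == "wayland":
        return "wayland: an X11-based key listener may not see keys typed in native Wayland windows"
    if session == "x11":
        return "x11: an X11-based key listener should receive key events"
    if not env.get("DISPLAY"):
        return "no DISPLAY set: an X11-based key listener cannot connect to an X server"
    return "unknown session type; DISPLAY=" + env["DISPLAY"]


# Run in the same terminal that is used to launch manim.
print(describe_session())
```

If this reports `wayland`, logging in with an Xorg session and running the same command again would be a cheap way to test the hypothesis.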

Additional comments