kadirnar / whisper-plus

WhisperPlus: Faster, Smarter, and More Capable 🚀
Apache License 2.0
1.7k stars 137 forks

ImageMagick is not installed on your computer BUT it's installed. #91

Closed PiotrEsse closed 5 months ago

PiotrEsse commented 5 months ago

I am trying to run the autocaption example and ran into this error. ImageMagick is in fact installed.

(WhisperPlus38) (base) piotr@Legion7:~/WhisperPlus/Tutorial$ /home/piotr/anaconda3/envs/WhisperPlus38/bin/python /home/piotr/WhisperPlus/Tutorial/autocaption.py
2024-05-07 12:08:18,308 - WARNING - /home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")

2024-05-07 12:08:18,390 - INFO - Loading model...
2024-05-07 12:08:19,331 - INFO - Model loaded successfully.
2024-05-07 12:08:19,331 - INFO - Using device: cuda
ffmpeg version 6.1.1-3ubuntu5 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13 (Ubuntu 13.2.0-23ubuntu3)
  configuration: --prefix=/usr --extra-version=3ubuntu5 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --disable-omx --enable-gnutls --enable-libaom --enable-libass --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libharfbuzz --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-openal --enable-opencl --enable-opengl --disable-sndio --enable-libvpl --disable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-ladspa --enable-libbluray --enable-libjack --enable-libpulse --enable-librabbitmq --enable-librist --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libx264 --enable-libzmq --enable-libzvbi --enable-lv2 --enable-sdl2 --enable-libplacebo --enable-librav1e --enable-pocketsphinx --enable-librsvg --enable-libjxl --enable-shared
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/piotr/WhisperPlus/Tutorial/Wiktor.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.20.100
  Duration: 00:02:27.75, start: 0.000000, bitrate: 15237 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1072, 15036 kb/s, 24 fps, 24 tbr, 12288 tbn (default)
    Metadata:
      handler_name    : Core Media Video
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 196 kb/s (default)
    Metadata:
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:1 -> #0:0 (aac (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to '/home/piotr/WhisperPlus/Tutorial/Wiktor.mp3':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    TSSE            : Lavf60.16.100
  Stream #0:0(und): Audio: mp3, 44100 Hz, stereo, fltp (default)
    Metadata:
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.31.102 libmp3lame
[out#0/mp3 @ 0x5617a2ec8140] video:0kB audio:2309kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.015059%
size=    2309kB time=00:02:27.69 bitrate= 128.1kbits/s speed=58.8x    
2024-05-07 12:08:22,008 - INFO - Transcribing audio...
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.
Traceback (most recent call last):
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/moviepy/video/VideoClip.py", line 1137, in __init__
    subprocess_call(cmd, logger=None)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/moviepy/tools.py", line 54, in subprocess_call
    raise IOError(err.decode('utf8'))
OSError: convert-im6.q16: attempt to perform an operation not allowed by the security policy `@/tmp/tmp1bxytlbo.txt' @ error/property.c/InterpretImageProperties/3771.
convert-im6.q16: label expected `@/tmp/tmp1bxytlbo.txt' @ error/annotate.c/GetMultilineTypeMetrics/782.
convert-im6.q16: no images defined `PNG32:/tmp/tmp10k7g0kv.png' @ error/convert.c/ConvertImageCommand/3234.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/piotr/WhisperPlus/Tutorial/autocaption.py", line 4, in <module>
    caption(video_path="/home/piotr/WhisperPlus/Tutorial/Wiktor.mp4", output_path="output.mp4", language="spanish")
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/whisperplus/pipelines/whisper_autocaption.py", line 88, in __call__
    return self.add_subtitles_to_video(video_path, result['chunks'], output_path)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/whisperplus/pipelines/whisper_autocaption.py", line 59, in add_subtitles_to_video
    txt_clip = TextClip(text, fontsize=24, color='white', bg_color='black', size=(max_width, None))
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/moviepy/video/VideoClip.py", line 1146, in __init__
    raise IOError(error)
OSError: MoviePy Error: creation of None failed because of the following error:

convert-im6.q16: attempt to perform an operation not allowed by the security policy `@/tmp/tmp1bxytlbo.txt' @ error/property.c/InterpretImageProperties/3771.
convert-im6.q16: label expected `@/tmp/tmp1bxytlbo.txt' @ error/annotate.c/GetMultilineTypeMetrics/782.
convert-im6.q16: no images defined `PNG32:/tmp/tmp10k7g0kv.png' @ error/convert.c/ConvertImageCommand/3234.
.

.This error can be due to the fact that ImageMagick is not installed on your computer, or (for Windows users) that you didn't specify the path to the ImageMagick binary in file conf.py, or that the path you specified is incorrect
(WhisperPlus38) (base) piotr@Legion7:~/WhisperPlus/Tutorial$ imagemagick
imagemagick: command not found
(WhisperPlus38) (base) piotr@Legion7:~/WhisperPlus/Tutorial$ sudo apt install imagemagick
[sudo] password for piotr: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
imagemagick is already the newest version (8:6.9.12.98+dfsg1-5.2build2).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
(WhisperPlus38) (base) piotr@Legion7:~/WhisperPlus/Tutorial$ 
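
The first line of the convert-im6.q16 error above says the operation was "not allowed by the security policy", which suggests ImageMagick was found and started but refused to read the `@/tmp/...` text file that MoviePy passed to it. A minimal sketch for checking this, assuming the Debian/Ubuntu package keeps its policy at `/etc/ImageMagick-6/policy.xml` and that the blocking rule is a `path` rule with `rights="none"` (both are assumptions, not something shown in this thread; `blocking_path_rules` is a hypothetical helper name):

```python
import os
import re

# Assumed location of the ImageMagick 6 policy file on Debian/Ubuntu.
POLICY_FILE = "/etc/ImageMagick-6/policy.xml"

COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
POLICY_RE = re.compile(r"<policy\b([^>]*)>")
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def blocking_path_rules(policy_xml):
    """Return the patterns of active path rules whose rights are "none"."""
    # Drop commented-out rules so that only active ones are inspected.
    active = COMMENT_RE.sub("", policy_xml)
    blocked = []
    for match in POLICY_RE.finditer(active):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        if attrs.get("domain") == "path" and attrs.get("rights") == "none":
            blocked.append(attrs.get("pattern", ""))
    return blocked


if __name__ == "__main__":
    if os.path.exists(POLICY_FILE):
        with open(POLICY_FILE, encoding="utf-8") as handle:
            print("Blocked path patterns:", blocking_path_rules(handle.read()))
    else:
        print("No policy file at", POLICY_FILE)
```

If the printed list contains `@*`, the policy rather than a missing install is the likely cause of the `TextClip` failure.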
PiotrEsse commented 5 months ago

I've installed ImageMagick from source by following this tutorial: https://www.tecmint.com/install-imagemagick-on-debian-ubuntu/ ImageMagick is on the PATH.

(WhisperPlus38) piotr@Legion7:~/WhisperPlus/Tutorial$ magick -version
Version: ImageMagick 7.1.1-32 Q16-HDRI x86_64 22207 https://imagemagick.org
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI OpenMP(4.5)
Delegates (built-in):
Compiler: gcc (13.2)
(WhisperPlus38) piotr@Legion7:~/WhisperPlus/Tutorial$
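
Two details in the logs are worth separating. `imagemagick: command not found` does not by itself show a missing install, since the executables are normally named `convert` (ImageMagick 6) and `magick` (ImageMagick 7). The traceback also names `convert-im6.q16`, so MoviePy was calling the packaged version 6 binary, not the version 7 build shown here. A small sketch for seeing which binaries resolve on the PATH follows; `locate_imagemagick` is a hypothetical helper, and the note about the `IMAGEMAGICK_BINARY` environment variable assumes MoviePy 1.x behaviour:

```python
import os
import shutil


def locate_imagemagick(candidates=("convert", "magick"), which=shutil.which):
    """Map each candidate executable name to its resolved path, or None."""
    return {name: which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in locate_imagemagick().items():
        print(f"{name}: {path or 'not found on PATH'}")
    # Assumption: MoviePy 1.x reads this variable when it is first imported,
    # so it would need to be set before `import moviepy`.
    print("IMAGEMAGICK_BINARY =", os.environ.get("IMAGEMAGICK_BINARY", "(unset)"))
```

Pointing MoviePy at a different binary would not necessarily avoid the error if that binary is governed by the same security policy.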
kadirnar commented 5 months ago

I solved the error and added the fix to the README. I tested it and it works.
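
The thread does not spell out what the fix was. One commonly used workaround for this "security policy" error is to relax the ImageMagick rule that blocks `@file` arguments. The sketch below only illustrates that idea and is not necessarily what the README describes; `relax_at_rule` is a hypothetical helper, and the `pattern="@*"` rule and the policy file path in the comments are assumptions about a typical Ubuntu ImageMagick 6 install. Loosening a security policy has trade-offs, so review the change before applying it.

```python
import re

POLICY_TAG = re.compile(r"<policy\b[^>]*>")


def relax_at_rule(policy_xml):
    """Return policy text with the path rule for "@*" set to read|write."""

    def fix(match):
        tag = match.group(0)
        # Only touch the rule that blocks "@file" arguments; leave others alone.
        if 'domain="path"' in tag and 'pattern="@*"' in tag:
            return tag.replace('rights="none"', 'rights="read|write"')
        return tag

    return POLICY_TAG.sub(fix, policy_xml)


if __name__ == "__main__":
    before = '<policy domain="path" rights="none" pattern="@*"/>'
    print(relax_at_rule(before))
    # Applying this to the real file needs root and a backup first, e.g. read
    # /etc/ImageMagick-6/policy.xml (assumed path), transform, then write back.
```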
