Open FurkanGozukara opened 9 months ago
I think something is very wrong
Video attributes:
General
Complete name : C:\temp\a.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/iso4)
File size : 20.4 MiB
Duration : 7 min 51 s
Overall bit rate mode : Variable
Overall bit rate : 362 kb/s
Encoded date : UTC 2024-01-14 01:25:10
Tagged date : UTC 2024-01-14 01:25:10

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L3.1
Format settings : CABAC / 5 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference fra : 5 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 7 min 51 s
Bit rate : 227 kb/s
Maximum bit rate : 470 kb/s
Width : 576 pixels
Height : 1 024 pixels
Display aspect ratio : 0.563
Frame rate mode : Constant
Frame rate : 30.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.013
Stream size : 12.7 MiB (63%)
Title : Twitter-vork muxer
Writing library : x264 core 164 r3095 baee400
Encoding settings : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=2 / psy=0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / stitchable=1 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=infinite / keyint_min=30 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=28.0 / qcomp=0.60 / qpmin=10 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=2048 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=2:1.00
Tagged date : UTC 2024-01-14 01:25:10
Codec configuration box : avcC

Audio
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 7 min 51 s
Bit rate mode : Variable
Bit rate : 128 kb/s
Maximum bit rate : 137 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 7.19 MiB (35%)
Title : Twitter-vork muxer
Default : Yes
Alternate group : 1
Tagged date : UTC 2024-01-14 01:25:10
And here are the venv attributes:
Microsoft Windows [Version 10.0.19045.3930]
(c) Microsoft Corporation. All rights reserved.

C:\temp\caption\autocaption\venv\Scripts>activate

(venv) C:\temp\caption\autocaption\venv\Scripts>pip freeze
altair==5.2.0
annotated-types==0.6.0
attrs==23.2.0
av==10.0.0
beautifulsoup4==4.12.2
blinker==1.7.0
blis==0.7.11
cachetools==5.3.2
catalogue==2.0.10
certifi==2023.11.17
chardet==3.0.4
charset-normalizer==3.3.2
click==8.1.7
cloudpathlib==0.16.0
colorama==0.4.6
coloredlogs==15.0.1
confection==0.1.4
contourpy==1.2.0
ctranslate2==3.24.0
cycler==0.12.1
cymem==2.0.8
decorator==4.4.2
Faker==22.2.0
faster-whisper==0.7.0
favicon==0.7.0
ffmpeg==1.4
ffmpeg-python==0.2.0
filelock==3.13.1
flatbuffers==23.5.26
fonttools==4.47.2
fsspec==2023.12.2
future==0.18.3
gitdb==4.0.11
GitPython==3.1.41
googletrans==3.1.0a0
h11==0.9.0
h2==3.2.0
hpack==3.0.0
hstspreload==2024.1.5
htbuilder==0.6.1
httpcore==0.9.1
httpx==0.13.3
huggingface-hub==0.20.2
humanfriendly==10.0
hyperframe==5.2.0
idna==2.10
imageio==2.33.1
imageio-ffmpeg==0.4.9
importlib-metadata==6.11.0
Jinja2==3.1.3
joblib==1.3.2
jsonschema==4.20.0
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langcodes==3.3.0
llvmlite==0.41.1
lxml==5.1.0
Markdown==3.5.2
markdown-it-py==3.0.0
markdownlit==0.0.7
MarkupSafe==2.1.3
matplotlib==3.8.2
mdurl==0.1.2
more-itertools==10.2.0
moviepy==2.0.0.dev2
mpmath==1.3.0
murmurhash==1.0.10
networkx==3.2.1
nltk==3.8.1
numba==0.58.1
numpy==1.26.3
onnxruntime==1.16.3
openai-whisper==20230314
packaging==23.2
pandas==2.0.3
Pillow==9.5.0
preshed==3.0.9
proglog==0.1.10
protobuf==4.25.2
pyarrow==14.0.2
pydantic==2.5.3
pydantic_core==2.14.6
pydeck==0.8.0
pydub==0.25.1
Pygments==2.17.2
pymdown-extensions==10.7
Pympler==1.0.1
pyparsing==3.1.1
pyreadline3==3.4.1
python-dateutil==2.8.2
pytz==2023.3.post1
pytz-deprecation-shim==0.1.0.post0
PyYAML==6.0.1
referencing==0.32.1
regex==2023.12.25
requests==2.31.0
rfc3986==1.5.0
rich==13.7.0
rpds-py==0.17.1
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.0
soupsieve==2.5
spacy==3.7.2
spacy-legacy==3.0.12
spacy-loggers==1.0.5
spacytextblob==4.0.0
SpeechRecognition==3.10.0
srsly==2.4.8
st-annotated-text==4.0.1
streamlit==1.25.0
streamlit-camera-input-live==0.2.0
streamlit-card==1.0.0
streamlit-embedcode==0.1.2
streamlit-extras==0.3.0
streamlit-faker==0.0.3
streamlit-image-coordinates==0.1.6
streamlit-keyup==0.2.2
streamlit-toggle-switch==1.0.2
streamlit-vertical-slider==2.5.5
sympy==1.12
tenacity==8.2.3
textblob==0.15.3
thinc==8.2.2
tiktoken==0.3.1
tokenizers==0.13.3
toml==0.10.2
toolz==0.12.0
torch==2.1.2
tornado==6.4
tqdm==4.66.1
typer==0.9.0
typing_extensions==4.9.0
tzdata==2023.4
tzlocal==4.3.1
urllib3==2.1.0
validators==0.22.0
wasabi==1.1.2
watchdog==3.0.0
weasel==0.3.4
zipp==3.17.0

(venv) C:\temp\caption\autocaption\venv\Scripts>
It seems you're using the CPU, not the GPU.
I also manually installed the CUDA version of PyTorch, but it made no difference.
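One way to narrow this down is to ask PyTorch directly, from inside the activated venv, whether it was built with CUDA and whether it can see a GPU. The sketch below is only a diagnostic: `describe_torch_device` is a hypothetical helper name, not part of this project. It also checks PyTorch only. Whether the captioning code actually runs its model through PyTorch is an assumption this thread does not confirm.

```python
import importlib.util


def describe_torch_device():
    """Report whether the PyTorch in the active environment can use CUDA.

    Hypothetical helper for diagnosis only; not part of the project.
    """
    # Avoid an ImportError if torch is missing from this environment.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "cuda_available": False}

    import torch

    cuda_available = torch.cuda.is_available()
    return {
        "installed": True,
        "version": torch.__version__,
        # torch.version.cuda is None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        "cuda_available": cuda_available,
        "device_name": torch.cuda.get_device_name(0) if cuda_available else None,
    }


if __name__ == "__main__":
    for key, value in describe_torch_device().items():
        print(f"{key}: {value}")
```

If `cuda_build` comes back as `None`, the venv is probably still using a CPU-only wheel, which could mean the manually installed CUDA build went into a different environment.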