oseiskar / autosubsync

Automatically synchronize subtitles with audio using machine learning
MIT License

NotImplementedError: function is not implemented for this dtype: [how->mean,dtype->object] #17

Closed by OnlyWick 1 year ago

OnlyWick commented 1 year ago
training accuracy 0.9027061016583384
bias 0 s
testing serialization in temp file /var/folders/hz/0wgmltk53x5573g799s1769h0000gn/T/tmpzm4prx0i/model.bin
Validating...
---- speech detection accuracy ----
Traceback (most recent call last):
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/groupby.py", line 1490, in array_func
    result = self.grouper._cython_operation(
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/ops.py", line 959, in _cython_operation
    return cy_op.cython_operation(
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/ops.py", line 657, in cython_operation
    return self._cython_op_ndim_compat(
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/ops.py", line 497, in _cython_op_ndim_compat
    return self._call_cython_op(
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/ops.py", line 541, in _call_cython_op
    func = self._get_cython_function(self.kind, self.how, values.dtype, is_numeric)
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/groupby/ops.py", line 173, in _get_cython_function
    raise NotImplementedError(
NotImplementedError: function is not implemented for this dtype: [how->mean,dtype->object]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/wick/Library/Python/3.9/lib/python/site-packages/pandas/core/nanops.py", line 1692, in _ensure_numeric
    x = float(x)
ValueError: could not convert string to float: 'enenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenenene....en'

All collected information can be loaded correctly.

A portion of index.csv.

sound,subtitles,language
sound_001.flac,subs_001.srt,en
sound_002.flac,subs_002.srt,en
sound_003.flac,subs_003.srt,en
sound_004.flac,subs_004.srt,en
sound_005.flac,subs_005.srt,en
sound_006.flac,subs_006.srt,en
sound_007.flac,subs_007.srt,en
sound_008.flac,subs_008.srt,en
sound_009.flac,subs_009.srt,en
sound_010.flac,subs_010.srt,en
sound_011.flac,subs_011.srt,en
sound_012.flac,subs_012.srt,en
sound_013.flac,subs_013.srt,en
sound_014.flac,subs_014.srt,en
sound_015.flac,subs_015.srt,en
sound_016.flac,subs_016.srt,en
sound_017.flac,subs_017.srt,en
sound_018.flac,subs_018.srt,en
sound_019.flac,subs_019.srt,en
sound_020.flac,subs_020.srt,en
sound_021.flac,subs_021.srt,en
sound_022.flac,subs_022.srt,en

A portion of meta.csv.

,label,file_number,language
0,0.0,1,en
1,0.0,1,en
2,0.0,1,en
3,0.0,1,en
4,0.0,1,en
5,0.0,1,en
6,0.0,1,en
7,0.0,1,en
8,0.0,1,en
9,0.0,1,en
10,1.0,1,en
11,1.0,1,en
12,1.0,1,en
13,1.0,1,en
14,1.0,1,en
15,1.0,1,en
16,1.0,1,en
17,1.0,1,en

Are there any requirements for the Python version? I am working on a Mac with an M2 chip.

OnlyWick commented 1 year ago

Problem solved. For anyone who hits the same error, see: https://stackoverflow.com/questions/76233716/groupby-mean-not-working-on-titanic-dataset-in-python.
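For context, here is a minimal sketch of the kind of fix discussed in that thread. It assumes the cause is the pandas 2.0 behavior change in which `groupby(...).mean()` no longer silently drops non-numeric columns, so a string column such as `language` triggers the error above. The DataFrame is a made-up stand-in shaped like the meta.csv excerpt, not autosubsync's actual code.

```python
import pandas as pd

# Hypothetical toy data shaped like the meta.csv excerpt (values are illustrative).
meta = pd.DataFrame({
    "label": [0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
    "file_number": [1, 1, 1, 1, 2, 2],
    "language": ["en", "en", "en", "en", "en", "en"],
})

# On pandas >= 2.0, meta.groupby("file_number").mean() would try to average
# the string-valued "language" column and raise an error.
# numeric_only=True restricts the aggregation to numeric columns.
per_file = meta.groupby("file_number").mean(numeric_only=True)
print(per_file)
```

Selecting the numeric column explicitly, for example `meta.groupby("file_number")["label"].mean()`, should have the same effect. Pinning pandas to a 1.x release is another possible workaround, though that is an assumption and not something confirmed in this thread.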

OnlyWick commented 1 year ago

@oseiskar I provided nearly 3 hours of video and subtitles for training, but the resulting model does not perform well.

estimated threshold: 0.45454545454545453
current threshold: 0.75
quality margin: -0.9090909090909091

The output of a sync run is as follows.

Extracting audio using ffmpeg and reading subtitles...
computing features for 9041920 audio samples using 3 parallel process(es)
extracted features of size (7534, 50), performing speech detection
computing best fit with 7534 frames
max shift 20s, test increments 0.05s
testing with skews: 23.976/25, 24/25, 23.976/24, 1/1, 24/23.976, 25/24, 25/23.976
bias 0.0
shift   score   quality skew
13.1    0.695   0.24    23.976/25
13.1    0.697   0.26    24/25
0.2     0.849   0.909   23.976/24
0       0.858   0.909   1/1
-0.2    0.848   0.909   24/23.976
-13.5   0.675   0       25/24
-13.7   0.676   0.282   25/23.976
optimal shift: 0 seconds, skew: 1/1
quality of fit: 0.909091, threshold 0.75
Fit complete. Performing resync, writing to /Users/wick/Desktop/snyc.srt
success!