ina-foss / inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
MIT License

Segmentation fault? #56

Closed ttjpleizier closed 3 years ago

ttjpleizier commented 3 years ago

System information

Tried to install both ways: pip install and building from source

Install worked. ina-speech-segmenter.py help produces the correct help text.

Building from source: python setup.py test failed. Pip-installed version: processing an .mp3 file failed with the same error message:

running test
WARNING: Testing via this command is deprecated and will be removed in a future version. Users looking for a generic test entry point independent of test runner are encouraged to use tox.
running egg_info
writing inaSpeechSegmenter.egg-info/PKG-INFO
writing dependency_links to inaSpeechSegmenter.egg-info/dependency_links.txt
writing requirements to inaSpeechSegmenter.egg-info/requires.txt
writing top-level names to inaSpeechSegmenter.egg-info/top_level.txt
reading manifest file 'inaSpeechSegmenter.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'scripts'
writing manifest file 'inaSpeechSegmenter.egg-info/SOURCES.txt'
running build_ext
2021-03-27 16:26:01.207346: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-27 16:26:01.207367: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Segmentation fault (core dumped)

r-uro commented 3 years ago

Hi,

I have not been able to reproduce this error, but it seems you're using tensorflow-gpu to run on a CPU. It should fall back automatically and work on a CPU, but maybe you can try using tensorflow instead of tensorflow-gpu ?

ttjpleizier commented 3 years ago

Thanks for the response! I've tried both tensorflow and tensorflow-gpu, resulting in the same warning-message and finally the same 'segmentation fault'. Is it possible that there is a compatibility problem with Python 3.8.5 or with another pip package?

I was able to run the program in the Windows Subsystem for Linux, but apparently the 'real' Linux doesn't process the program as it should.
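One way to narrow down whether a particular pip package triggers the crash is to import each candidate in its own interpreter process and look at the exit code. The following is a minimal sketch, not something taken from this thread: the helper name import_exit_code and the list of module names are illustrative assumptions, and the list should be adjusted to the environment under test.

```python
import subprocess
import sys


def import_exit_code(module_name):
    """Import one module in a fresh interpreter and return that process's exit code."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # Illustrative candidates only (assumption); adjust to your environment.
    for name in ["numpy", "h5py", "tensorflow", "inaSpeechSegmenter"]:
        code = import_exit_code(name)
        # On POSIX, a negative code -N means the child was killed by signal N
        # (SIGSEGV is signal 11 on Linux). A positive code is an ordinary
        # failure such as ImportError.
        print(f"{name}: exit code {code}")
```

Because every import runs in a separate process, a crash in one module does not take down the loop, so the remaining candidates are still checked.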

ttjpleizier commented 3 years ago

Hi,

I did try the following:

  1. installing only tensorflow-cpu

  2. installing python3.7, creating a python3.7 virtualenv, installing tensorflow/inaSpeechSegmenter

Both scenarios again ran into 'segmentation fault, core dumped'.

Below I give output of pip list.

Thanks for your attention,

Regards,

Theo

pip list gives the following:

(inaSpeechEnv) @.***:~/github/ina-foss$ pip list
Package                Version
absl-py                0.12.0
astunparse             1.6.3
cachetools             4.2.1
certifi                2020.12.5
cffi                   1.14.5
chardet                4.0.0
cycler                 0.10.0
decorator              4.4.2
docopt                 0.6.2
flatbuffers            1.12
gast                   0.3.3
google-auth            1.28.0
google-auth-oauthlib   0.4.4
google-pasta           0.2.0
grpcio                 1.32.0
h5py                   2.10.0
idna                   2.10
imageio                2.9.0
inaSpeechSegmenter     0.6.6
joblib                 1.0.1
Keras                  2.4.3
Keras-Preprocessing    1.1.2
kiwisolver             1.3.1
Markdown               3.3.4
matplotlib             3.4.1
munkres                1.1.4
networkx               2.5
numpy                  1.19.5
oauthlib               3.1.0
opt-einsum             3.3.0
pandas                 1.2.3
Pillow                 8.1.2
pip                    21.0.1
protobuf               3.15.6
pyannote.algorithms    0.8
pyannote.core          4.1
pyannote.parser        0.8
pyasn1                 0.4.8
pyasn1-modules         0.2.8
pycparser              2.20
pyparsing              2.4.7
Pyro4                  4.80
pytextgrid             0.1.4
python-dateutil        2.8.1
pytz                   2021.1
PyWavelets             1.1.1
PyYAML                 5.4.1
requests               2.25.1
requests-oauthlib      1.3.0
rsa                    4.7.2
scikit-image           0.18.1
scikit-learn           0.24.1
scipy                  1.6.2
serpent                1.30.2
setuptools             54.1.2
SIDEKIT                1.3.8.5.2
simplejson             3.17.2
six                    1.15.0
sortedcollections      2.1.0
sortedcontainers       2.3.0
SoundFile              0.10.3.post1
tensorboard            2.4.1
tensorboard-plugin-wit 1.8.0
tensorflow-cpu         2.4.1
tensorflow-estimator   2.4.0
termcolor              1.1.0
threadpoolctl          2.1.0
tifffile               2021.3.17
torch                  1.8.1
torchvision            0.9.1
tqdm                   4.59.0
typing-extensions      3.7.4.3
urllib3                1.26.4
Werkzeug               1.0.1
wheel                  0.36.2
wrapt                  1.12.1
xarray                 0.17.0
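For a bug report, a shorter listing of only the packages most likely to matter can be easier to compare than the full pip list. The sketch below is an illustration under assumptions: the helper name installed_versions and the chosen package names are not from this thread, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in distributions:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Illustrative selection (assumption): the numeric and deep-learning stack.
    wanted = ["inaSpeechSegmenter", "tensorflow", "tensorflow-cpu", "Keras", "h5py", "numpy"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name:<22} {ver or 'not installed'}")
```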


ttjpleizier commented 3 years ago

Still no solution found. Within the VirtualEnv I tried using gdb with the following command:

gdb --args python inaSpeechEnv/bin/ina_speech_segmenter.py -i testfiles/2020-09-27-09-20-024.mp3 -o testfiles/

It stops running with the message:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x000000000000000a in ?? ()

Next, I tried the bt-command within gdb, which resulted in the following:

#201 0x000000000067d6db in PyRun_FileExFlags ()
#202 0x000000000067da6e in PyRun_SimpleFileExFlags ()
#203 0x00000000006b6132 in Py_RunMain ()
#204 0x00000000006b64bd in Py_BytesMain ()
#205 0x00007ffff7de10b3 in __libc_start_main (main=0x4eec80, argc=6, argv=0x7fffffffdde8, init=, fini=, rtld_fini=<optimized out>, stack_end=0x7fffffffddd8) at ../csu/libc-start.c:308
#206 0x00000000005f927e in _start ()

(gdb) Quit

I do hope that someone has a suggestion to get inaSpeechSegmenter running on my Ubuntu 20.04, AMD Ryzen 5, 16GB RAM.
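The gdb frames above are all inside the interpreter itself, so they do not show which Python call was running when the crash happened. Python's standard faulthandler module can print the Python-level traceback on SIGSEGV. The following is a minimal sketch of running a command with it enabled; the helper name run_with_faulthandler is an assumption made for illustration, not part of inaSpeechSegmenter.

```python
import subprocess
import sys


def run_with_faulthandler(python_args):
    """Run a Python command in a child interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", *python_args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


if __name__ == "__main__":
    # Sanity check: confirm the child interpreter really has faulthandler on.
    check = run_with_faulthandler(
        ["-c", "import faulthandler; print(faulthandler.is_enabled())"]
    )
    print(check.stdout.strip())
```

To apply this to the failing run, the argument list would be the script path followed by the same -i and -o arguments as in the gdb command. If the process then receives SIGSEGV, the child's stderr should contain a "Fatal Python error: Segmentation fault" message followed by the Python stack of each thread.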

ttjpleizier commented 3 years ago

The issue has been solved.

The virtualenv setup did not work; a setup using Anaconda, however, did.

Further, a new issue emerged: the program aborted after a few seconds and produced an 'Assertion error'.

inaSpeechSegmenter was updated 2 months ago, and I found references to Python 3.8 in the code. A new conda environment with Python 3.8 (rather than 3.6) did work. Perhaps someone can look into this and update the Readme.md to state that Python 3.8 is required?

Thanks for the program!

DavidDoukhan commented 3 years ago

Dear @ttjpleizier ,

I managed to reproduce your issue, and it seems to be related to the update of one of the dependencies of inaSpeechSegmenter.

I made a few fixes and managed to make it work again. It has been tested using Python 3.6.9 and Python 3.8.5. It is supposed to work well for any Python >= 3.6.

Could you clone the latest version available on github and let me know if your issue is now solved ?

I'm looking forward to hearing from you before updating the pip repository as well.

Kind regards, Sorry for the inconvenience, and thanks a lot for this feedback !

ttjpleizier commented 3 years ago

Dear @DavidDoukhan,

Thank you very much for the fix. I cloned the repo and ran python setup.py test: 14 tests ran in 22.572s. A test mp3 (1 hour of audio) was processed between 17:19:31.22 and 17:20:12.40. The fix was successful!

Best, Theo
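The two timestamps reported above can be turned into a processing time and a speed relative to real time. This is a small worked calculation, assuming "1 hour of audio" means exactly 3600 seconds; the helper name elapsed_seconds is illustrative.

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%H:%M:%S.%f"):
    """Return the seconds between two same-day clock times given as strings."""
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Assumption: the test file is exactly one hour long.
audio_seconds = 3600.0
processing_seconds = elapsed_seconds("17:19:31.22", "17:20:12.40")
speed_factor = audio_seconds / processing_seconds

print(f"processing time: {processing_seconds:.2f} s")
print(f"speed: about {speed_factor:.0f}x real time")
```

Under that assumption, the run took roughly 41 seconds, which is on the order of 87 times faster than real time on the reported machine.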

DavidDoukhan commented 3 years ago

Thanks for this fast feedback ! Kind regards,