Peter-obi / Video_summarization_mlx

Transcribe and summarize videos using Whisper and LLMs on the Apple MLX framework
MIT License

libc++abi: terminating due to uncaught exception of type std::runtime_error: [AddMM::eval_cpu] Currently only supports float32. Abort trap: 6 #1

Open joshuachen3333 opened 9 months ago

joshuachen3333 commented 9 months ago

On my MacBook Pro (Intel i5 CPU, 16 GB RAM), with the following configuration, I ran my test like this:

macOS 14.3
Xcode 15.2 (kMDItemVersion = "2.8.5")
Apple clang version 15.0.0 (clang-1500.1.0.2.5)
Python 3.11.7

conda create -n video_summarize_mlx python=3.11
conda activate video_summarize_mlx
git clone https://github.com/Peter-obi/Video_summarization_mlx
cd Video_summarization_mlx
# mlx
# mlx-lm

I left out mlx and mlx-lm because they cannot be installed with pip on this machine:

pip install mlx
ERROR: Could not find a version that satisfies the requirement mlx (from versions: none)
ERROR: No matching distribution found for mlx

Then I ran the code like this:

python -m spacy download en_core_web_sm
python main.py --input_path "/Users/joshua/video/test1.mp4" --title "test1"
Peter-obi commented 9 months ago

MLX is an array framework for machine learning on Apple silicon, i.e., M-series chips (read more here: https://github.com/ml-explore/mlx). It seems you have an Intel chip. First thoughts: you could swap out the Whisper part for a compatible Whisper implementation and then replace the MLX model portions with 'normal' Hugging Face models.
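As a quick guard, the platform mismatch can be detected before anything MLX-specific is imported. A minimal sketch (the helper name is my own, not part of this repo; MLX publishes wheels only for arm64 macOS, which is why pip finds no distribution on Intel):

```python
import platform

def mlx_supported() -> bool:
    """MLX ships wheels only for Apple silicon (arm64 macOS)."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"

if not mlx_supported():
    print("Non-Apple-silicon machine: fall back to a CPU Whisper build "
          "and standard Hugging Face models instead of MLX.")
```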

az-boromir commented 9 months ago

Similar issue for me on an M2 Mac Air.

Transcribing files/audio/the_vision.wav (this may take a while)...
libc++abi: terminating due to uncaught exception of type std::runtime_error: [AddMM::eval_cpu] Currently only supports float32.
zsh: abort      python main.py --input_path "The Vision.mp4" --title "Test"
(video_summarize_mlx) Video_summarization_mlx % /video_summarize_mlx/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Peter-obi commented 9 months ago

I will try to recreate the issue when I get back on my system. Some things you can try: make sure you are not running out of memory (this happened to me once when I used a bigger model). There are also discussions on this here: https://github.com/conda/conda/issues/9589 and https://github.com/apple/ml-stable-diffusion/issues/8. You could also try summarize_with_mlx instead of summarize_in_parallel to see if that helps (if it is a memory issue).

az-boromir commented 9 months ago

It is decode_result = model.decode(segment, options) inside decode_with_fallback that is breaking. I will keep looking to see if I can find out what the issue is.

Printing out the segment shows:

array([[-0.39502, -0.39502, -0.39502, ..., -0.39502, -0.39502, -0.39502],
       [-0.39502, -0.39502, -0.39502, ..., -0.39502, -0.39502, -0.39502],
       [-0.39502, -0.39502, -0.39502, ..., -0.39502, -0.39502, -0.39502],
       ...,
       [0.575195, 0.70752, 0.590332, ..., -0.39502, -0.39502, -0.39502],
       [0.603516, 0.728516, 0.655762, ..., -0.39502, -0.39502, -0.39502],
       [0.599609, 0.740723, 0.674316, ..., -0.39502, -0.39502, -0.39502]], dtype=float16)

and the error is: libc++abi: terminating due to uncaught exception of type std::runtime_error: [AddMM::eval_cpu] Currently only supports float32.

Could it just be a data type mismatch?
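That reading seems plausible: the segment is float16 (note the dtype in the printout) while the CPU matmul kernel only implements float32. The general shape of the mismatch, sketched with NumPy standing in for MLX arrays (NumPy silently promotes dtypes where MLX's CPU AddMM raises instead):

```python
import numpy as np

# A stand-in for the mel segment Whisper produces (float16, as in the printout above).
segment = np.full((3, 4), -0.39502, dtype=np.float16)
weights = np.ones((4, 2), dtype=np.float32)  # kernel-side weights are float32

# Explicitly upcasting before the matmul is the kind of fix being suggested here;
# on MLX's CPU backend, skipping this cast triggers the
# "[AddMM::eval_cpu] Currently only supports float32" runtime error.
out = segment.astype(np.float32) @ weights
print(out.dtype)
```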

Peter-obi commented 9 months ago

Interesting. I have never faced this particular issue, because I just use the already-converted Whisper from the MLX examples and haven't run into issues yet. Can you add this and see what happens: mel_segment = mel_segment.astype(mx.float32)?

joshuachen3333 commented 9 months ago

Could this be an Apple silicon (M1/M2)-only issue? I ran it on an Intel i5 platform. If yes, could I run it in a Linux i7 + Nvidia environment? How difficult would it be to port to a Linux x86_64 environment?

az-boromir commented 9 months ago

> Interesting. I have never faced this particular issue, because I just use the already-converted Whisper from the MLX examples and haven't run into issues yet. Can you add this and see what happens: mel_segment = mel_segment.astype(mx.float32)?

This didn't work. I instead changed the transcribe signature to default fp16 to False (def transcribe(audio_file, fp16=False, output_path="files/transcripts"):), and that issue is fixed. However, I get a similar error when it runs summarize_with_mlx. I have the 4-bit Mixtral.

Audio has been transcribed in 22 seconds
Found 1 chunks. Summarizing using MLX model...
Generating summary with MLX model...
libc++abi: terminating due to uncaught exception of type std::runtime_error: [Matmul::eval_cpu] Currently only supports float32.
zsh: abort      python main.py --input_path --title "Test"
(video_summarize_mlx) miniconda/envs/video_summarize_mlx/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Looking at this, I may be missing something, though. The 4-bit model shouldn't need float32, I think.

Peter-obi commented 8 months ago

Great that you solved that! I've been busy for a couple of weeks. Have you solved the resource_tracker error? I reproduced the error, but only at the Whisper level, and it was always because I ran a big model or a very long input; it was always fixed by using a smaller model or breaking the input into smaller chunks.
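Breaking a long transcript into smaller chunks before summarization can be sketched like this (a hypothetical helper for illustration, not the repo's actual chunking code; the 500-word cap is an arbitrary choice):

```python
def chunk_text(text: str, max_words: int = 500) -> list[str]:
    """Split a transcript into word-bounded chunks to keep per-call memory low."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

transcript = "word " * 1200          # stand-in for a long transcript
chunks = chunk_text(transcript, max_words=500)
print(len(chunks))                   # 1200 words -> chunks of 500, 500, 200
```

Each chunk can then be summarized independently and the partial summaries joined, which keeps any single model call small.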