monatis / clip.cpp

CLIP inference in plain C/C++ with no extra dependencies
MIT License

Fix segmentation fault in clip_image_batch_encode when batch_size exceeds n_threads #95

Closed. A-ASTON closed this 2 months ago

A-ASTON commented 2 months ago

Thanks for your great work!

When invoking the clip_image_batch_encode function to process my own data, setting batch_size to be greater than n_threads resulted in a segmentation fault.

The cause is that clip_image_batch_preprocess does not preprocess every image in the batch in this case.

Upon debugging with gdb, the issue was pinpointed to line 852 in clip.cpp:

pthread_create(&threads[t], NULL, preprocess_image, static_cast<void *>(&imageData[start_index]));

And the function:

void * preprocess_image(void * arg) {
    ImageData * imageData = static_cast<ImageData *>(arg);
    const clip_image_u8 * input = imageData->input;
    clip_image_f32 * resized = imageData->resized;
    const clip_ctx * ctx = imageData->ctx;

    // Call the original preprocess function on the image
    clip_image_preprocess(ctx, input, resized);

    pthread_exit(NULL);
}

At line 852, a pointer to imageData[start_index] is passed to preprocess_image. However, preprocess_image only calls clip_image_preprocess on the single element that pointer refers to. When a thread is responsible for more than one image, the remaining images in its share of the batch are never preprocessed. Their imageData->resized, i.e., clip_image_f32, is left with an empty data array, which causes a segmentation fault when clip_image_batch_encode is called.

I have addressed this issue, and the details can be seen in the code comparison.
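
The actual change is in the diff. As a rough illustration of the general idea only, here is a minimal, self-contained sketch in which each worker loops over a contiguous range of the batch instead of touching a single element. It is not the code from this PR: `FakeInput`, `FakeOutput`, `fake_preprocess`, `preprocess_range`, and `batch_preprocess` are hypothetical stand-ins (for `clip_image_u8`, `clip_image_f32`, and `clip_image_preprocess`), and it uses `std::thread` rather than the `pthread_create` call in clip.cpp to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-ins for clip_image_u8 / clip_image_f32 (not the real clip.cpp types).
struct FakeInput  { int value; };
struct FakeOutput { std::vector<float> data; };

// Hypothetical stand-in for clip_image_preprocess: fills the output buffer.
static void fake_preprocess(const FakeInput & in, FakeOutput & out) {
    out.data.assign(4, static_cast<float>(in.value));
}

// Each worker handles the half-open range [begin, end), not just the first element.
static void preprocess_range(const std::vector<FakeInput> & inputs,
                             std::vector<FakeOutput> & outputs,
                             std::size_t begin, std::size_t end) {
    assert(end <= inputs.size() && end <= outputs.size());
    for (std::size_t i = begin; i < end; ++i) {
        fake_preprocess(inputs[i], outputs[i]);
    }
}

// Splits the batch into at most n_threads contiguous chunks so that every
// image is preprocessed, including when the batch is larger than n_threads.
static void batch_preprocess(const std::vector<FakeInput> & inputs,
                             std::vector<FakeOutput> & outputs,
                             std::size_t n_threads) {
    const std::size_t n = inputs.size();
    outputs.resize(n);  // sized before any thread starts; threads write disjoint elements
    if (n == 0) return;

    n_threads = std::max<std::size_t>(1, std::min(n_threads, n));
    const std::size_t chunk = (n + n_threads - 1) / n_threads;  // ceiling division

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        workers.emplace_back(preprocess_range, std::cref(inputs), std::ref(outputs), begin, end);
    }
    for (auto & w : workers) {
        w.join();
    }
}
```

With a batch of 10 and 3 threads, the chunk size is 4, so the workers cover the ranges [0, 4), [4, 8), and [8, 10), and no output is left with an empty data array.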