monatis / clip.cpp

CLIP inference in plain C/C++ with no extra dependencies
MIT License
439 stars 30 forks source link

Make normalization optional to fix zsl #56

Closed monatis closed 1 year ago

monatis commented 1 year ago

closes #54

Since we are already changing function signatures, this might be a good opportunity to use C-compatible signatures everywhere and remove functions with _c suffix. WDYT?

denis-ismailaj commented 1 year ago

Since we are already changing function signatures, this might be a good opportunity to use C-compatible signatures everywhere and remove functions with _c suffix. WDYT?

I think it's better to keep C++ signatures so that it feels natural for users that would like to call the library from C++ code, however we can use preprocessor directives to only expose C-compatible functions when used in C so that we can rename them to have the same names in both cases.

Meaning we go from

#ifdef __cplusplus
bool clip_text_encode(const clip_ctx * ctx, int n_threads, const std::vector<clip_vocab::id> & tokens, float * vec);
#endif
bool clip_text_encode_c(const struct clip_ctx * ctx, int n_threads, const struct clip_tokens * tokens, float * vec);

to

#ifdef __cplusplus
bool clip_text_encode(const clip_ctx * ctx, int n_threads, const std::vector<clip_vocab::id> & tokens, float * vec);
#else
bool clip_text_encode(const struct clip_ctx * ctx, int n_threads, const struct clip_tokens * tokens, float * vec);
#endif

EDIT: A con of this is that it would require doing #ifdefs in the implementation as well, so maybe it's not worth it.

monatis commented 1 year ago

we can use preprocessor directives to only expose C-compatible functions when used in C so that we can rename them to have the same names in both cases

I think functions with the same name with different signatures would make it too confusing. I'd prefer to either keep both versions or expose only C-compatible functions without _c suffix, simplifying the API.

denis-ismailaj commented 1 year ago

I think functions with the same name with different signatures would make it too confusing.

Yeah, that's a valid concern.

expose only C-compatible functions without _c suffix, simplifying the API

In that case, I would recommend making the entire public API in clip.h C-compatible to streamline it. This would mean converting some struct fields to simpler types and "hiding" the other structs which don't need to be public in clip.cpp. This seems to be the approach llama.cpp has taken as well.

I can open a PR to take a stab at this if that's fine with you.

monatis commented 1 year ago

I would recommend making the entire public API in clip.h C-compatible to streamline it.

Yeah, this is what I'm also thinking of.

I can open a PR to take a stab at this if that's fine with you.

That would be awesome. You can target this PR (or directly push to it) and then we can merge to the main branch at once.

Thanks for the colaboration!