ashvardanian / StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
https://ashvardanian.com/posts/stringzilla/
Apache License 2.0
2.05k stars 66 forks source link

C++ compatibility errors with stringzilla.h #84

Closed WillisMedwell closed 6 months ago

WillisMedwell commented 7 months ago

Using clang-cl this error is generated when including stringzilla.hpp.

C:/dev/utily/build-native/_deps/stringzilla-src/include\stringzilla/stringzilla.h:1946:50: warning: use of old-style cast [-Wold-style-cast]
    if (h_length < n_length || !n_length) return SZ_NULL;
                                                 ^~~~~~~
C:/dev/utily/build-native/_deps/stringzilla-src/include\stringzilla/stringzilla.h:1095:18: note: expanded from macro 'SZ_NULL'
#define SZ_NULL ((void *)0)
                 ^       ~
C:/dev/utily/build-native/_deps/stringzilla-src/include\stringzilla/stringzilla.h:1946:50: error: cannot initialize return object of type 'sz_cptr_t' (aka 'const char *') with an rvalue of type 'void *'
    if (h_length < n_length || !n_length) return SZ_NULL;
                                                 ^~~~~~~
C:/dev/utily/build-native/_deps/stringzilla-src/include\stringzilla/stringzilla.h:1095:17: note: expanded from macro 'SZ_NULL'
#define SZ_NULL ((void *)0)

To resolve this issue, I think you can wrap the include in file: stringzilla.hpp line 62: From

#include <stringzilla/stringzilla.h>

To

extern "C" {
     #include <stringzilla/stringzilla.h>
}

But I'm not 100%, probably worth some investigating.

In addition, fetch content section probably isn't verbose enough (especially considering the joy of working with cmake)

   # without this fetchcontent will fail with no indicative error message
   include(FetchContent)

    FetchContent_Declare(
        stringzilla 
        GIT_REPOSITORY https://github.com/ashvardanian/stringzilla.git 
        GIT_TAG main     # Need this or the operation fails
        GIT_SHALLOW FALSE
    )
    FetchContent_MakeAvailable(stringzilla)

    # Not sure how you intend for users to link/find your library as make_available 
    # doesn't add it to include path or anything. So, you have to do something like this.
    target_link_libraries(my_proj PRIVATE stringzilla)
ashvardanian commented 7 months ago

Thanks for suggestions, @WillisMedwell! Both sound reasonable! Can you please open a PR into main-dev?

WillisMedwell commented 7 months ago

It's a bit difficult to debug when there are no example projects to validate the changes, would you consider adding an examples folder with different language projects? Due to the many supported languages, it seems difficult at the moment to validate each one.

You could then run an action to validate each of them, and it also provides users with examples of how to integrate with their python, c++, and other projects.

I would be happy to do this for C++ if its something that aligns with your vision of the repo?

ashvardanian commented 7 months ago

Not sure what you mean, @WillisMedwell. Have you checked the CONTRIBUTING.md and the scripts folder? In VSCode you will also get tasks and launchers preconfigured once you open the project.

WillisMedwell commented 7 months ago

@ashvardanian my bad i didnt realise the testing suite was under ./scripts.

But my point being that you could have an ./example folder with different ways of installing the lib for c++, python ect. Then you could validate, demonstrate & scope the intended way of using the lib. I only suggest this as 'zilla supports many different languages and it might be difficult to know when stuff breaks.

ashvardanian commented 7 months ago

@WillisMedwell, there is a CI pipeline on GitHub. We don't merge until the tests for all languages pass and environments pass.

Regarding a separate folder for examples, I try to avoid branching the directory tree too much. Cause then I'll have to make separate folders for unit tests, fuzzy tests, stress tests, benchmarks... across all languages.

Hope this makes sense 🤗