CLD2Owners / cld2

Compact Language Detector 2
Apache License 2.0

Compilation issues in Visual Studio #31

Open ghost opened 9 years ago

ghost commented 9 years ago

Originally reported on Google Code with ID 31

I am trying to compile the Chromium CLD2 code in Visual Studio 2013. I am creating
a .NET wrapper for the library, so I have added all the source files to my CLR project.

Now, whenever I compile, I get linker errors like this one:

    error LNK2005: "struct CLD2::CLD2TableSummary const CLD2::kCjkDeltaBi_obj" (?kCjkDeltaBi_obj@CLD2@@3UCLD2TableSummary@1@B) already defined in cld_generated_cjk_delta_bi_32.obj
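With many errors of this kind, it can help to pull the duplicated symbol and the object file out of each message so they can be grouped. The following is a minimal sketch, assuming each LNK2005 message has been joined onto a single line in the shape shown above. `DuplicateSymbol` and `ParseLnk2005` are hypothetical names for illustration; they are not part of CLD2 or the MSVC toolchain.

```cpp
#include <regex>
#include <string>

// One parsed LNK2005 diagnostic: the duplicated symbol and the object
// file that the linker says already defines it.
struct DuplicateSymbol {
    std::string symbol;       // e.g. CLD2::kCjkDeltaBi_obj
    std::string object_file;  // e.g. cld_generated_cjk_delta_bi_32.obj
};

// Hypothetical helper: returns true and fills `out` when `line` looks like
// an LNK2005 message in the format quoted in this issue. The pattern is an
// assumption based on that single example, not on a documented format.
bool ParseLnk2005(const std::string& line, DuplicateSymbol& out) {
    // Group 1: the last identifier inside the quoted declaration.
    // Group 2: the .obj file named after "already defined in".
    static const std::regex pattern(
        R"re(LNK2005: "[^"]*?([A-Za-z_][A-Za-z0-9_:]*)" .*already defined in (\S+\.obj))re");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) {
        return false;
    }
    out.symbol = match[1].str();
    out.object_file = match[2].str();
    return true;
}
```

Feeding every line of the build log through `ParseLnk2005` and collecting the results per symbol would show which tables are defined more than once, and in which object files.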

These all seem to be related to the 'generated' files.

The problem is that I have a lot of these, and I am not sure which ones I should exclude
and which I should keep and use in my code.

Here is a list of all the generated files that came with the CLD2 code.

    cld_generated_cjk_uni_prop_80.cc
    cld_generated_score_quad_octa_2.cc
    cld_generated_score_quad_octa_0122.cc
    cld_generated_score_quad_octa_0122_2.cc
    cld_generated_score_quad_octa_1024_256.cc
    cld_generated_cjk_delta_bi_4.cc
    cld_generated_cjk_delta_bi_32.cc
    cld2_generated_octa2_dummy.cc
    cld2_generated_quad0122.cc
    cld2_generated_quad0720.cc
    cld2_generated_quadchrome_2.cc
    cld2_generated_quadchrome_16.cc
    cld2_generated_cjk_compatible.cc
    cld2_generated_deltaocta0122.cc
    cld2_generated_deltaocta0527.cc
    cld2_generated_deltaoctachrome.cc
    cld2_generated_distinctocta0122.cc
    cld2_generated_distinctocta0527.cc
    cld2_generated_distinctoctachrome.cc

The naming convention of these suggests that I should only be using one of each group.
At least that is how I think I should use them, as I am not really an expert in encoding
or in how CLD2 works, and I could not find any references online explaining how to
configure it.

I tried eliminating the linking errors by keeping only one file from each generated group.
For example, from `cld_generated_cjk_delta_bi_4` and `cld_generated_cjk_delta_bi_32`
I kept the 32 version, and so on for the rest of the files.

This made CLD compile, yet when I tested it with different languages I noticed that
the scores were far off and it was behaving inexplicably badly.

I am not trying to support all languages. I only need to support Latin-script languages
along with Hebrew, Arabic, Japanese and Chinese.

Can someone please explain how to configure CLD2 to compile and work correctly?

Reported by redserpent7 on 2015-03-30 05:57:39

jasonriesa commented 9 years ago

Hi @redserpent7, sorry for the delay in response; is this still an issue for you? See the table on this page: https://github.com/CLD2Owners/cld2/wiki/CLD2-Full-Version

Some of these generated files are for the "subset version" and some for the "full version" of CLD2. I recommend using the "full version" data if possible. When doing so, I believe you can avoid including data from the "subset version", which may have overlapping definitions.
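One way to see which generated files overlap is to scan them for the table symbols they define. Below is a minimal sketch, assuming each generated file defines its table with text of the form `CLD2TableSummary kName_obj = {`. That form is inferred from the symbol in the LNK2005 message, not checked against every file. `FindDuplicateTables` is a hypothetical helper, and the caller is assumed to have already read each file into a string.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: given file name -> file text, returns every table
// symbol that is defined in two or more files, with the files defining it.
std::map<std::string, std::vector<std::string>> FindDuplicateTables(
    const std::map<std::string, std::string>& file_contents) {
    // Assumed definition shape: "CLD2TableSummary kSomething_obj ="
    static const std::regex definition(
        R"re(CLD2TableSummary\s+(k[A-Za-z0-9_]+_obj)\s*=)re");

    std::map<std::string, std::vector<std::string>> owners;
    for (const auto& entry : file_contents) {
        const std::string& text = entry.second;
        for (std::sregex_iterator it(text.begin(), text.end(), definition), end;
             it != end; ++it) {
            owners[(*it)[1].str()].push_back(entry.first);
        }
    }

    // Drop symbols that only one file defines; those cannot clash.
    for (auto it = owners.begin(); it != owners.end();) {
        if (it->second.size() < 2) {
            it = owners.erase(it);
        } else {
            ++it;
        }
    }
    return owners;
}
```

Each entry in the result names a symbol the linker would report, together with the files that clash over it. Which file of each clashing set to keep is a separate question that the table on the wiki page is meant to answer; this sketch only locates the overlaps.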

redserpent7 commented 9 years ago

Hi @jasonriesa, thanks for the reply. Since I did not receive a reply, I thought CLD was dead, so I switched to TextCat. I might give CLD another go now. However, could you explain the differences between the full and subset versions?