TALP-UPC / FreeLing

FreeLing project source code

Build failure on macOS #112

Closed fxcoudert closed 3 years ago

fxcoudert commented 3 years ago

As part of rebuilding Freeling with the latest Boost (1.74.0) on macOS Homebrew (https://github.com/Homebrew/homebrew-core/pull/62549), we're building 4.2 from source with:

cmake .. -DCMAKE_C_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/freeling/4.2_1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_FIND_FRAMEWORK=LAST -DCMAKE_VERBOSE_MAKEFILE=ON -Wno-dev -DCMAKE_OSX_SYSROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk
make install

and we get this error:

-- -- Creating ca/balear dictionary...
file "/tmp/freeling-20201011-45239-1r2mni5/FreeLing-4.2/data/ca/balear/dictionary/header" not found
CMake Error at cmake_install.cmake:58 (message):
  multiword fusion failed with error: 255

lluisp commented 3 years ago

That file is in the repo (https://github.com/TALP-UPC/FreeLing/blob/v4.2/data/ca/balear/dictionary/header), so it should be in the sources. Can you check that?

fxcoudert commented 3 years ago

No, it is not in the released file https://github.com/TALP-UPC/FreeLing/releases/download/4.2/FreeLing-src-4.2.tar.gz:

[Screenshot, 2020-10-17 at 12:06:40]

There is no data/ca subdirectory.

lluisp commented 3 years ago

Uhm, I see... you also need to download the language data file FreeLing-lang-src-4.2. After installing, you can delete the folders of unneeded languages from /usr/local/share/freeling/data.

I'll fix the installation scripts to take that into account.

thanks!
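The cleanup step described above could be scripted roughly as follows. This is a minimal sketch, not part of FreeLing: `remove_languages` is a hypothetical helper name, and it assumes each language lives in its own folder named after its language code directly under the data directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with FreeLing): delete the folders of
# unneeded languages from an installed FreeLing data directory.
# Usage: remove_languages <data-dir> <lang> [<lang> ...]
remove_languages() {
    data_dir=$1
    shift
    for lang in "$@"; do
        # Skip empty names so that the data directory itself is never removed.
        [ -n "$lang" ] || continue
        # Only touch folders that actually exist under data_dir.
        if [ -d "$data_dir/$lang" ]; then
            rm -rf "$data_dir/$lang"
        fi
    done
}

# Example (not run here), using the install path mentioned in this thread:
# remove_languages /usr/local/share/freeling/data ru cy hr
```

Taking a list of languages to remove, rather than a list to keep, means any folder that is not named explicitly is left alone.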

fxcoudert commented 3 years ago

Hum, this is not what the documentation says:

If you want to use another of the languages supported by FreeLing, download also the language file or FreeLing-langs-4.2.tar.gz or FreeLing-langs-4.2.zip and uncompress it

It seems a shame, for our build process, to have to download another 1 GB of data and remove it later. This wasn't the case before; could it possibly go back to the earlier (simpler) way?

lluisp commented 3 years ago

The new version adds neural models and word embeddings for several languages, which take a lot of space and push the files over GitHub's maximum size for a release file. That is why the files are separated.

The problem is that the installation script was not properly adjusted, and tries to install all languages. I will fix this as soon as I can.

Meanwhile, there are two solutions:

1) The one I suggested before: download the language packages and remove them afterwards.
2) Alternatively, before running cmake, you can edit CMakeLists.txt, locate lines 72-73, and remove the unneeded languages.

That is, these lines:

#### Data installation hooks
SET(languages "as;ca;cs;cy;de;en;es;fr;gl;hr;it;nb;pt;ru;sl")
SET(variants "es/es-old;es/es-ar;es/es-cl;ca/balear;ca/valencia")    

should be changed to:

#### Data installation hooks
SET(languages "en;es;pt")
SET(variants "es/es-old;es/es-ar;es/es-cl")    

If you don't want the Spanish variants, you can use SET(variants "") in line 73. If you don't want, say, Portuguese, you can use SET(languages "en;es") in line 72.
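For an automated build, the same edit could be applied non-interactively with sed. The sketch below is only an illustration: `restrict_languages` is a hypothetical helper name, and it assumes the two SET(...) lines start at the beginning of a line, as in the excerpt above. It matches them by content rather than by line number.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeLing): rewrite the languages and
# variants lists in CMakeLists.txt before running cmake.
# Usage: restrict_languages <CMakeLists.txt> <languages> <variants>
restrict_languages() {
    file=$1
    langs=$2
    variants=$3
    # Write to a temporary file and move it into place. This avoids
    # `sed -i`, whose syntax differs between GNU sed and BSD (macOS) sed.
    # `|` is the sed delimiter because variant names contain `/`.
    sed -e "s|^SET(languages \".*\")|SET(languages \"$langs\")|" \
        -e "s|^SET(variants \".*\")|SET(variants \"$variants\")|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example (not run here): keep English, Spanish and Portuguese only.
# restrict_languages CMakeLists.txt "en;es;pt" "es/es-old;es/es-ar;es/es-cl"
```

Passing an empty string as the third argument produces SET(variants ""), which drops all variants.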

fxcoudert commented 3 years ago

Thanks for the workaround. A long-term fix would be good for distribution, so that the default source package can be built without editing the CMake files.

lluisp commented 3 years ago

Fixed on the master branch. Only the languages present in the sources are installed. Thanks.
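The idea behind that fix, installing only the languages whose data is actually there, can be illustrated outside CMake. The sketch below is not the code on the master branch: `present_languages` is a hypothetical helper that filters a semicolon-separated list down to the entries that have a folder under a given data directory.

```shell
#!/bin/sh
# Hypothetical illustration (not the actual fix in FreeLing's CMake files):
# print the subset of a semicolon-separated language list whose folders
# exist under the given data directory.
# Usage: present_languages <data-dir> "<lang>;<lang>;..."
present_languages() {
    data_dir=$1
    candidates=$2
    result=""
    old_ifs=$IFS
    IFS=';'                      # split the candidate list on semicolons
    for lang in $candidates; do
        if [ -d "$data_dir/$lang" ]; then
            # Append, adding a separator only when result is non-empty.
            result="${result:+$result;}$lang"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Example (not run here): list which languages a source tree provides.
# present_languages data "as;ca;cs;cy;de;en;es;fr;gl;hr;it;nb;pt;ru;sl"
```

The same function works for variants such as es/es-ar, since each entry is simply checked as a path relative to the data directory.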