ufal / udpipe

UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
Mozilla Public License 2.0

Compilation #24

Closed: pmarcis closed this issue 7 years ago

pmarcis commented 7 years ago

Hi, I compiled udpipe with g++ (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12)) and swig (SWIG Version 3.0.8) on Ubuntu 16.10, and it crashes with a segmentation fault when loading models:

./udpipe --tokenize --tag --parse ../../../models/en.model.output ../../../test_en.txt
Loading UDPipe model: Segmentation fault (core dumped)

The model was trained on a different computer with g++ (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)) and swig (SWIG Version 3.0.2) on Ubuntu 16.04.2 LTS.

I can neither load pre-trained models nor train new ones.

Do you have an idea what could be the issue?

pmarcis commented 7 years ago

A follow-up finding: compiling with MODE=debug works, and the Python bindings also work when compiled in debug mode. The release mode does not work.

foxik commented 7 years ago

Reproduced the segmentation fault on g++ 6.3 (current Debian Stretch), using the udpipe binary (no Python wrapper). Debugging.

foxik commented 7 years ago

It seems to be a vectorizer bug in g++, but it may also be that something is wrong with my code (i.e., some undefined behaviour such as an aliasing violation, although I cannot find anything). Adding volatile stops the vectorizer from kicking in.
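For readers unfamiliar with the two suspects named here, the following is a minimal sketch and not UDPipe's actual code. All function names are hypothetical. It assumes the kind of pattern a model loader might use (reading integers out of a raw byte buffer) and shows an aliasing-safe way to do that, plus how a volatile accumulator forces a loop to run one element at a time, which in practice keeps the compiler from vectorizing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from UDPipe): read a 32-bit value from a raw byte
// buffer without violating strict aliasing. Casting `data + offset` to
// `const std::uint32_t*` and dereferencing it would be undefined behaviour if
// the address is misaligned or no uint32_t object lives there; memcpy is
// well defined in both cases.
std::uint32_t read_u32(const unsigned char* data, std::size_t size, std::size_t offset) {
  assert(offset + sizeof(std::uint32_t) <= size);
  std::uint32_t value;
  std::memcpy(&value, data + offset, sizeof(value));
  return value;
}

// Plain accumulator: an optimizing compiler is free to vectorize this loop.
std::uint32_t sum_plain(const std::vector<std::uint32_t>& values) {
  std::uint32_t total = 0;
  for (std::uint32_t v : values) total += v;
  return total;
}

// Volatile accumulator: every read and write of `total` is an observable
// access that must happen in program order, so the additions stay scalar.
std::uint32_t sum_volatile(const std::vector<std::uint32_t>& values) {
  volatile std::uint32_t total = 0;
  for (std::uint32_t v : values) total = total + v;
  return total;
}
```

Both sum functions return the same result; only the generated code differs. If a crash disappears when volatile is added, that is consistent with a miscompiled vectorized loop, but it is equally consistent with undefined behaviour elsewhere that the optimizer happens to expose, so it does not settle the question by itself.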

Also note that you seem to be using master. There is currently heavy development going on in master, including changes to the serialization format for the new models. Therefore, I would recommend using the stable branch for model training.

pmarcis commented 7 years ago

Ok, thanks for the info! I will be relying on the stable branch in the future!