kylebgorman / pynini

Read-only mirror of Pynini
http://pynini.opengrm.org
Apache License 2.0

setup stuck at cdrewrite #5

Closed funderburkjim closed 6 years ago

funderburkjim commented 6 years ago

Hi. I'm interested in trying Pynini, but I can't get it installed.

I'm running Ubuntu trusty32 via a Vagrant/VirtualBox configuration under Windows 10. All the prerequisites, AFAIK, have been installed.

When 'sudo python setup.py install' is run, the compilations start. However, the compilation of cdrewrite.cc never finishes. The last thing I see in terminal is:

i686-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.7 -c src/pynini_cdrewrite.cc -o build/temp.linux-i686-2.7/src/pynini_cdrewrite.o -std=c++11 -Wno-unused-function -Wno-unused-local-typedef -funsigned-char
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]

I'm not knowledgeable about these kinds of complicated installations, so I have no idea what might need to be done to finish the installation. What do you suggest?

kylebgorman commented 6 years ago

Hi there! Building pynini_cdrewrite.cc is one of the longest steps of the process, because it translates to a huge amount of machine code. However, I am able to do it on relatively weak computers (for instance, a $100 ARM laptop with just 4 GB of RAM) when given enough time; it takes maybe an hour. If you can wait a bit longer and it's not running out of RAM, I would just let it run. Since you're compiling in a virtual machine, you may want to make sure your VM has enough computing resources for the compilation. Or, if that's not feasible, you can cross-compile, i.e., compile on another machine with more juice.

Another possibility (and this is a bit of a hack, but it will work for many people) is the following. I'm assuming you're using the most recent version of Pynini, which is 2.0.0:

What these lines say is that the cdrewrite function should work for log-weight arcs. I've actually never had any need to do this; you can just use tropical arcs and then convert to log arcs with pynini.map, if desired. The hack should make compilation go 3x as fast, at least for this file.

FYI, this warning is not an error. It's just a (very minor) bug in how Python is configured on certain platforms; as I understand it, Python thinks it should enable the strict-prototype warnings when compiling C++ code, which doesn't make sense for C++ (I don't even know what a strict prototype is):

cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]


funderburkjim commented 6 years ago

Thanks for the helpful comments. Based on them, I increased the VM memory to 2 GB, which did the trick: setup.py install worked and took about 10 minutes.

Another obstacle came up in the setup.py test step:

ImportError: libre2.so.0: cannot open shared object file: No such file or directory

https://stackoverflow.com/questions/8323794/re2-library-loading provided a suggestion: I returned to the re2 folder and ran sudo ldconfig.

Magically, now python setup.py test ran to completion with Ran 158 tests in 0.610s.
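For anyone hitting the same ImportError, here is a minimal sketch of a check that can be run before and after sudo ldconfig. It simply asks the dynamic loader to open a shared library by its soname and reports whether that succeeded. The helper name can_load is made up for illustration; ctypes.CDLL is part of the Python standard library. This only tells you whether the loader can find the file, not why it cannot.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`.

    `can_load` is a hypothetical helper for this thread, not part of Pynini.
    """
    try:
        ctypes.CDLL(soname)  # raises OSError if the loader cannot find/open it
    except OSError:
        return False
    return True
```

Calling can_load("libre2.so.0") should return False while the ImportError above occurs, and True once the loader cache has been refreshed and the library is on the search path.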

Also I can now import pynini and begin working with examples in your various tutorial materials on pynini and openfst.

Incidentally, my motivation is to understand how to apply something from your https://www.safaribooksonline.com/oriole/string-processing-with-pythons-pynini-edit-transducers tutorial.

Imagine that we have a query string and wish to find the string or strings in a fixed lexicon it is closest to, in terms of edit distance.

I'd like to be able to apply that in the context of a lexicon of about 250k Sanskrit words drawn from the headwords of various Sanskrit dictionaries.
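To make the task concrete, here is a rough baseline in plain Python. Note that this is not the edit-transducer approach from the tutorial: it is ordinary dynamic-programming Levenshtein distance (unit cost for insertion, deletion, and substitution), applied to every word in the lexicon in turn. The function names and the tiny example lexicon are made up for illustration. A brute-force scan like this may well be too slow for 250k headwords, but it is handy for checking the results of a faster method.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest(query, lexicon):
    """Return (best_distance, words_at_that_distance); (None, []) if empty."""
    best, matches = None, []
    for word in lexicon:
        d = edit_distance(query, word)
        if best is None or d < best:
            best, matches = d, [word]
        elif d == best:
            matches.append(word)
    return best, matches
```

For example, closest("darma", ["karma", "dharma"]) returns (1, ["karma", "dharma"]), since each candidate is one edit away from the query.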

kylebgorman commented 6 years ago

Sounds good, it should be well suited for that. I hope to write more about that kind of edit distance matching in the future ;)

I would guess that you would need at least 2 GB, if not 4 GB, to compile a large project like this.
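A quick way to see how much RAM the VM actually has before starting the build is to read /proc/meminfo on Linux. The sketch below parses the MemTotal line; the function names and the 1.8 GiB default are assumptions chosen for illustration (a VM configured with 2 GB typically reports slightly less than 2 GiB, because the kernel reserves some memory), not anything Pynini itself checks.

```python
def mem_total_gib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 ** 2)
    raise ValueError("MemTotal line not found")


def enough_memory(meminfo_text, minimum_gib=1.8):
    """True if total RAM meets the (assumed) threshold for the build."""
    return mem_total_gib(meminfo_text) >= minimum_gib


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory summary (Linux only)."""
    with open(path) as f:
        return f.read()
```

Inside the VM, enough_memory(read_meminfo()) gives a yes/no answer; if it is False, raising the VM's memory allocation, as was done earlier in this thread, is the first thing to try.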