fbkarsdorp / homebrew-lamachine

Brew formulas for installing NLP software developed by the Language Machines research group

XML2 not found #4

Closed evanmiltenburg closed 6 years ago

evanmiltenburg commented 6 years ago

I get this output:

Last 15 lines from /Users/Emiel/Library/Logs/Homebrew/ticcutils/01.configure:
checking whether the Boost::Regex library is available... yes
checking for exit in -lboost_regex-mt... yes
checking for pkg-config... /usr/local/opt/pkg-config/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for XML2... no
configure: error: Package requirements (libxml-2.0 >= 2.6.16 ) were not met:

No package 'libxml-2.0' found

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables XML2_CFLAGS
and XML2_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.

This seems to be a known issue. I'm not sure what is the best way to resolve it in the instructions for installing LaMachine.
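For anyone hitting the same configure error, here is a minimal sketch of the workaround the error message itself suggests: pointing pkg-config at Homebrew's keg-only libxml2. The `lib/pkgconfig` subdirectory is an assumption based on the usual Homebrew layout, and `prepend_pkg_config_path` is a hypothetical helper name, not something from LaMachine.

```shell
# Hypothetical helper: put a directory at the front of PKG_CONFIG_PATH.
prepend_pkg_config_path() {
    if [ -n "${PKG_CONFIG_PATH:-}" ]; then
        PKG_CONFIG_PATH="$1:$PKG_CONFIG_PATH"
    else
        PKG_CONFIG_PATH="$1"
    fi
    export PKG_CONFIG_PATH
}

# Assumed location of libxml-2.0.pc for the keg-only libxml2.
prepend_pkg_config_path /usr/local/opt/libxml2/lib/pkgconfig

# Repeat the check that configure performs, if pkg-config is on PATH.
if command -v pkg-config >/dev/null 2>&1; then
    if pkg-config --exists 'libxml-2.0 >= 2.6.16'; then
        echo "libxml-2.0 found"
    else
        echo "libxml-2.0 still not found"
    fi
fi
```

This only affects the current shell session. Whether it is enough for a `brew install` run is not confirmed here, since Homebrew may build in its own environment.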

fbkarsdorp commented 6 years ago

Thanks. Will look into this. In the meantime, I added some of the previously brewed bottles that can be poured. Could you update and try again?

evanmiltenburg commented 6 years ago

The brew info output seems useful:

Emiel$ brew info libxml2
libxml2: stable 2.9.3, HEAD [keg-only]
GNOME XML library
http://xmlsoft.org
/usr/local/Cellar/libxml2/2.9.2 (275 files, 10M)
  Poured from bottle
/usr/local/Cellar/libxml2/2.9.3 (275 files, 9.7M)
  Built from source
From: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/libxml2.rb
==> Options
--universal
    Build a universal binary
--with-python
    Build with python support
--HEAD
    Install HEAD version
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local.

OS X already provides this software and installing another version in
parallel can cause all kinds of trouble.

Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you'll need to add to your
build variables:

    LDFLAGS:  -L/usr/local/opt/libxml2/lib
    CPPFLAGS: -I/usr/local/opt/libxml2/include

evanmiltenburg commented 6 years ago

How do I update? (I don't use brew very much.)

fbkarsdorp commented 6 years ago

Simply do brew update and it will update the tap.

evanmiltenburg commented 6 years ago

Didn't work for me.

I couldn't update the tap because I have an older brew version that was installed before OS X changed /usr/local to not be writable anymore. So I untapped and then tapped again instead.
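The untap and re-tap step could look roughly like this. The tap name is taken from the install log in this thread; the `run` and `retap` helpers are hypothetical and only print the commands unless `DRY_RUN=0` is set.

```shell
# Tap name as it appears in the install log in this thread.
TAP="fbkarsdorp/lamachine"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the tap and add it again, which fetches a fresh copy of the formulas.
retap() {
    run brew untap "$1" &&
    run brew tap "$1"
}

retap "$TAP"
```

Running it with `DRY_RUN=0` performs the two `brew` calls for real.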

fbkarsdorp commented 6 years ago

Hi @evanmiltenburg I made several changes to the formulas, and updated to the latest releases of the software. libxml2 should be an automatic dependency now. Would you care to try again?

evanmiltenburg commented 6 years ago

Installing seems to work. Here's the full log:

Emiels-MBP:Dutch Emiel$ brew install ucto
Warning: You are using OS X 10.13.
We do not provide support for this pre-release version.
You may encounter build failures or other breakages.
==> Installing ucto from fbkarsdorp/lamachine
==> Installing dependencies for fbkarsdorp/lamachine/ucto: libtextcat, bzip2, libxml2, zlib, automake, libtar, ticcutils, libfolia
==> Installing fbkarsdorp/lamachine/ucto dependency: libtextcat
==> Downloading http://software.wise-guys.nl/download/libtextcat-2.2.tar.gz
==> Downloading from https://software.wise-guys.nl/download/libtextcat-2.2.tar.gz
######################################################################## 100.0%
==> ./configure --disable-silent-rules --prefix=/usr/local/Cellar/libtextcat/2.2
==> make install
🍺  /usr/local/Cellar/libtextcat/2.2: 174 files, 730.7K, built in 22 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: bzip2
==> Downloading http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
######################################################################## 100.0%
==> make install PREFIX=/usr/local/Cellar/bzip2/1.0.6_1
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local.

OS X already provides this software and installing another version in
parallel can cause all kinds of trouble.

Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you'll need to add to your
build variables:

    LDFLAGS:  -L/usr/local/opt/bzip2/lib
    CPPFLAGS: -I/usr/local/opt/bzip2/include

==> Summary
🍺  /usr/local/Cellar/bzip2/1.0.6_1: 21 files, 397.5K, built in 5 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: libxml2
==> Downloading http://xmlsoft.org/sources/libxml2-2.9.3.tar.gz
Already downloaded: /Library/Caches/Homebrew/libxml2-2.9.3.tar.gz
==> ./configure --prefix=/usr/local/Cellar/libxml2/2.9.3 --without-python --without-lzma
==> make
==> make install
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local.

OS X already provides this software and installing another version in
parallel can cause all kinds of trouble.

Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you'll need to add to your
build variables:

    LDFLAGS:  -L/usr/local/opt/libxml2/lib
    CPPFLAGS: -I/usr/local/opt/libxml2/include

==> Summary
🍺  /usr/local/Cellar/libxml2/2.9.3: 275 files, 9.7M, built in 1 minute 13 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: zlib
==> Downloading http://zlib.net/zlib-1.2.8.tar.gz

curl: (22) The requested URL returned error: 404 Not Found
Trying a mirror...
==> Downloading https://downloads.sourceforge.net/project/libpng/zlib/1.2.8/zlib-1.2.8.tar.gz
==> Downloading from https://netcologne.dl.sourceforge.net/project/libpng/zlib/1.2.8/zlib-1.2.8.tar.gz
######################################################################## 100.0%
==> Patching
patching file configure
==> ./configure --prefix=/usr/local/Cellar/zlib/1.2.8
==> make install
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local.

OS X already provides this software and installing another version in
parallel can cause all kinds of trouble.

Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you'll need to add to your
build variables:

    LDFLAGS:  -L/usr/local/opt/zlib/lib
    CPPFLAGS: -I/usr/local/opt/zlib/include

==> Summary
🍺  /usr/local/Cellar/zlib/1.2.8: 9 files, 350.8K, built in 10 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: automake
==> Downloading http://ftpmirror.gnu.org/automake/automake-1.15.tar.xz
==> Downloading from http://mirror.koddos.net/gnu/automake/automake-1.15.tar.xz
######################################################################## 100.0%
==> ./configure --prefix=/usr/local/Cellar/automake/1.15
==> make install
🍺  /usr/local/Cellar/automake/1.15: 130 files, 3.2M, built in 15 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: libtar
==> Cloning http://repo.or.cz/libtar.git
Cloning into '/Library/Caches/Homebrew/libtar--git'...
remote: Counting objects: 324, done.
remote: Total 324 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (324/324), 190.99 KiB | 0 bytes/s, done.
Resolving deltas: 100% (177/177), done.
Checking connectivity... done.
Note: checking out '0907a9034eaf2a57e8e4a9439f793f3f05d446cd'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

==> Checking out tag v1.2.20
==> autoreconf --force --install
==> ./configure --prefix=/usr/local/Cellar/libtar/1.2.20 --mandir=/usr/local/Cellar/libtar/1.2.20/share/man
==> make install
🍺  /usr/local/Cellar/libtar/1.2.20: 84 files, 140.7K, built in 45 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: ticcutils
==> Downloading https://github.com/LanguageMachines/ticcutils/releases/download/v0.18/ticcutils-0.18.tar.gz
==> Downloading from https://github-production-release-asset-2e65be.s3.amazonaws.com/9028755/b65a88ba-157f-11e8-8ffd-de0d9b062976?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20180220%2Fus-east-1%2Fs3%2Faws4_re
######################################################################## 100.0%
==> ./configure --disable-silent-rules --prefix=/usr/local/Cellar/ticcutils/0.18
==> make install
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink include/ticcutils/CommandLine.h
Target /usr/local/include/ticcutils/CommandLine.h
already exists. You may want to remove it:
  rm '/usr/local/include/ticcutils/CommandLine.h'

To force the link and overwrite all conflicting files:
  brew link --overwrite ticcutils

To list all files that would be deleted:
  brew link --overwrite --dry-run ticcutils

Possible conflicting files are:
/usr/local/include/ticcutils/CommandLine.h
/usr/local/include/ticcutils/Configuration.h
/usr/local/include/ticcutils/FileUtils.h
/usr/local/include/ticcutils/LogBuffer.h
/usr/local/include/ticcutils/LogStream.h
/usr/local/include/ticcutils/PrettyPrint.h
/usr/local/include/ticcutils/StringOps.h
/usr/local/include/ticcutils/Tar.h
/usr/local/include/ticcutils/Timer.h
/usr/local/include/ticcutils/TreeHash.h
/usr/local/include/ticcutils/Trie.h
/usr/local/include/ticcutils/UnitTest.h
/usr/local/include/ticcutils/Version.h
/usr/local/include/ticcutils/XMLtools.h
/usr/local/include/ticcutils/bz2stream.h
/usr/local/include/ticcutils/gzstream.h
/usr/local/include/ticcutils/zipper.h
/usr/local/share/man/man1/ticc_logstream.1
/usr/local/share/man/man1/ticc_prettyprint.1
/usr/local/share/man/man1/ticc_string.1
/usr/local/share/man/man1/ticc_unit_test.1
/usr/local/share/man/man1/ticcutils.1
/usr/local/lib/libticcutils.a
/usr/local/lib/libticcutils.dylib -> /usr/local/lib/libticcutils.2.dylib
/usr/local/lib/pkgconfig/ticcutils.pc
==> Summary
🍺  /usr/local/Cellar/ticcutils/0.18: 40 files, 1.5M, built in 47 seconds
==> Installing fbkarsdorp/lamachine/ucto dependency: libfolia
==> Downloading https://github.com/LanguageMachines/libfolia/releases/download/v1.12/libfolia-1.12.tar.gz
==> Downloading from https://github-production-release-asset-2e65be.s3.amazonaws.com/9030036/982b6b8a-1583-11e8-9132-19db74d39742?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20180220%2Fus-east-1%2Fs3%2Faws4_re
######################################################################## 100.0%
==> ./configure --disable-silent-rules --prefix=/usr/local/Cellar/libfolia/1.12
==> make install
==> make check
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink bin/folialint
Target /usr/local/bin/folialint
already exists. You may want to remove it:
  rm '/usr/local/bin/folialint'

To force the link and overwrite all conflicting files:
  brew link --overwrite libfolia

To list all files that would be deleted:
  brew link --overwrite --dry-run libfolia

Possible conflicting files are:
/usr/local/bin/folialint
/usr/local/include/libfolia/folia.h
/usr/local/lib/libfolia.a
/usr/local/lib/libfolia.dylib -> /usr/local/lib/libfolia.3.dylib
/usr/local/lib/pkgconfig/folia.pc
==> Summary
🍺  /usr/local/Cellar/libfolia/1.12: 18 files, 9.6M, built in 44 seconds
==> Installing fbkarsdorp/lamachine/ucto
==> Downloading https://github.com/LanguageMachines/ucto/releases/download/v0.12/ucto-0.12.tar.gz
==> Downloading from https://github-production-release-asset-2e65be.s3.amazonaws.com/9028617/157662c4-158a-11e8-9b21-44e6e348c541?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20180220%2Fus-east-1%2Fs3%2Faws4_re
######################################################################## 100.0%
==> ./configure --disable-silent-rules --prefix=/usr/local/Cellar/ucto/0.12
==> make install
==> make check
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink bin/ucto
Target /usr/local/bin/ucto
already exists. You may want to remove it:
  rm '/usr/local/bin/ucto'

To force the link and overwrite all conflicting files:
  brew link --overwrite ucto

To list all files that would be deleted:
  brew link --overwrite --dry-run ucto

Possible conflicting files are:
/usr/local/bin/ucto
/usr/local/include/ucto/tokenize.h
/usr/local/share/man/man1/ucto.1
/usr/local/lib/libucto.a
/usr/local/lib/libucto.dylib -> /usr/local/lib/libucto.2.dylib
/usr/local/lib/pkgconfig/ucto.pc
==> Summary
🍺  /usr/local/Cellar/ucto/0.12: 24 files, 1.4M, built in 40 seconds

evanmiltenburg commented 6 years ago

But when I run it from the command line, I get this:

Emiel$ ucto
dyld: Symbol not found: __ZNK5boost9re_detail31cpp_regex_traits_implementationIcE17transform_primaryEPKcS4_
  Referenced from: /usr/local/lib/libticcutils.2.dylib
  Expected in: /usr/local/lib/libboost_regex-mt.dylib
 in /usr/local/lib/libticcutils.2.dylib
Abort trap: 6

fbkarsdorp commented 6 years ago

Ah, yes. The old ticcutils and libfolia are still linked. Best to unlink or, even better, uninstall them before installing the new versions.
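A rough sketch of that cleanup, using only the `brew` subcommands that appear in the log above plus `brew unlink`. The formula list is taken from the `brew link` errors in the log. The `run` and `relink` helpers are hypothetical and print the commands unless `DRY_RUN=0` is set. Whether `brew unlink` alone removes the old files depends on how they were installed, which this thread does not say.

```shell
# Formulas the install log reported as conflicting with older files.
FORMULAS="ticcutils libfolia ucto"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Drop the stale links, preview the overwrite, then link the new build.
relink() {
    run brew unlink "$1"
    run brew link --overwrite --dry-run "$1"
    run brew link --overwrite "$1"
}

for formula in $FORMULAS; do
    relink "$formula"
done
```

The `--dry-run` step lists the files that would be deleted, so it is worth reading that output before letting the last command run for real.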

evanmiltenburg commented 6 years ago

Will try! And I suppose brew link --overwrite ucto should also work.

evanmiltenburg commented 6 years ago

Ucto works!

No out of the box support for Dutch, it seems:

Emiel$ ucto -L nl test.txt
ucto: unsupported language 'nld'
ucto: The uctodata package seems not to be installed.
ucto: You can use '-L generic' to run a simple default tokenizer.
ucto: Installing uctodata is highly recommended.

But the generic tokenizer gives the expected output:

Emiel$ ucto -L generic test.txt
ucto: inputfile = test.txt
ucto: outputfile =
ucto:tokconfig-generic: version=0.2
This is a test . <utt> How does this tool work ? <utt>

evanmiltenburg commented 6 years ago

Frog doesn't work yet: it doesn't seem to have access to the configuration files for English.

(I did run ln -s /usr/local/opt/frogdata/share/frog/ /usr/local/Cellar/frog/0.14/share, as instructed.)

Emiel$ frog --language=eng -t test.txt -o frogged.txt
frog 0.14 (c) CLTS, ILK 1998 - 2018
CLST  - Centre for Language and Speech Technology,Radboud University
ILK   - Induction of Linguistic Knowledge Research Group,Tilburg University
based on [ucto 0.12, libfolia 1.12, timbl 6.4.10, ticcutils 0.18, mbt 3.3.1]
configuration file: /usr/local/Cellar/frog/0.14/share/frog/eng/frog.cfg not found
Did you correctly install the frogdata package for language=eng?
using fallback configuration file: /usr/local/Cellar/frog/0.14/share/frog/frog.cfg
unable to read configuration from /usr/local/Cellar/frog/0.14/share/frog/frog.cfg
failed to read configuration from '/usr/local/Cellar/frog/0.14/share/frog/frog.cfg' !!
Did you correctly install the frogdata package for language=eng?
frog-:fatal error: init failed

fbkarsdorp commented 6 years ago

Let's first resolve the uctodata issue. A caveat is printed after installing uctodata, but I guess it could be clearer. Could you do a brew info uctodata and follow the instructions, @evanmiltenburg ?

fbkarsdorp commented 6 years ago

Re: frogdata. I don't think there is support for English with frog. Ucto supports many languages, though. After linking the data, that should work.

evanmiltenburg commented 6 years ago

Output:

Emiel$ brew info uctodata
fbkarsdorp/lamachine/uctodata: stable 0.5
Data for Unicode Tokenizer Ucto
https://languagemachines.github.io/ucto
Not installed
From: https://github.com/fbkarsdorp/homebrew-lamachine/blob/master/Formula/uctodata.rb
==> Dependencies
Build: pkg-config ✔
Required: ucto ✔
==> Caveats
To use the uctodata with ucto without specifying a complete path to
one of the configuration files, run this:
  ln -s /usr/local/Cellar/uctodata/0.5/share/ucto/* /usr/local/opt/ucto/share/ucto/

I initially missed the "not installed" message, so I ran the command and tested ucto again:

Emiels-MBP:~ Emiel$ ln -s /usr/local/Cellar/uctodata/0.5/share/ucto/* /usr/local/opt/ucto/share/ucto/
Emiels-MBP:~ Emiel$ ucto -L en test.txt
ucto: unsupported language 'eng'
ucto: The uctodata package seems not to be installed.
ucto: You can use '-L generic' to run a simple default tokenizer.
ucto: Installing uctodata is highly recommended.
Emiels-MBP:~ Emiel$ ucto -L eng test.txt
ucto: unsupported language 'eng'
ucto: The uctodata package seems not to be installed.
ucto: You can use '-L generic' to run a simple default tokenizer.
ucto: Installing uctodata is highly recommended.

Whoops! I thought uctodata would be installed automatically with ucto.

So I tried to install uctodata.

Emiels-MBP:~ Emiel$ brew install uctodata
Warning: You are using OS X 10.13.
We do not provide support for this pre-release version.
You may encounter build failures or other breakages.
==> Installing uctodata from fbkarsdorp/lamachine
==> Downloading https://github.com/LanguageMachines/uctodata/releases/download/v0.5/uctodata-0.5.tar.gz
==> Downloading from https://github-production-release-asset-2e65be.s3.amazonaws.com/62293517/4526a69a-b3fc-11e7-9d84-4f9ea8b54ff4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20180220%2Fus-east-1%2Fs3%2Faws4_r
######################################################################## 100.0%
==> ./configure --prefix=/usr/local/Cellar/uctodata/0.5 --disable-silent-rules
==> make install
Error: An unexpected error occurred during the `brew link` step
The formula built, but is not symlinked into /usr/local
No such file or directory @ realpath_rec - /usr/local/Cellar/uctodata/0.5/share/ucto/*
Error: No such file or directory @ realpath_rec - /usr/local/Cellar/uctodata/0.5/share/ucto/*

That doesn't sound good.

fbkarsdorp commented 6 years ago

Strange. Is there something in /usr/local/opt/uctodata/share/ucto/?

If so, do

ln -s /usr/local/opt/uctodata/share/ucto/* /usr/local/opt/ucto/share/ucto/

evanmiltenburg commented 6 years ago

RE: frog, I tried: frog --language=nld -t test.txt -o frogged.txt

Output:

frog 0.14 (c) CLTS, ILK 1998 - 2018
CLST  - Centre for Language and Speech Technology,Radboud University
ILK   - Induction of Linguistic Knowledge Research Group,Tilburg University
based on [ucto 0.12, libfolia 1.12, timbl 6.4.10, ticcutils 0.18, mbt 3.3.1]
frog-:config read from: /usr/local/Cellar/frog/0.14/share/frog/nld/frog.cfg
frog-:configuration version = 0.12
frog-tok-:Language List =[nld]
frog-tok-:Initiating tokeniser...
frog-tok-:Cannot read Tokeniser settingsfile tokconfig-nld
frog-tok-:Unsupported language? (Did you install the uctodata package?)
frog-mblem:Initiating lemmatizer...
frog-mbma-:Initiating morphological analyzer...
frog-tagger-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.lex.ambi.05 (229170 words).
frog-tagger-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.top500 (500 words).
frog-tagger-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.known.dddwfWawa...
frog-tagger-tagger-mbt-:  case-base for known words read.
frog-tagger-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.unknown.chnppdddwFawasss...
frog-tagger-tagger-mbt-:  case-base for unknown word read
frog-tagger-tagger-mbt-:  Sentence delimiter set to '<utt>'
frog-tagger-tagger-mbt-:  Beam size = 1
frog-tagger-tagger-mbt-:  Known Tree, Algorithm = IGTREE
frog-tagger-tagger-mbt-:  Unknown Tree, Algorithm = IB1
frog-tagger-tagger-mbt-:
frog-IOB-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.lex.ambi.05 (78570 words).
frog-IOB-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.top200 (200 words).
frog-IOB-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.known.dddwfWawa...
frog-IOB-tagger-mbt-:  case-base for known words read.
frog-IOB-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.unknown.chnppddwFawasss...
frog-IOB-tagger-mbt-:  case-base for unknown word read
frog-IOB-tagger-mbt-:  Sentence delimiter set to '<utt>'
frog-IOB-tagger-mbt-:  Beam size = 1
frog-IOB-tagger-mbt-:  Known Tree, Algorithm = TRIBL2
frog-IOB-tagger-mbt-:  Unknown Tree, Algorithm = IB1
frog-IOB-tagger-mbt-:
frog-NER-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.lex.ambi.05 (73735 words).
frog-NER-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.top1000 (1000 words).
frog-NER-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.known.ddwdwfWawaa...
frog-NER-tagger-mbt-:  case-base for known words read.
frog-NER-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.unknown.chnppddwdwFawawasss...
frog-NER-tagger-mbt-:  case-base for unknown word read
frog-NER-tagger-mbt-:  Sentence delimiter set to 'EL'
frog-NER-tagger-mbt-:  Beam size = 1
frog-NER-tagger-mbt-:  Known Tree, Algorithm = TRIBL2
frog-NER-tagger-mbt-:  Unknown Tree, Algorithm = TRIBL
frog-NER-tagger-mbt-:
frog-NER-tagger-:READ  /usr/local/Cellar/frog/0.14/share/frog/nld//ners.known
frog-NER-tagger-:loaded 62254 additional per Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//voornamen.ner
frog-NER-tagger-:loaded 124199 additional per Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//familienamen.ner
frog-NER-tagger-:loaded 40 additional eve Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//eve.ner
frog-NER-tagger-:loaded 8 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//loc.ner
frog-NER-tagger-:loaded 20162 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//geonames-be.ner
frog-NER-tagger-:loaded 22625 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//geonames-nl.ner
frog-NER-tagger-:loaded 390 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//gemeenten-nld.ner
frog-NER-tagger-:loaded 13 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//provincies-nld.ner
frog-NER-tagger-:loaded 133 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//regios-nld.ner
frog-NER-tagger-:loaded 33 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//waterschappen-nld.ner
frog-NER-tagger-:loaded 10 additional Named Entities files
frog-mwu-:initiating mwuChunker...
frog-mwu-:read mwus /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mwu.1.0
frog-parser-:initiating parser ...
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.pairs.sampled.ibase
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.dir.ibase
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.rels.ibase
frog-:init Parse took: 3 seconds, 353 milliseconds and 293 microseconds
frog-:Initialization failed for: [tokenizer]
frog-:fatal error: Frog init failed

So once the Ucto issue is resolved, it should work! Note that language has to be nld and not nl, because otherwise it fails.
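A quick way to check this before starting frog might be to look for the tokeniser settings file named in the log (`tokconfig-nld`). This is only a sketch: `has_tokconfig` is a hypothetical helper, and the assumption that the file sits directly in ucto's share directory comes from the uctodata caveat, not from the frog documentation.

```shell
# Assumed data directory, taken from the uctodata caveat in this thread.
UCTO_SHARE="${UCTO_SHARE:-/usr/local/opt/ucto/share/ucto}"

# Hypothetical helper: succeed if a tokeniser config for the language exists.
# $1 = three-letter language code (nld, eng, ...), $2 = data directory
has_tokconfig() {
    [ -e "$2/tokconfig-$1" ]
}

for lang in nld eng; do
    if has_tokconfig "$lang" "$UCTO_SHARE"; then
        echo "$lang: tokconfig found"
    else
        echo "$lang: tokconfig missing (is uctodata installed and linked?)"
    fi
done
```

Because `-e` follows symlinks, a dangling link is reported as missing too.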

evanmiltenburg commented 6 years ago

There is ucto but not uctodata in /usr/local/opt/.

fbkarsdorp commented 6 years ago

And in /usr/local/Cellar/ ?

evanmiltenburg commented 6 years ago

I have /usr/local/Cellar/uctodata/0.5, which contains the following:

AUTHORS         COPYING         ChangeLog       INSTALL_RECEIPT.json    NEWS            README          TODO            lib         share

fbkarsdorp commented 6 years ago

Right. Then try:

ln -s /usr/local/Cellar/uctodata/0.5/share/ucto/* /usr/local/opt/ucto/share/ucto/
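A slightly more defensive variant of that command, as a sketch. When the source directory is missing or empty, bash passes the unexpanded `*` on to `ln`, which then creates a dangling link literally named `*`. That may be what the earlier `realpath_rec` error was about, although nothing in this thread confirms it. `link_data` is a hypothetical helper.

```shell
# Hypothetical helper: symlink every entry of directory $1 into directory $2.
link_data() {
    src=$1
    dst=$2
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "link_data: missing directory: $src or $dst" >&2
        return 1
    fi
    linked=0
    for f in "$src"/*; do
        # With no matches the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        ln -sf "$f" "$dst/" && linked=$((linked + 1))
    done
    if [ "$linked" -eq 0 ]; then
        echo "link_data: nothing to link in $src" >&2
        return 1
    fi
    echo "linked $linked entries into $dst"
}

# Paths as given in the uctodata caveat; only run when both exist.
SRC=/usr/local/Cellar/uctodata/0.5/share/ucto
DST=/usr/local/opt/ucto/share/ucto
if [ -d "$SRC" ] && [ -d "$DST" ]; then
    link_data "$SRC" "$DST"
fi
```

If a stray `*` link already exists in the destination directory, it would need to be removed by hand (quoted, so the shell does not expand it).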

fbkarsdorp commented 6 years ago

I have to look into why uctodata failed to auto-link to opt.

evanmiltenburg commented 6 years ago

Seems to work:

ucto -L eng test.txt
ucto: inputfile = test.txt
ucto: outputfile =
ucto:tokconfig-eng: version=0.2
This is a test . <utt> Would UCTO work ? <utt>

fbkarsdorp commented 6 years ago

Yes. And now frog should work too.

evanmiltenburg commented 6 years ago

It works! Here is the log, just for reference:

Emiel$ frog --language=nld -t test.txt -o frogged.txt
frog 0.14 (c) CLTS, ILK 1998 - 2018
CLST  - Centre for Language and Speech Technology,Radboud University
ILK   - Induction of Linguistic Knowledge Research Group,Tilburg University
based on [ucto 0.12, libfolia 1.12, timbl 6.4.10, ticcutils 0.18, mbt 3.3.1]
frog-:config read from: /usr/local/Cellar/frog/0.14/share/frog/nld/frog.cfg
frog-:configuration version = 0.12
frog-tok-:Language List =[nld]
frog-tok-:Initiating tokeniser...
frog-tok-:tokconfig-nld: version=0.2
frog-mblem:Initiating lemmatizer...
frog-mbma-:Initiating morphological analyzer...
frog-tagger-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.lex.ambi.05 (229170 words).
frog-tagger-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.top500 (500 words).
frog-tagger-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.known.dddwfWawa...
frog-tagger-tagger-mbt-:  case-base for known words read.
frog-tagger-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/Frog.mbt.1.0.unknown.chnppdddwFawasss...
frog-tagger-tagger-mbt-:  case-base for unknown word read
frog-tagger-tagger-mbt-:  Sentence delimiter set to '<utt>'
frog-tagger-tagger-mbt-:  Beam size = 1
frog-tagger-tagger-mbt-:  Known Tree, Algorithm = IGTREE
frog-tagger-tagger-mbt-:  Unknown Tree, Algorithm = IB1
frog-tagger-tagger-mbt-:
frog-IOB-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.lex.ambi.05 (78570 words).
frog-IOB-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.top200 (200 words).
frog-IOB-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.known.dddwfWawa...
frog-IOB-tagger-mbt-:  case-base for known words read.
frog-IOB-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/chunkgen.data.unknown.chnppddwFawasss...
frog-IOB-tagger-mbt-:  case-base for unknown word read
frog-IOB-tagger-mbt-:  Sentence delimiter set to '<utt>'
frog-IOB-tagger-mbt-:  Beam size = 1
frog-IOB-tagger-mbt-:  Known Tree, Algorithm = TRIBL2
frog-IOB-tagger-mbt-:  Unknown Tree, Algorithm = IB1
frog-IOB-tagger-mbt-:
frog-NER-tagger-mbt-:  Reading the lexicon from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.lex.ambi.05 (73735 words).
frog-NER-tagger-mbt-:  Read frequent words list from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.top1000 (1000 words).
frog-NER-tagger-mbt-:  Reading case-base for known words from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.known.ddwdwfWawaa...
frog-NER-tagger-mbt-:  case-base for known words read.
frog-NER-tagger-mbt-:  Reading case-base for unknown words from: /usr/local/Cellar/frog/0.14/share/frog/nld/nergen.data.unknown.chnppddwdwFawawasss...
frog-NER-tagger-mbt-:  case-base for unknown word read
frog-NER-tagger-mbt-:  Sentence delimiter set to 'EL'
frog-NER-tagger-mbt-:  Beam size = 1
frog-NER-tagger-mbt-:  Known Tree, Algorithm = TRIBL2
frog-NER-tagger-mbt-:  Unknown Tree, Algorithm = TRIBL
frog-NER-tagger-mbt-:
frog-NER-tagger-:READ  /usr/local/Cellar/frog/0.14/share/frog/nld//ners.known
frog-NER-tagger-:loaded 62254 additional per Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//voornamen.ner
frog-NER-tagger-:loaded 124199 additional per Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//familienamen.ner
frog-NER-tagger-:loaded 40 additional eve Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//eve.ner
frog-NER-tagger-:loaded 8 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//loc.ner
frog-NER-tagger-:loaded 20162 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//geonames-be.ner
frog-NER-tagger-:loaded 22625 additional loc Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//geonames-nl.ner
frog-NER-tagger-:loaded 390 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//gemeenten-nld.ner
frog-NER-tagger-:loaded 13 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//provincies-nld.ner
frog-NER-tagger-:loaded 133 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//regios-nld.ner
frog-NER-tagger-:loaded 33 additional org Named Entities from: /usr/local/Cellar/frog/0.14/share/frog/nld//waterschappen-nld.ner
frog-NER-tagger-:loaded 10 additional Named Entities files
frog-mwu-:initiating mwuChunker...
frog-mwu-:read mwus /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mwu.1.0
frog-parser-:initiating parser ...
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.pairs.sampled.ibase
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.dir.ibase
frog-parser-:reading /usr/local/Cellar/frog/0.14/share/frog/nld//Frog.mbdp.1.0.rels.ibase
frog-:init Parse took: 3 seconds, 307 milliseconds and 775 microseconds
frog-:Tue Feb 20 12:44:26 2018 Initialization done.
frog-:Tue Feb 20 12:44:26 2018 Frogging test.txt
frog-:Tue Feb 20 12:44:26 2018 process 2 sentences
frog-:Tue Feb 20 12:44:26 2018 done with sentence[1]
frog-:Tue Feb 20 12:44:27 2018 done with sentence[2]
frog-:tokenisation took:  0 seconds, 7 milliseconds and 704 microseconds
frog-:CGN tagging took:   0 seconds, 274 milliseconds and 728 microseconds
frog-:IOB chunking took:  0 seconds, 315 milliseconds and 397 microseconds
frog-:NER took:           0 seconds, 177 milliseconds and 301 microseconds
frog-:MBMA took:          0 seconds, 5 milliseconds and 978 microseconds
frog-:Mblem took:         0 seconds, 0 milliseconds and 860 microseconds
frog-:MWU resolving took: 0 seconds, 0 milliseconds and 79 microseconds
frog-:Parsing (prepare) took: 0 seconds, 0 milliseconds and 84 microseconds
frog-:Parsing (pairs)   took: 0 seconds, 2 milliseconds and 489 microseconds
frog-:Parsing (rels)    took: 0 seconds, 1 milliseconds and 131 microseconds
frog-:Parsing (dir)     took: 0 seconds, 1 milliseconds and 744 microseconds
frog-:Parsing (csi)     took: 0 seconds, 2 milliseconds and 748 microseconds
frog-:Parsing (total)   took: 0 seconds, 8 milliseconds and 355 microseconds
frog-:Frogging in total took: 0 seconds, 785 milliseconds and 334 microseconds
frog-:results stored in frogged.txt
frog-:Tue Feb 20 12:44:27 2018 Frog finished

And the output:

Emiels-MBP:~ Emiel$ cat frogged.txt
1   Dit dit [dit]   VNW(aanw,pron,stan,vol,3o,ev)   0.777085    O   B-NP    2   su
2   is  zijn    [zijn]  WW(pv,tgw,ev)   0.999891    O   B-VP    0   ROOT
3   een een [een]   LID(onbep,stan,agr) 0.999113    O   B-NP    4   det
4   test    test    [test]  N(soort,ev,basis,zijd,stan) 0.903055    O   I-NP    2   predc
5   .   .   [.] LET()   1.000000    O   O   4   punct

1   Zou zullen  [zal]   WW(pv,verl,ev)  0.997499    O   B-VP    0   ROOT
2   Frog    Frog    [Frog]  SPEC(deeleigen) 1.000000    B-PER   B-NP    1   su
3   nu  nu  [nu]    BW()    0.998985    O   B-ADVP  5   mod
4   ook ook [ook]   BW()    0.999979    O   B-ADVP  3   mod
5   werken  werken  [werk][en]  WW(inf,vrij,zonder) 0.968750    O   B-VP    1   vc
6   ?   ?   [?] LET()   1.000000    O   O   5   punct

Thanks for your help! Hopefully this will help others too.

fbkarsdorp commented 6 years ago

I'm going to close this now, since we digressed a bit. I'll collect some of the good points. Thanks a lot for your help.