Closed GoogleCodeExporter closed 9 years ago
I'll handle this one.
Original comment by l...@dcs.shef.ac.uk
on 2 Aug 2011 at 8:11
I'm working on step-by-step instructions for Cygwin installation, which should
be available shortly.
Original comment by l...@dcs.shef.ac.uk
on 2 Aug 2011 at 8:29
Hi,
Instructions are below. Sorry if they are complex - I've tested them twice on a
Windows 7 installation of cygwin 1.7 and they are working as far as I can see -
please add to this issue if you have any problems or feedback!
Here are installation instructions for CAVaT under Cygwin v1.7. We assume
a default installation.
To begin the process, you will need to add the following packages to your
Cygwin, using the setup.exe utility (e.g. http://cygwin.com/setup.exe):
In Web:
- wget
In Python:
- python
- python-numpy
These may have some dependencies; install those too. The Cygwin downloader
will complete the installation of these.
We will need Python "setuptools" in order to complete the installation, which
is not provided as a Cygwin package. In order to install it, first download
the egg package for your version of Python from:
http://pypi.python.org/pypi/setuptools#files
E.g., for Python 2.7, choose setuptools-0.6c11-py2.7.egg. Save this file into
your Cygwin directory tree, and then execute it:
wget http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg
sh setuptools-0.6c11-py2.7.egg
This will install Python's own package management system.
Next, install pyparsing. The package installation documentation can be found
at http://pyparsing.wikispaces.com/Download+and+Installation. We'll install it
using setuptools, using the following command:
easy_install pyparsing
We'll also need to install the Python YAML tools, using setuptools:
easy_install pyyaml
Finally, we need to install NLTK. Setuptools doesn't currently support nltk,
and there is no Cygwin package, so a brief manual build is required. Visit
http://code.google.com/p/nltk/downloads/list to see the current list of
downloads; a known-good version is 2.0b8, so you can save the following file
into your Cygwin installation, and unpack it:
wget http://nltk.googlecode.com/files/nltk-2.0b8.tar.gz
tar zxf nltk-2.0b8.tar.gz
Then, run "python setup.py install" inside the NLTK folder:
cd nltk-2.0b8
python setup.py install
After a small amount of copying, the installation will complete, and you can
now change directory to where you have unpacked CAVaT and execute it:
cd ~/cavat
./cavat.py
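Before launching CAVaT, it can be worth confirming that the packages installed in the steps above are importable. The following is only a minimal sketch: the list of module names is my reading of the steps in this comment (with "yaml" being the module that pyyaml provides), and the helper name missing_modules is made up for illustration.

```python
from __future__ import print_function

# Import names for the packages installed in the steps above.
# Assumption: 'yaml' is the module name provided by the pyyaml package.
REQUIRED = ['numpy', 'pyparsing', 'yaml', 'nltk']


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    missing = missing_modules(REQUIRED)
    if missing:
        print('Missing: ' + ', '.join(missing))
    else:
        print('All required modules import cleanly.')
```

If anything is reported as missing, re-run the corresponding installation step. Note that this only checks that the modules import; it does not check for NLTK data files.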
Original comment by l...@dcs.shef.ac.uk
on 2 Aug 2011 at 8:52
By the way, Cygwin 1.7 comes with Python 2.6.5, so the instructions work better
if you use this for setuptools:
wget http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.6.egg
sh setuptools-0.6c11-py2.6.egg
Original comment by l...@dcs.shef.ac.uk
on 2 Aug 2011 at 9:07
sorry -
wget http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c11-py2.6.egg
sh setuptools-0.6c11-py2.6.egg
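To avoid mismatching the Python version and the egg name, the download URL can be derived from the running interpreter. This is a rough sketch only: the URL pattern is inferred from the two URLs given in this thread (Python 2.6 and 2.7), the function name setuptools_egg_url is hypothetical, and an egg may not exist for every Python version.

```python
from __future__ import print_function
import sys

SETUPTOOLS_VERSION = '0.6c11'  # the setuptools version used in this thread


def setuptools_egg_url(major, minor):
    """Build the egg URL for a Python version, following the pattern
    of the URLs quoted in this thread (an inferred pattern)."""
    ver = '%d.%d' % (major, minor)
    return ('http://pypi.python.org/packages/%s/s/setuptools/'
            'setuptools-%s-py%s.egg' % (ver, SETUPTOOLS_VERSION, ver))


if __name__ == '__main__':
    # Print the URL matching the interpreter that runs this script.
    print(setuptools_egg_url(*sys.version_info[:2]))
```

Running this with the Cygwin Python prints the URL to pass to wget.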
Original comment by l...@dcs.shef.ac.uk
on 2 Aug 2011 at 9:42
Hi again
This is still a build issue, so I am adding it to the previous pyparsing
issue.
I installed and built the missing portions the way your instructions described.
I then wanted to import a one-file .tml corpus to test cavat out. I seem
not to have all the resources. Specifically, I don't seem to have
"tokenizers/punkt/english.pickle"
I do have nltk-2.0b8, and running "python setup.py install" built lots of
subdirectories into it -- but none called "tokenizers" (though there is a
"tokenizer" directory)
Here is what I get when trying to import
(1st I did this)
/cygdrive/c/Program Files/cavat
$ python cavat.py
# CAVaT Corpus Analysis and Validation for TimeML
# Version: 0.22 Support: leon@dcs.shef.ac.uk
(2nd I did this)
cavat> corpus import TinyCorpus/ to test
Traceback (most recent call last):
  File "cavat.py", line 482, in <module>
    import importTimeML
  File "/cygdrive/c/Program Files/cavat/importTimeML.py", line 32, in <module>
    class ImportTimeML:
  File "/cygdrive/c/Program Files/cavat/importTimeML.py", line 54, in ImportTimeML
    sentenceDetector = nltk.data.load('tokenizers/punkt/english.pickle')
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 590, in load
    resource_val = pickle.load(_open(resource_url))
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 669, in _open
    return find(path).open()
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 451, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource 'tokenizers/punkt/english.pickle' not found. Please
use the NLTK Downloader to obtain the resource: >>>
nltk.download().
Searched in:
- '/home/ruth/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
/cygdrive/c/Program Files/cavat
I looked at the nltk documentation, and it sure seems to me that I have done
the right things already. I tried putting the downloaded nltk directory in
/usr/share/ and in /usr/local/lib and finally tried putting it very high up
-- above the lib/ directory where Cygwin puts Python2.6. I then tried
downloading nltk-2.0b9 (instead of nltk-2.0b8) and fared no better.
I am not sure what else to try. Is there something obvious I am missing?
Thanks in advance
Ruth
Original comment by ruth.m.r...@gmail.com
on 11 Aug 2011 at 11:43
Hi,
Sorry for the delay. This issue occurs because the default NLTK installation
doesn't include the Punkt sentence tokenizer data that CAVaT loads when it
imports a corpus. You can install it from the command prompt with:
python -m nltk.downloader punkt
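After the download, the file should sit under one of the nltk_data directories that NLTK listed under "Searched in:" in the traceback. NLTK looks for a separate nltk_data directory there, not for the NLTK source tree, which may be why moving the nltk-2.0b8 folder around had no effect. The sketch below checks those directories without using NLTK itself. It makes some assumptions: the directory list is copied from the traceback with the home directory generalised, the helper name find_resource is hypothetical, and it ignores zipped data packages and any NLTK_DATA environment variable.

```python
from __future__ import print_function
import os

# Directories from the "Searched in:" list in the traceback; the first
# entry is generalised to the current user's home directory.
SEARCH_DIRS = [
    os.path.expanduser('~/nltk_data'),
    '/usr/share/nltk_data',
    '/usr/local/share/nltk_data',
    '/usr/lib/nltk_data',
    '/usr/local/lib/nltk_data',
]


def find_resource(resource, search_dirs=SEARCH_DIRS):
    """Return the first existing path for `resource`, or None."""
    for base in search_dirs:
        candidate = os.path.join(base, *resource.split('/'))
        if os.path.exists(candidate):
            return candidate
    return None


if __name__ == '__main__':
    path = find_resource('tokenizers/punkt/english.pickle')
    print(path if path else 'punkt data not found; run the downloader')
```

If this reports the data as not found after running the downloader, check where the downloader actually wrote its files.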
Hope this helps; please mention if there are any other missing resources - we
can integrate them into a smoother Cygwin installation process for the future.
Original comment by l...@dcs.shef.ac.uk
on 16 Aug 2011 at 11:51
Original comment by leonderczynski
on 13 Oct 2011 at 12:29
Original issue reported on code.google.com by
ruth.m.r...@gmail.com
on 2 Aug 2011 at 8:04