ontonotes / conll-formatted-ontonotes-5.0

This is a CoNLL formatted version of the OntoNotes 5.0 release.

Using OntoNotes 5.0 to generate coNLL files #2

Open vrian opened 6 years ago

vrian commented 6 years ago

Description

I am currently stuck running step 3 under the "Steps for assembling the data" section of the instructions. As a newbie, I have little understanding of how the skeleton files are converted to CoNLL files. Here is the command specified in the guide:

skeleton2conll.sh -D [/path/to/ontonotes-v5.0-release/data/files/data] [path/to/conll-formatted-ontonotes-5.0]

Result

Here is the command I ran, following the instructions: [image]

Here is the output in case the image won't load:

$ ".\conll-formatted-ontonotes-5.0\scripts\skeleton2conll.sh" -D ".\ontonotes-release-5.0\data\files\data\" ".\conll-formatted-ontonotes-5.0\"
please make sure that you are pointing to the directory 'conll-formatted-ontonotes-5.0'
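One possible contributor, assuming the command was typed exactly as shown: in Bash (including Git Bash), a backslash placed immediately before a closing double quote escapes that quote, so the two path arguments may end up merged into one. The sketch below uses Python's shlex module, which approximates POSIX shell word splitting, to compare the reported spelling with a forward-slash spelling. The forward-slash variant is an assumed alternative, and this is an illustration rather than a confirmed diagnosis of the check the script performs.

```python
import shlex

# Arguments as typed in the report: Windows-style paths ending in \"
backslash_args = r'-D ".\ontonotes-release-5.0\data\files\data\" ".\conll-formatted-ontonotes-5.0\"'

# The same arguments with forward slashes (an assumed alternative spelling)
slash_args = '-D "./ontonotes-release-5.0/data/files/data/" "./conll-formatted-ontonotes-5.0/"'

def split_args(arg_string):
    """Split an argument string roughly the way a POSIX shell would."""
    return shlex.split(arg_string)

backslash_tokens = split_args(backslash_args)
slash_tokens = split_args(slash_args)

# Print how many separate arguments each spelling produces
print(len(backslash_tokens), backslash_tokens)
print(len(slash_tokens), slash_tokens)
```

Under these rules the backslash spelling splits into two tokens (the option and one merged path), while the forward-slash spelling splits into the expected three.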

Data + Script

  - OntoNotes 5.0 from LDC (through email)
  - CoNLL-formatted OntoNotes 5.0 scripts (from the same website)

Steps to reproduce

  1. Download the data and script
  2. Extract the data and script (I placed the 'scripts' folder inside 'conll-formatted-ontonotes-5.0')
  3. Run the command skeleton2conll.sh -D [/path/to/ontonotes-v5.0-release/data/files/data] [path/to/conll-formatted-ontonotes-5.0]

Build/Platform

Windows 10, Git Bash (mingw64), Python 3.6, CPU (no CUDA)

pjox commented 6 years ago

The script skeleton2conll.py is written in Python 2 and will not run with Python 3.6. Try again with Python 2; that worked for me.
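A quick way to see which interpreters are available before retrying is sketched below. The candidate names (python2, python2.7, python3, python) are assumptions; a Python 2 installation may be registered under a different name on your system, especially on Windows.

```python
import shutil
import sys

def find_interpreters(names=("python2", "python2.7", "python3", "python")):
    """Map each candidate interpreter name to its path on PATH, or None."""
    return {name: shutil.which(name) for name in names}

# Report the version running this snippet, then what else is on PATH
print("Running under Python %d.%d" % sys.version_info[:2])
for name, path in find_interpreters().items():
    print(name, "->", path or "not found")
```

If no Python 2 interpreter shows up, the script would need either a separate Python 2 installation or a port to Python 3.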

wxj183 commented 5 years ago

> The script skeleton2conll.py is written in Python 2 and will not run with Python 3.6, try again with Python 2, that worked for me.

I ran it and there was no error, but it did not output any gold_conll files. Can you tell me how you did it? I ran the command:

./conll-formatted-ontonotes-5.0/scripts/skeleton2conll.sh -D /ontonotes-release-5.0/data/files/data/ /conll-formatted-ontonotes-5.0/

JackMaYY commented 5 years ago

> The script skeleton2conll.py is written in Python 2 and will not run with Python 3.6, try again with Python 2, that worked for me.

> I ran it and there was no error, but it did not output any gold_conll files. Can you tell me how you did it? I ran the command ./conll-formatted-ontonotes-5.0/scripts/skeleton2conll.sh -D /ontonotes-release-5.0/data/files/data/ /conll-formatted-ontonotes-5.0/

First, you can also use Python 3. However, you need to modify skeleton2conll.py a little: fix the exception handling (this changes only one line of skeleton2conll.py) and change every "print" statement to a "print()" call.

Second, you should run bash skeleton2conll.sh -D /ontonotes-release-5.0/data/files/data/ /path/to/conll-2012 from the directory that contains the scripts.
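To illustrate the kind of change involved, here is a rough sketch that rewrites the two Python 2 constructs mentioned above. The sample lines are made up for illustration and are NOT copied from skeleton2conll.py, and port_line is a hypothetical helper that only handles simple one-line cases; editing the script by hand is likely safer.

```python
import re

# Python 2 spellings that Python 3 rejects (illustrative lines only,
# not taken from skeleton2conll.py)
PY2_SNIPPET = '''\
try:
    value = int(text)
except ValueError, e:
    print "bad value", e
'''

def port_line(line):
    """Rewrite two Python 2 constructs on a single line (rough sketch)."""
    # except SomeError, e:  ->  except SomeError as e:
    line = re.sub(r'^(\s*except\s+[\w.]+)\s*,\s*(\w+)\s*:', r'\1 as \2:', line)
    # print a, b  ->  print(a, b)   (simple one-line statements only)
    line = re.sub(r'^(\s*)print\s+(?!\()(.+)$', r'\1print(\2)', line)
    return line

# Apply the rewrite line by line and show the Python 3 version
ported = "\n".join(port_line(line) for line in PY2_SNIPPET.splitlines())
print(ported)
```

The sketch does not cover forms such as print >>stream, ... or a print statement with a trailing comma, so any automated rewrite should be reviewed before running the script.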

marc88 commented 5 years ago

[path/to/conll-formatted-ontonotes-5.0] is actually the path to the *_skel files within your corpus. It is not the destination directory.

"[path/to/conll-formatted-ontonotes-5.0]: 
The top-level directory of the package downloaded from this webpage 
inside which the *_skel files exist that need to be converted to *_conll files."

Ref: http://cemantix.org/data/ontonotes.html

Do you have the list of paths where the *_skel files are located?
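One way to produce that list is sketched below. It assumes the skeleton files have names ending in _skel somewhere under the top-level directory, and that each converted file is written next to its skeleton with _skel replaced by _conll; the second point is an assumption, not something confirmed by the package documentation. The directory name in the example is a placeholder to adjust.

```python
from pathlib import Path

def find_skel_files(top_dir):
    """Return all files under top_dir whose name ends in '_skel', sorted."""
    return sorted(p for p in Path(top_dir).rglob("*_skel") if p.is_file())

def missing_conll(skel_paths):
    """Return the skeleton files that have no sibling '*_conll' file yet."""
    missing = []
    for skel in skel_paths:
        # Assumed naming: foo_skel -> foo_conll in the same directory
        conll = skel.with_name(skel.name[: -len("_skel")] + "_conll")
        if not conll.exists():
            missing.append(skel)
    return missing

if __name__ == "__main__":
    top = Path("conll-formatted-ontonotes-5.0")  # placeholder path; adjust
    if top.is_dir():
        skels = find_skel_files(top)
        print(len(skels), "skeleton files found")
        print(len(missing_conll(skels)), "without a matching _conll file")
    else:
        print("directory not found:", top)
```

If the first count is zero, the second argument passed to skeleton2conll.sh is probably not the directory that holds the *_skel files.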

Regards

BloodSource commented 5 years ago

Where do I find this skeleton2conll.py file? Thanks in advance

TJKlein commented 5 years ago

> Where do I find this skeleton2conll.py file? Thanks in advance

I found the scripts here