nusnlp / crosentgec

Code for cross-sentence grammatical error correction using multilayer convolutional seq2seq models (ACL 2019)
GNU General Public License v3.0

Environment setup for trained model #1

Closed harrydeng8 closed 5 years ago

shamilcm commented 5 years ago

The README file details the instructions to download and run the models. One 1080Ti should be enough to run the pre-trained models.

harrydeng8 commented 5 years ago

I was able to download the model but had trouble preparing the dataset.

Are there any special considerations for installing Python 2.7, NLTK v2.0b7, and LangID.py v1.1.6?

Best,

Harry

shamilcm commented 5 years ago

NLTK 2.0 is required to get the exact same tokenizer. Newer NLTK versions may tokenize differently, which may result in different scores. LangID is required to identify English essays and sentences from Lang-8.
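To illustrate the language-identification step mentioned above, here is a minimal sketch of filtering sentences down to English ones. It assumes `langid.classify(text)` returns a `(language, score)` pair, as langid.py documents; the helper name `keep_english` and the injectable `classify` argument are made up for illustration and are not taken from the repository's preprocessing scripts.

```python
def keep_english(sentences, classify=None):
    """Return only the sentences whose predicted language code is 'en'.

    `classify` is any callable returning a (language, score) pair.
    When omitted, langid.classify is used (assumes langid is installed).
    """
    if classify is None:
        import langid  # imported lazily so the helper can be tested without it
        classify = langid.classify
    return [s for s in sentences if classify(s)[0] == "en"]
```

Calling `keep_english(list_of_sentences)` with langid installed would use the real classifier; the actual thresholds and essay-level logic used for Lang-8 are not shown in this thread.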

harrydeng8 commented 5 years ago

We were able to install Python 2.7 with the commands below.

Could we use these commands to install NLTK 2.0 under Python 2.7?

If not, how can we make sure that NLTK 2.0 is what gets installed?

Thanks!

Harry

sudo apt-get update
sudo apt-get install build-essential checkinstall
sudo apt-get install libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev
cd /usr/src
sudo wget https://www.python.org/ftp/python/2.7.16/Python-2.7.16.tgz
sudo tar xzf Python-2.7.16.tgz
cd Python-2.7.16
sudo ./configure --enable-optimizations
sudo make altinstall
python2.7 -V
sudo apt-get install python-pip
sudo pip install -U nltk

  1. import nltk
  2. nltk.download()


harrydeng8 commented 5 years ago

I am sorry to trouble you, but we ran into problems installing NLTK 2.0b7; see below. How did you install NLTK 2.0b7 on Ubuntu 18.04?

Running setup.py (path:/tmp/pip-build-HX6gJc/nltk/setup.py) egg_info for package nltk
Running command python setup.py egg_info
Downloading http://pypi.python.org/packages/source/d/distribute/distribute-0.6.21.tar.gz
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-HX6gJc/nltk/setup.py", line 23, in <module>
    distribute_setup.use_setuptools()
  File "distribute_setup.py", line 145, in use_setuptools
    return _do_download(version, download_base, to_dir, download_delay)
  File "distribute_setup.py", line 124, in _do_download
    to_dir, download_delay)
  File "distribute_setup.py", line 193, in download_setuptools
    src = urlopen(url)
  File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 435, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 548, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 473, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: SSL is required
Cleaning up...


shamilcm commented 5 years ago

The data preparation was done in an old environment and followed previous work. Installing NLTK 2.0 can be challenging now, as several of its dependencies are no longer available via pip. Could you try the following commands and see whether they work for you:

pip install setuptools==9.1
pip install https://pyyaml.org/download/pyyaml/PyYAML-3.08.tar.gz
pip install nltk==2.0b7
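
After running commands like these, it can help to confirm which NLTK version the interpreter actually imports, since an unpinned `pip install -U nltk` would fetch the newest release rather than 2.0b7. The following is a minimal sketch, assuming the installed package exposes a `__version__` attribute; the helper names are made up for illustration.

```python
from __future__ import print_function


def installed_nltk_version():
    """Return the imported NLTK version string, or None if unavailable."""
    try:
        import nltk
    except ImportError:
        return None
    # Assumption: the package exposes __version__; fall back to None otherwise.
    return getattr(nltk, "__version__", None)


def matches_pinned(installed, pinned="2.0b7"):
    """True only when the installed version equals the pinned one exactly."""
    return installed == pinned


if __name__ == "__main__":
    version = installed_nltk_version()
    print("NLTK version:", version, "| matches 2.0b7:", matches_pinned(version))
```

Running it with the same interpreter that the data preparation uses (for example `python2.7`) shows whether the pinned tokenizer version is the one in effect.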

harrydeng8 commented 5 years ago

Thank you so much. I was able to install nltk 2.0b7 as the root user.

I ran into a problem downloading the NLTK data from python2. Could this be caused by my using the root account to install nltk==2.0b7?

harrydeng8 commented 5 years ago

I am having trouble installing langid; see below. Any idea why? Thanks!

Collecting langid
Downloading https://files.pythonhosted.org/packages/ea/4c/0fb7d900d3b0b9c8703be316fbddffecdab23c64e1b46c7a83561d78bd43/langid-1.1.6.tar.gz (1.9MB)
100% |████████████████████████████████| 1.9MB 110kB/s
Collecting numpy (from langid)
Downloading https://files.pythonhosted.org/packages/da/32/1b8f2bb5fb50e4db68543eb85ce37b9fa6660cd05b58bddfafafa7ed62da/numpy-1.17.0.zip (6.5MB)
100% |████████████████████████████████| 6.5MB 58kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-uot2cb/numpy/setup.py", line 31, in <module>
    raise RuntimeError("Python version >= 3.5 required.")
RuntimeError: Python version >= 3.5 required.
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-uot2cb/numpy/
You are using pip version 8.1.1, however version 19.2.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
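
The log above shows pip 8.1.1 selecting numpy 1.17.0, which refuses to build on Python 2.7. NumPy 1.16.x is the last series that supports Python 2.7, so one likely workaround (not confirmed anywhere in this thread) is to install a pinned numpy before langid. A small sketch of choosing the requirement string by interpreter version follows; the function name is made up for illustration.

```python
import sys


def numpy_requirement(version_info=None):
    """Return a pip requirement string for numpy suited to the interpreter.

    numpy 1.17 and later require Python >= 3.5, so older interpreters
    (including 2.7) need a release below 1.17.
    """
    if version_info is None:
        version_info = sys.version_info
    if tuple(version_info[:2]) < (3, 5):
        return "numpy<1.17"
    return "numpy"
```

Under Python 2.7 this yields `numpy<1.17`, which could be passed to `pip install` ahead of `pip install langid==1.1.6`. Upgrading pip itself, as the log suggests, may also help, because pip 9 and later take a package's declared Python requirement into account when picking a release.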


harrydeng8 commented 5 years ago

I am having to download the NLTK data manually.

Which packages are required to run your trained model?

Thanks,

Harry

harrydeng8 commented 5 years ago

Hi, Shamil,

I have nltk_data installed now. What process would you suggest for running the model on our own test sentences, especially when only a single sentence is involved?

Thanks,

Harry

harrydeng8 commented 5 years ago

Hi, Shamil,

I ran download_pretrained_crosent.sh (https://github.com/nusnlp/crosentgec/blob/master/download_pretrained_crosent.sh) but could not find a dict directory created by it.

What should we specify as the dictionary location when running the trained model?

Thank you so much!

Harry

shamilcm commented 5 years ago

The dictionaries are downloaded by the download.sh script. Specify the path to the directory that contains the dictionaries.
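
Before passing a dictionary directory to the decoding script, a quick check that the expected files are present can save a failed run. The sketch below assumes fairseq's usual `dict.<lang>.txt` naming together with the `src` and `trg` language codes seen in this thread; the exact file names produced by download.sh are not confirmed here, and the function name is made up for illustration.

```python
import os


def missing_dictionaries(dict_dir, langs=("src", "trg")):
    """List the expected dictionary files that are absent from dict_dir.

    Assumption: files are named dict.<lang>.txt, as in stock fairseq.
    """
    expected = ["dict.{}.txt".format(lang) for lang in langs]
    return [name for name in expected
            if not os.path.isfile(os.path.join(dict_dir, name))]
```

An empty list from `missing_dictionaries("models/dicts")` would suggest that the directory is a reasonable value for the dictionary path argument.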

harrydeng8 commented 5 years ago

I found that decode.sh uses both python2.7 and python3; see below. Please confirm.


CUDA_VISIBLE_DEVICES=$DEVICE python $FAIRSEQPY/interactive_multi.py --no-progress-bar --path $models --beam $beam --nbest $beam --replace-unk --source-lang src --target-lang trg --input-files $TMP_DIR/input.src $TMP_DIR/input.ctx --num-shards $threads --task translation_ctx $DATA_DIR > $OUTPUT

cat $OUTPUT | grep "^H" | python3 -c "import sys; x = sys.stdin.readlines(); x = ' '.join([ x[i] for i in range(len(x)) if i%$nbest == 0 ]); print(x)" | cut -f3 | sed 's|@@ ||g' | sed '$ d' > $OUTPUT.txt

echo "END TIME:`date`" | tee -a $LOGFILE

shamilcm commented 5 years ago

The scripts assume that the default Python environment, the one invoked by the command python, is python3. So both of the lines above actually run python3.
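
For readers puzzling over the post-processing line quoted above, here is a rough Python 3 equivalent of the `grep | python3 -c | cut | sed` chain: keep the hypothesis lines, take the first of every `nbest` of them, extract the third tab-separated field, and undo the `@@ ` subword markers. It assumes hypothesis lines of the form `H-<id><TAB><score><TAB><text>`; the function name is made up for illustration.

```python
def best_hypotheses(lines, nbest):
    """Return the top hypothesis text for each input sentence.

    lines: decoder output lines; hypothesis lines start with 'H'.
    nbest: number of hypotheses printed per input sentence.
    """
    hyps = [line for line in lines if line.startswith("H")]
    results = []
    for line in hyps[::nbest]:  # first (best) hypothesis of each group
        fields = line.rstrip("\n").split("\t")
        text = fields[2] if len(fields) > 2 else ""
        results.append(text.replace("@@ ", ""))  # join subword pieces
    return results
```

Given the contents of the decoder output file as a list of lines, the function returns one corrected sentence per input sentence.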

harrydeng8 commented 5 years ago

Do we need to install NLTK for Python 3.6 for decode.sh to work?

Which version of NLTK can we install with Python 3.6?

Thanks!

harrydeng8 commented 5 years ago

I am getting some errors after running: ./decode.sh conll13st-test models/crosent/model1 models/dicts 1

**includes*****

Traceback (most recent call last):
  File "fairseq/interactive_multi.py", line 195, in <module>
    main(args)
  File "fairseq/interactive_multi.py", line 102, in main
    models, model_args = utils.load_ensemble_for_inference(model_paths, task)
  File "/home/hdeng/nsu/fairseq/fairseq/utils.py", line 163, in load_ensemble_for_inference
    model = task.build_model(state['args'])
  File "/home/hdeng/nsu/fairseq/fairseq/tasks/fairseq_task.py", line 43, in build_model
    return models.build_model(args, self)
  File "/home/hdeng/nsu/fairseq/fairseq/models/__init__.py", line 25, in build_model
    return ARCH_MODEL_REGISTRY[args.arch].build_model(args, task)
  File "/home/hdeng/nsu/fairseq/fairseq/models/fconv_dualenc_gec_gatedaux.py", line 76, in build_model
    encoder_embed_dict = utils.parse_embedding(args.encoder_embed_path)
  File "/home/hdeng/nsu/fairseq/fairseq/utils.py", line 267, in parse_embedding
    embed_dict[pieces[0]] = torch.Tensor([float(weight) for weight in pieces[1:]])
  File "/home/hdeng/nsu/fairseq/fairseq/utils.py", line 267, in <listcomp>
    embed_dict[pieces[0]] = torch.Tensor([float(weight) for weight in pieces[1:]])
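
The traceback is cut off before the exception message, so the actual cause is not visible here. The last frame converts the space-separated fields of an embedding file to floats, so one possibility worth ruling out is a line in that file that is not of the form `word v1 v2 ...`. The sketch below scans a text embedding file for such lines; the function name, the `has_header` flag, and the assumption that the first line is a header are illustrative rather than taken from the repository.

```python
import io


def find_bad_embedding_lines(path, has_header=True, limit=5):
    """Return up to `limit` (line_number, text) pairs that fail float parsing.

    Each data line is expected to look like: word v1 v2 ... vN
    Assumption: the first line is a header and is skipped.
    """
    bad = []
    with io.open(path, encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            if has_header and lineno == 1:
                continue
            pieces = line.rstrip().split(" ")
            try:
                [float(weight) for weight in pieces[1:]]
            except ValueError:
                bad.append((lineno, line.rstrip()[:80]))
                if len(bad) >= limit:
                    break
    return bad
```

Running it on the file that the model's `encoder_embed_path` setting points to would show whether any line breaks the expected format. An empty result would point toward a different cause.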

pidugusundeep commented 4 years ago

I would like to know how to create a new dictionary, and whether I need to retrain the models after doing so. Please explain. When I run the code, the results seem a bit off from what I expect, even with simple words.