OpenNMT / papers


unable to generate the vmap #2

Closed rkoystart closed 3 years ago

rkoystart commented 3 years ago

I have successfully generated phrase-table.gz using the command:

sudo docker run -v $(pwd):/root/corpus build-pt train en pt 3 > phrase-table.gz

Now, with the generated phrase-table.gz, I am trying to generate the vmap with the following command:

python build-vmap.py -pt phrase-table.gz -ms 3 -mf 2 -km 20 -tv target_vocabulary -zg zg_list > vmap

I am getting the following error:

Traceback (most recent call last):
  File "build-vmap.py", line 46, in <module>
    entries = line.split(" ||| ")
TypeError: a bytes-like object is required, not 'str'

When I print the first few lines of phrase-table.gz, I get:

! ! ! ||| ! ! ! ||| 0.493828 0.817676 0.831885 0.738603 ||| 0-0 1-1 2-2 ||| 49420 29337 24405 ||| |||
! ! ! ||| ! ! ... ||| 0.00303306 0.409664 0.000340866 0.00243863 ||| 0-0 1-1 2-1 2-2 ||| 3297 29337 10 ||| |||
! ! ! ||| ! ! A ||| 0.0269231 0.817676 0.000238607 0.00054239 ||| 0-0 1-1 2-1 ||| 260 29337 7 ||| |||
! ! ! ||| ! ! Est ||| 0.333333 0.817676 3.40866e-05 2.98241e-05 ||| 0-0 1-1 2-1 ||| 3 29337 1 ||| |||
! ! ! ||| ! ! V ||| 0.00662252 0.408853 3.40866e-05 1.56883e-05 ||| 0-0 1-1 2-1 2-2 ||| 151 29337 1 ||| |||
! ! ! ||| ! ! ||| 0.0108069 0.817676 0.0318369 0.817098 ||| 0-0 1-0 2-1 ||| 86426 29337 934 ||| |||
! ! ! ||| ! ! _ ||| 0.000645578 0.409295 3.40866e-05 0.000720027 ||| 0-0 1-1 2-1 2-2 ||| 1549 29337 1 ||| |||
! ! ! ||| ! ! _Mais ||| 0.219512 0.410053 0.0015339 0.000926589 ||| 0-0 1-1 2-1 2-2 ||| 205 29337 45 ||| |||
! ! ! ||| ! ... ||| 0.000822614 0.409664 0.000374953 0.00269779 ||| 0-0 1-0 2-0 2-1 ||| 13372 29337 11 ||| |||
! ! ! ||| ! A do ||| 0.0107527 0.817676 0.00010226 1.71069e-07 ||| 0-0 1-0 2-0 ||| 279 29337 3 ||| |||

What am I missing, or what mistake have I made?

guillaumekln commented 3 years ago

I think the script still requires Python 2 at this time.

rkoystart commented 3 years ago

OK, I will try it with Python 2.

rkoystart commented 3 years ago

Running the script with Python 2 worked. Thanks @guillaumekln
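
For anyone who has to stay on Python 3, here is a minimal sketch of why the traceback appears and one way around it. It assumes (this is not confirmed from build-vmap.py itself) that the script reads the gzipped phrase table in binary mode, so each line is a `bytes` object and splitting it with the `str` separator `" ||| "` raises the `TypeError`. In Python 2, `str` is a byte string, which is why the same code runs there. The helper name `read_phrase_table` and the temporary sample file are hypothetical and only used for illustration.

```python
import gzip
import os
import tempfile

SEP = " ||| "  # field separator seen in the phrase-table sample above


def read_phrase_table(path):
    """Yield the fields of each phrase-table line as a list of str.

    Hypothetical helper, not part of build-vmap.py.
    """
    # "rt" makes gzip decode each line to str; the default "rb" yields bytes.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n").split(SEP)


# Tiny stand-in file containing one line copied from the sample above.
sample = (
    "! ! ! ||| ! ! ! ||| 0.493828 0.817676 0.831885 0.738603 "
    "||| 0-0 1-1 2-2 ||| 49420 29337 24405 ||| |||\n"
)
tmp_path = os.path.join(tempfile.mkdtemp(), "sample-phrase-table.gz")
with gzip.open(tmp_path, "wt", encoding="utf-8") as f:
    f.write(sample)

# Assumption: reading in binary mode is what triggers the reported error.
with gzip.open(tmp_path, "rb") as f:
    raw_line = f.readline()
try:
    raw_line.split(SEP)  # bytes.split() with a str separator
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Text mode gives str lines, so the split works under Python 3.
entries = next(read_phrase_table(tmp_path))
print(error_message)
print(entries[0], "->", entries[1])
```

An alternative with the same effect is to keep binary mode and split on a bytes separator (`b" ||| "`), but then every later string operation in the script would also have to work on bytes, so decoding once at read time is usually the smaller change.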