ghpaetzold / questplusplus

Pipelined quality estimation.

Example GIZA Lex file #41

Open shamilcm opened 6 years ago

shamilcm commented 6 years ago

The third column of the example GIZA lex file provided (lang_resources/giza/lex.e2s) has the probability of the English word given the aligned Spanish word (i.e. p(en|es)). Am I correct? Is this the right GIZA file to use with Quest++ for quality estimation of an en-es MT system?
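One way to check the direction of the third column empirically is to sum it grouped by each word column: for a conditional distribution p(a|b), the values should sum to roughly 1 for each fixed b. The sketch below assumes each line holds three whitespace-separated fields (two words and a probability) and that the table has not been pruned; the function name `column_sums` and the toy rows are hypothetical and are not taken from the real lex.e2s.

```python
from collections import defaultdict


def column_sums(lines):
    """Sum the third column grouped by the first and by the second word column."""
    by_first = defaultdict(float)
    by_second = defaultdict(float)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        first, second, prob = parts[0], parts[1], float(parts[2])
        by_first[first] += prob
        by_second[second] += prob
    return dict(by_first), dict(by_second)


# Toy rows for illustration only (assumed layout: word, word, probability).
toy = [
    "house casa 0.75",
    "home casa 0.25",
    "house hogar 0.5",
    "home hogar 0.5",
]
by_first, by_second = column_sums(toy)
print(by_first)   # sums per first-column word
print(by_second)  # sums per second-column word
```

In this toy table the sums per second-column word are 1.0, so the third column behaves like p(first word | second word). Running the same check on the real file, for example with `column_sums(open(path, encoding="utf-8"))`, should show which column is the conditioning side.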

ghpaetzold commented 6 years ago

Hi Shamil, that is correct.

Regards,


Gustavo Henrique Paetzold, Research Associate in Text Adaptation, University of Sheffield


From: Shamil Chollampatt notifications@github.com Sent: Tuesday, 23 January 2018 09:46:55 To: ghpaetzold/questplusplus Cc: Subscribed Subject: [ghpaetzold/questplusplus] Example GIZA Lex file (#41)


shamilcm commented 6 years ago

Thanks. Just an additional clarification:

In the list of features given here (as well as previous shared tasks on QE): https://www.quest.dcs.shef.ac.uk/quest_files/features_blackbox_baseline_17

For features 7 and 8, which use the GIZA++ probabilities, shouldn't the probability thresholds be on p(s|t) and not p(t|s)? Since the target language is Spanish and the source language is English, the GIZA++ lex file only has p(en|es) and not p(es|en).
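To make the question concrete, here is a minimal sketch of a thresholded "average number of translations per source word" count computed from a lex table. It is not the Quest++ implementation: the function names `load_lex` and `avg_translations`, the column order (source word first, target word second), the threshold of 0.2, and the toy rows are all assumptions for illustration. Note that the code thresholds whatever probability sits in the third column, so with a file that stores p(en|es) the thresholded quantity is p(s|t) for an en-es system.

```python
from collections import defaultdict


def load_lex(lines, src_col=0, tgt_col=1):
    """Parse 'word word prob' lines into {source_word: {target_word: prob}}."""
    table = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        table[parts[src_col]][parts[tgt_col]] = float(parts[2])
    return table


def avg_translations(tokens, table, threshold):
    """Average, over source tokens, of the number of entries above the threshold."""
    if not tokens:
        return 0.0
    counts = [
        sum(1 for prob in table.get(tok, {}).values() if prob > threshold)
        for tok in tokens
    ]
    return sum(counts) / len(tokens)


# Toy rows for illustration only (assumed layout: English word, Spanish word, probability).
toy = [
    "house casa 0.75",
    "home casa 0.25",
    "house hogar 0.5",
    "home hogar 0.5",
    "the la 0.1",
]
table = load_lex(toy)
score = avg_translations(["the", "house"], table, threshold=0.2)
print(score)
```

If the real file lists the words in the opposite order, swapping `src_col` and `tgt_col` regroups the table without changing which probability is being thresholded.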

ghpaetzold commented 6 years ago

If I'm not totally misremembering (sorry, it has been a long time since I used GIZA++), the probabilities in GIZA files are already p(t|s) :)




From: Shamil Chollampatt notifications@github.com Sent: Tuesday, 23 January 2018 10:14:50 To: ghpaetzold/questplusplus Cc: Gustavo Henrique Paetzold; Comment Subject: Re: [ghpaetzold/questplusplus] Example GIZA Lex file (#41)


Shireen35 commented 3 years ago

Can you please tell me how to make a GIZA lex file?

lspecia commented 3 years ago

Hello, unfortunately we are not maintaining this code anymore.