Open NikantVohra opened 11 years ago
@NikantVohra If you are referring to the creation of transfer rules, I'm afraid some mistakes are inevitable. You'll often find that the current tagsets cannot infer the linguistic information you are looking for. These were some of the problems we faced.
Btw, sorry for being late; I had some unexpected delays on the return journey and then a busy work schedule at the office.
On Sun, 16 Jun 2013 at 23:10 -0700, Abu Zaher wrote:
@NikantVohra If you are referring to the creation of transfer rules, I'm afraid some mistakes are inevitable. You'll often find that the current tagsets cannot infer the linguistic information you are looking for. These were some of the problems we faced:
* The English PoS tagger often gave confusing results; it was really a headache. I don't know whether the tagger has improved since then. Ask @ftyers for more details.
No, the English tagger is still pretty awful. But that doesn't matter so much, as we're going to be doing Hindi->English.
* Regarding this issue, Constraint Grammar would come in pretty handy, I think. @ftyers already added these things to the project dependency chain, so you should have them at your disposal.
Yes, we should be using CG to disambiguate the Hindi. There are already a few rules. These can be expanded on.
* Some linguistic information was hard to re-create when doing Bengali to English, e.g. Bengali pronouns don't have gender, but English ones do. I don't think that would be a problem in your case, as it's easy to infer gender from the verbs in Hindi.
For Hindi -> English the idea would be to set the gender as "to be determined".
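The "to be determined" idea could be sketched roughly like this. It is only a hypothetical Python illustration, not actual Apertium transfer code: the tag name `GD`, the tag lists, and both function names are assumptions made up for the example.

```python
# Hypothetical sketch: none of these names come from the Apertium codebase.
GENDER_TBD = "GD"  # assumed placeholder tag meaning "gender to be determined"
KNOWN_GENDERS = ("m", "f", "nt")


def transfer_pronoun(lemma, source_tags):
    """Carry a pronoun across; mark gender as undetermined if the source has none."""
    gender = next((t for t in source_tags if t in KNOWN_GENDERS), GENDER_TBD)
    return {"lemma": lemma, "tags": ["prn", gender]}


def resolve_gender(unit, context_gender=None, default="m"):
    """Later stage: replace the placeholder using context, else fall back to a default."""
    chosen = context_gender or default
    tags = [chosen if t == GENDER_TBD else t for t in unit["tags"]]
    return {**unit, "tags": tags}


# A source pronoun without gender keeps the placeholder until context is known.
unit = transfer_pronoun("prpers", ["p3", "sg"])
resolved = resolve_gender(unit, context_gender="f")
```

The point of splitting it in two steps is that the first stage never has to guess: it only records that the gender is unknown, and a later stage (for example one that has seen the verb agreement) fills it in.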
Fran
I am working on translation rules for the story. I just want to know how I should proceed so that I don't repeat the mistakes you might have made when working on Bengali-English. Also, can I make use of the Bengali-English rules for the same?