paramitamirza / CATENA


Query regarding input to CATENA #1

Open svjan5 opened 6 years ago

svjan5 commented 6 years ago

Hi, I am having trouble converting my raw documents into the input format you have specified.

Some specific queries regarding the input format:

| token | token-id | sentence-id | lemma | event-id | event-class | event-tense+aspect+polarity | timex-id | timex-type | timex-value | signal-id | causal-signal-id | pos-tag | chunk | lemma | pos-tag | dependencies | main-verb |

  1. Why does the format ask for the same information twice (lemma and pos-tag each appear in two columns)?
  2. The meaning of chunk is not clear (its description is missing from the wiki).
  3. It is unclear which attributes are optional in the input.
  4. Is there a standard library for converting raw documents into the required format?
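
To make the first query concrete, here is a minimal Python sketch that splits the column list quoted above and reports which column names occur more than once, along with their zero-based positions. It only inspects the header text as written in this issue; the helper names (`column_names`, `repeated_columns`) are my own, and it assumes nothing about how CATENA itself reads its input.

```python
from collections import Counter

# Column list copied from the input-format description quoted in this issue.
HEADER = (
    "| token | token-id | sentence-id | lemma | event-id | event-class "
    "| event-tense+aspect+polarity | timex-id | timex-type | timex-value "
    "| signal-id | causal-signal-id | pos-tag | chunk | lemma | pos-tag "
    "| dependencies | main-verb |"
)


def column_names(header):
    """Split a pipe-separated header into a list of trimmed column names."""
    return [name.strip() for name in header.strip("| ").split("|")]


def repeated_columns(names):
    """Map each column name that occurs more than once to its positions."""
    counts = Counter(names)
    return {
        name: [i for i, n in enumerate(names) if n == name]
        for name, count in counts.items()
        if count > 1
    }


COLUMNS = column_names(HEADER)
REPEATED = repeated_columns(COLUMNS)

print(len(COLUMNS))  # number of columns in the quoted header
print(REPEATED)      # repeated names with their zero-based positions
```

With the header as quoted, this counts 18 columns and flags lemma (positions 3 and 14) and pos-tag (positions 12 and 15) as the repeated ones.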

I would be grateful for your help with these queries.

Thanks in advance,
Shikhar