feature request: "printing" format that shows full comments/tags

ghost commented 10 years ago

Hello, I work with learner corpora (texts written by second language learners) and I want to manually annotate grammatical errors in a small corpus. Manual corpus annotation is time-consuming, as everybody knows.

I am also a language teacher, and as part of my job, I usually correct my students' essays using Microsoft Word: I highlight the wrong word or phrase and write a comment with the correction. This could be considered a kind of "corpus annotation" task, since I am actually writing comments on fragments of text. BUT despite a lot of work, the result is not a corpus (I think there is no way to search Microsoft Word's comments and get concordance results. If somebody knows a way, please let me know).

I was thinking that many language teachers spend a lot of time correcting students' texts and it could be useful if we could take advantage of this valuable time to create a corpus at the same time (getting the students' permission, and so on).

I think that Brat could be used this way, but an additional feature would be needed:

Is there a way to display the annotation tags in full? Currently we can see a colored tag on the word, but we cannot see the "Comments" unless we move the mouse over the word. I guess the reason is that showing them would make the space between the lines wider.

Could there be a feature (it could be only for printing, not for viewing in the browser) that shows the full annotation (which is what the students want to see)? I am thinking of a visualization that looks like Microsoft Word's comments (the comments on the right side, connected to the word by a line), or like Google Maps after printing a page (numbering), or the same visualization used in brat but with more space between the lines for the comments, or any other kind of visualization meant for manual inspection of the texts and annotations in printed form.

With this feature, I could give my students their corrected texts with full tags/comments, and at the same time build a small learner corpus.

Learner corpora have recently become popular, and I think a tool like that would be very useful to "reuse" the valuable time language teachers spend correcting texts.

Thank you for your consideration.

Best, M. Pilar

jnferfer commented 10 years ago

Hello M. Pilar,

Besides NLP, I think that Brat is very useful for research on linguistic issues. Unfortunately, I am afraid that Brat isn't very well known among linguists.

I can't give you any technical answer, but I suggest that you create a normalised annotation scheme for error annotation. I wouldn't make many comments on the annotations. I would rather define attributes and link them to the error types. Try to find patterns across your comments and normalise them somehow.

In any case, you can always export your annotated document and see the comments you made (the label is AnnotatorNotes). You can always relate these notes to the annotated spans, and export them somewhere (e.g. Excel) to better visualise it.
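
The export described above could be sketched roughly as follows. This assumes the standard brat standoff layout for `.ann` files: tab-separated lines, text-bound annotations whose IDs start with `T`, and notes stored on `#` lines of type `AnnotatorNotes` that point at an annotation ID. The function names, the sample data and the error labels `VerbTense` and `Article` are made up for illustration only.

```python
import csv
import io


def parse_ann(ann_text):
    """Parse brat standoff text into (spans, notes), both keyed by annotation ID."""
    spans = {}  # e.g. "T1" -> {"type": ..., "offsets": ..., "text": ...}
    notes = {}  # e.g. "T1" -> note text
    for line in ann_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        ann_id, head = parts[0], parts[1]
        body = parts[2] if len(parts) > 2 else ""
        if ann_id.startswith("T"):
            # head looks like "Type start end" (offsets kept as a string,
            # so discontinuous spans such as "0 5;16 23" survive unchanged)
            ann_type, _, offsets = head.partition(" ")
            spans[ann_id] = {"type": ann_type, "offsets": offsets, "text": body}
        elif ann_id.startswith("#") and head.startswith("AnnotatorNotes "):
            target = head.split(" ", 1)[1]
            notes[target] = body
    return spans, notes


def notes_to_csv(ann_text):
    """Return CSV text with one row per annotated span and its note (if any)."""
    spans, notes = parse_ann(ann_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "type", "offsets", "text", "note"])
    for ann_id, span in spans.items():
        writer.writerow([ann_id, span["type"], span["offsets"],
                         span["text"], notes.get(ann_id, "")])
    return out.getvalue()


# Hypothetical annotations for: "I go to school yesterday with a umbrella."
SAMPLE = (
    "T1\tVerbTense 2 4\tgo\n"
    "#1\tAnnotatorNotes T1\tshould be 'went'\n"
    "T2\tArticle 30 31\ta\n"
)

if __name__ == "__main__":
    print(notes_to_csv(SAMPLE))
```

The resulting CSV text can be saved to a file and opened in Excel; spans without a note simply get an empty last column.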

Best,

Juan Fernández

ghost commented 10 years ago

Hello Juan, thank you for your kind reply.

> I can't give you any technical answer, but I suggest that you create a normalised annotation scheme for error annotation. I wouldn't make many comments on the annotations. I would rather define attributes and link them to the error types. Try to find patterns across your comments and normalise them somehow.

Yes, you are right, this is the idea: to apply an annotation scheme so that the standardized tags can be searched and the results displayed in concordance form.

> In any case, you can always export your annotated document and see the comments you made (the label is AnnotatorNotes). You can always relate these notes to the annotated spans, and export them somewhere (e.g. Excel) to better visualise it.

Yes, you are right, but I need to see that information in a more "friendly" way...

I wondered if it would be possible to visualize "at once" all the information added to all the words in the text: currently you can see that information only word by word, when you mouse over each one. I understand that this is the most practical option for most uses, but a kind of "show all information" view would be useful for learner corpus annotation (I could give the "full version" of the text to my students instead of a Microsoft Word file with comments) and for anyone who would like to inspect part of a corpus in printed form.
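
To make the idea concrete, here is a minimal sketch of one possible "show all information" layout, the numbered variant: each annotated span gets a marker such as `[1]`, and the full comments are listed below the text. This is not an existing brat feature; the function name, the tuple layout `(start, end, label, note)` and the sample labels are assumptions made for illustration.

```python
def render_numbered(text, annotations):
    """Return text with [n] markers after each span, followed by a numbered list.

    annotations: list of (start, end, label, note) tuples with character offsets.
    """
    # Sort by end offset so markers can be inserted in one left-to-right pass.
    ordered = sorted(annotations, key=lambda a: (a[1], a[0]))
    pieces = []
    footnotes = []
    cursor = 0
    for number, (start, end, label, note) in enumerate(ordered, start=1):
        pieces.append(text[cursor:end])   # text up to the end of this span
        pieces.append(f"[{number}]")      # marker placed right after the span
        cursor = end
        footnotes.append(f'[{number}] {label}: "{text[start:end]}" - {note}')
    pieces.append(text[cursor:])          # remaining text after the last span
    return "".join(pieces) + "\n\n" + "\n".join(footnotes)


# Hypothetical learner sentence and error annotations.
TEXT = "I go to school yesterday with a umbrella."
ANNOTATIONS = [
    (30, 31, "Article", "use 'an' before a vowel sound"),
    (2, 4, "VerbTense", "past tense needed: 'went'"),
]

if __name__ == "__main__":
    print(render_numbered(TEXT, ANNOTATIONS))
```

A plain-text page like this could be printed and handed to students, while the underlying annotations stay searchable in the corpus.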

Best regards,

M. Pilar

nlplab / brat

feature request: "printing" format that shows full comments/tags #1054