xigt / igt-edit

An online editor for IGT based on Xigt and INTENT
http://freki-edit.xigt.org/
MIT License

Project aims #5

Closed ar-jan closed 7 years ago

ar-jan commented 7 years ago

The Xigt projects look very interesting. I have a few questions:

  1. Is Yggdrasil suitable to create a Xigt-encoded corpus from scratch? Or are there other tools/formats you would recommend to start with instead, and import later if needed?
  2. Is there a demo, or perhaps screenshots, to get a better idea of what Yggdrasil is and does? Or is it easy enough to set it up by yourself?

Thanks!

goodmami commented 7 years ago

Hi @ar-jan, thanks for your interest! Yggdrasil is the project name, although we often use the common name "Xigt Editor". Regarding your questions:

  1. Yggdrasil was created to fix up ODIN IGTs extracted from PDFs and to be part of that workflow (including enrichment via INTENT). If you don't mind relying on INTENT to infer structural relations and to add further data tiers, you could probably use it as it is to create a new corpus, but the experience wouldn't be ideal. We are currently discussing how we can import from Toolbox or other formats, as well as edit structural fields of an IGT. We have limited hours to work on these projects, but we're looking into how we can get others involved in development (if you're interested, let us know).

  2. This publication has some description and screenshots: https://www.aclweb.org/anthology/P/P16/P16-4006.pdf . We have a demo here: http://editor.xigt.org/user/demo. It should be possible to set it up on another server, but the INTENT component probably has the most dependencies. Ask again for more details if you want to set it up on your own server.

Does this help?

ar-jan commented 7 years ago

Yes, thanks for the quick reply! The demo is very useful to see how the editor works. Is this just the editor, or is it coupled to INTENT? And does INTENT have any UI component, or is it command line / backend tooling only?

What I'm looking for is probably closer to a program like Fieldworks, except that it's not quite flexible enough for working with texts with very high orthographic variation (and ideally I want to work with parallel texts in the same interface as well). There are various XML encoding schemas that are expressive enough, but they don't have user-friendly tooling around them. I often see OxygenXML recommended, but encoding the corpus would be extremely time-consuming without the visual aid of an IGT-like representation (and I'd still have to find other solutions for things like building/visualizing a lexicon and concordance searches).

I have a few more questions:

  1. I found a reference to http://depts.washington.edu/uwcl/xigt-edit/, is that still relevant?
  2. In the demo I see that alignment between tiers depends on separation by spaces. Does (or could) a Xigt editor allow explicitly defining alignable segments that include spaces and even punctuation? (This is something Fieldworks doesn't handle.)

I need to do some more reading of the Xigt-related papers and find out if these tools could help for my use-case. In any case, these look like very useful programs!

goodmami commented 7 years ago

INTENT is a command line / backend program and doesn't include a GUI. I think the Xigt Editor (Yggdrasil) is the only mouse-driven interface that calls INTENT currently.

I haven't used Fieldworks, but I've heard that it does some things nicely but doesn't entirely capture all the features of SIL Toolbox, like perhaps morphological parsing. If you're working on IGT data, a general purpose XML editor (like OxygenXML is, I think) won't be very convenient. Here are some packages capable of annotating IGTs: TypeCraft https://typecraft.org/ (web-based), ELAN https://tla.mpi.nl/tools/tla-tools/elan/, Toolbox https://www.sil.org/resources/software_fonts/toolbox, FLEx http://fieldworks.sil.org/flex/. This document is informative: http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/interoperability-of-language-documentation-tools-and-materials-for-local-communities.1.html

Regarding your other questions:

  1. Xigt-Edit http://depts.washington.edu/uwcl/xigt-edit/ was developed as a general purpose Xigt GUI editor. It had some nice features, including visualizations of syntax trees, but development has stalled. It's a Windows-only program, and most developers of Xigt run Linux or Mac, so it was hard to maintain after the original developer moved on. If you use Windows, you can try installing it, but we cannot provide support.
  2. The demo http://editor.xigt.org/user/demo instantiates Xigt IGTs from visually-aligned (space-separated) examples. Spaces themselves don't hold special meaning in the Xigt-formatted IGT, and they can be used in contentful items. E.g., we give an example in a paper on Xigt http://dx.doi.org/10.1007/s10579-014-9276-1 where the text is "pick the book up", and the annotator wants to select "pick up" for a word-sense annotation (e.g. as opposed to "pick", "pick out", etc.). The annotation aligns to two non-contiguous tokens joined with a space. Xigt is very flexible in how it selects data for annotation, and this flexibility makes it very powerful, but also at times inefficient (e.g., it requires more processing to visually display Xigt's graph-shaped IGT than with strictly tree-shaped annotations).
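
To make the "pick up" case a bit more concrete, here is a rough Python sketch of the idea. It is a toy model, not the Xigt library's API: the names `Item`, `tokenize`, and `resolve` are made up for illustration, and the expression syntax is a simplified assumption in which `,` joins the referenced items with a space and `+` joins them with nothing in between.

```python
# Illustrative only: a toy model of the "pick the book up" example.
# Item, tokenize, and resolve are hypothetical names, not Xigt's API.
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: str
    text: str


def tokenize(line: str, prefix: str = "w") -> list[Item]:
    """Split a line on whitespace and give each token an id (w1, w2, ...)."""
    return [Item(f"{prefix}{i}", tok) for i, tok in enumerate(line.split(), start=1)]


def resolve(expression: str, items: list[Item]) -> str:
    """Resolve a simplified alignment expression such as "w1,w4".

    Assumption: "," joins the referenced items with a space and "+"
    joins them directly. Character spans inside an item are not handled.
    """
    by_id = {item.id: item.text for item in items}
    groups = []
    for group in expression.split(","):
        groups.append("".join(by_id[ref] for ref in group.split("+")))
    return " ".join(groups)


words = tokenize("pick the book up")
print(resolve("w1,w4", words))  # pick up
```

The point is only that the annotation refers to items by id, so the selected tokens need not be adjacent, and the space in "pick up" comes from the join rather than from the original line.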

For more information, the ODIN website http://depts.washington.edu/uwcl/odin/ has a list of relevant publications with links to PDFs.

Also, I'm interested in seeing how Xigt could serve your use-case, so do let me know if you try it out.

rgeorgi commented 7 years ago

For what it's worth, INTENT does have a minimal Web GUI at: http://intent.xigt.org/

It supports converting plain-text IGT instances into Xigt, and running the automatic enrichment tools on the generated Xigt files. It's not much, but it may help with generating Xigt documents for testing.

-- Ryan

goodmami commented 7 years ago

Thanks, Ryan. I forgot about that one.

I also forgot to mention that Xigt-Edit may have been able to use the INTENT server, too, although I haven't tested whether it works with the latest iteration of INTENT.

ar-jan commented 7 years ago

Thanks @goodmami and @rgeorgi, this is very helpful. I'll let you know if I use Xigt for my use case.