iljackb / Mixtepec_Mixtec

Mostly XML (TEI) markup of Mixtepec-Mixtec Language resources

Tag morphological units/inflections with <m>? #44

Open iljackb opened 5 years ago

iljackb commented 5 years ago

A systematic use of <m> (with @xml:id) in addition to <w> in the encoding schema would enable a more specific and detailed representation of the content.

e.g. nuu <w>nuu</w> is inflected as: nui <w>nu<m>i</m></w>

However, implementing this would create a significant problem in searching for exact strings as they would be split up between two (or potentially more) element segments.

My priority (early on) is to build a corpus from which I can easily retrieve all instances of a given piece of data.

So is it possible to search for these strings without significant modification/burden?

What is the best approach?

laurentromary commented 5 years ago

The trick is to systematically introduce a flattening of <w> as a string before checking the content. Would you have a typical search expression so that we could see this concretely?
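To make the flattening idea concrete, here is a minimal sketch in Python rather than the XQuery used in the project. It assumes a tiny hypothetical sample with no namespace; only the <w>/<m> markup for nuu and nui comes from this thread, and the function names `flatten` and `find_words` are made up for illustration. Real TEI files declare the TEI namespace, so the tag name would need to be qualified accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <s> wrapper is an assumption, the two <w> forms
# are the ones quoted in this issue. No TEI namespace is declared here.
SAMPLE = """<s>
  <w>nuu</w>
  <w>nu<m>i</m></w>
</s>"""


def flatten(w):
    """Concatenate all text inside a <w>, ignoring nested tags such as <m>."""
    return "".join(w.itertext())


def find_words(root, target):
    """Return every <w> element whose flattened text equals target exactly."""
    return [w for w in root.iter("w") if flatten(w) == target]


root = ET.fromstring(SAMPLE)

# The inflected form is found even though its text is split by <m>.
hits = find_words(root, "nui")
print([ET.tostring(w, encoding="unicode").strip() for w in hits])
```

The comparison is done on the flattened string, so the <m> markup stays in the file and is still available on each hit. In XQuery the same effect should be obtainable by comparing the string value of the element (for example `string($w)`) instead of its direct text nodes; whether the linked script already does this has not been checked here.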

iljackb commented 5 years ago

The main querying script is the following XQuery, which I use in BaseX (and which I didn't write).

[https://github.com/iljackb/Mixtepec_Mixtec/blob/master/stylesheets-scripts/xquery/query-mix-translations.xq]

laurentromary commented 5 years ago

Gasp… I need to read the code to see what’s in there…

On 7 Nov 2018, at 15:54, Jack Bowers notifications@github.com wrote:

the main querying script is the following XQuery which I use in BaseX (and which I didn't write).

[https://github.com/iljackb/Mixtepec_Mixtec/blob/master/stylesheets-scripts/xquery/query-mix-translations.xq]


Laurent Romary Inria, team ALMAnaCH laurent.romary@inria.fr