CvGC / dict

GENERAL: tags and gismu families #10

Open uakci opened 7 years ago

uakci commented 7 years ago

We need to settle on a good set of semantic tags.

mezohe commented 7 years ago

.e'u tcitynde fi lo ka ma kau freime fa'o https://docs.google.com/spreadsheets/d/1yz1kMaHlC5DXH_qzQi0V1Osiv3JApzZHrGao850Rsf0/htmlview?pli=1#gid=0

solpahi commented 7 years ago

The reason, or one of the reasons, I wanted tags is that they offer ways for readers/users to navigate through semantic fields (https://en.wikipedia.org/wiki/Semantic_field). That way you can find every related and similar concept without us needing to include a gazillion words in the See Also.

"badri" is an EMOTION (among other things)

Another grouping system is by place structure patterns:

"badri" has the pattern "EXPERIENCER experiences PROPERTY" or something like that, into which a lot of gismu fall.

Shall we keep these types of tags in one file or split them?

One of my original ideas was to identify all the "frames" in the gimste, then name them after one of their members like selma'o, and then use that as a tag, like a family name. For example, RINKA might be a family containing {rinka}, {krinu}, {mukti}, ..., and then each of those words would have a RINKA tag, and if you click RINKA (in a hypothetical online dictionary), you'd get all the words in that family displayed.
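The "click RINKA, get the whole family" lookup could be sketched roughly as below. This is only an illustration, not anything the project has settled on: the `WORD_TAGS` layout and the `build_tag_index` helper are hypothetical names, and the only data taken from this thread are the three RINKA members and the EMOTION tag on "badri".

```python
from collections import defaultdict

# Hypothetical word -> tags data. Only the RINKA members and badri's
# EMOTION tag come from the discussion; the layout is an assumption.
WORD_TAGS = {
    "rinka": ["RINKA"],
    "krinu": ["RINKA"],
    "mukti": ["RINKA"],
    "badri": ["EMOTION"],
}


def build_tag_index(word_tags):
    """Invert a word -> tags mapping into tag -> sorted list of words."""
    index = defaultdict(list)
    for word, tags in word_tags.items():
        for tag in tags:
            index[tag].append(word)
    return {tag: sorted(words) for tag, words in index.items()}


TAG_INDEX = build_tag_index(WORD_TAGS)
print(TAG_INDEX["RINKA"])  # ['krinu', 'mukti', 'rinka']
```

Because a word carries a list of tags, the same inversion works whether family tags and semantic tags end up in one file or in two.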

uakci commented 7 years ago

We should have 'general' frames (such as those from the file) and split them into finer groups that reflect the place structures.

solpahi commented 7 years ago

I've started work on another categorization. This one is more fine-grained than the one by xorxes. I will need help to finish it. There are still around 500 gismu left unsorted.

https://docs.google.com/spreadsheets/d/1jWi9c5-nHnas6i4vq-aIlVB276UCx0FI7GR364cL9Ts/edit?usp=sharing

There are a lot of families, way too many. It's possible to merge some of them, but then uniformity is lost within the families and too many members will differ from the pattern of their family. (Adjusting the place structures to make a merge possible is another matter!)

I worked on this for the past three days and I'm rather tired of it. It is, however, an important step, because it will make the CvGC's job a heck of a lot easier.

uakci commented 7 years ago

This is a Very Good Idea(tm).

I propose that the tags file be a simple map, so that 'frame' could be one key, and, for example, places, to which semantic tags are added, could be other keys. As an example, taking «lebna»:

frame  cpacu
x1     agent
x2     patient
x3     source

Some other tags could be added, but for now, we need to settle on the tags mentioned here ('frame' requiring pamcolsolri's spreadsheet data, 'xN' requiring some sort of typing).
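As a rough sketch of the "simple map" idea, the «lebna» entry could be held as a plain dictionary with one 'frame' key and one key per place. The dictionary layout, the `PLACE_KEY` pattern, and the `place_tags` helper are assumptions made for illustration; the thread does not fix a file format or a set of place types.

```python
import re

# Hypothetical in-memory form of one tags-file entry, using the values
# given for «lebna» in this thread.
LEBNA = {"frame": "cpacu", "x1": "agent", "x2": "patient", "x3": "source"}

# Keys that name a place: x1 .. x5 (gismu have at most five places).
PLACE_KEY = re.compile(r"x[1-5]")


def place_tags(entry):
    """Return the semantic tags of the places, ordered x1, x2, ..."""
    places = {key: tag for key, tag in entry.items() if PLACE_KEY.fullmatch(key)}
    return [places[key] for key in sorted(places)]


print(LEBNA["frame"])     # cpacu
print(place_tags(LEBNA))  # ['agent', 'patient', 'source']
```

Keeping 'frame' and the 'xN' keys in the same map means further tag types can be added later as new keys without changing how existing entries are read.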

mklcp commented 7 years ago

The following may be useful too.

Getting rid of 'See also' is an extremely useful idea for representing Lojban's semantic relationships more easily.
While a hierarchy may be sufficient for talking about place structures, not all relations are hierarchical, since some words can be related to several categories (as when one wants to have the same file in two or more directories on a computer).
The problem with this is that, if the relation isn't specified, then anything can be linked to anything.