lipu-linku / pali-nimi

kulupu o alasa pona e sona nimi

formatting and meta discussion (formerly the toki issue) #1

Closed lipamanka closed 11 months ago

lipamanka commented 1 year ago

so I think that we need to illustrate a few things with the toki definition:

  1. toki is a form of communication
  2. toki doesn't imply or require an external recipient of the communication

so whatever examples we give need to be

  1. examples of forms of communication, in all parts of speech
  2. must include examples that imply a recipient, and examples that do not imply a recipient

noun examples: language, announcement

verb examples: discuss, proselytize, say, talk

adjective examples: we might not need these because most of the ones I can think of in english are adjectivized nouns or verbs, like "talking" or "announcing" so it would end up being redundant

examples that imply an external recipient: tell, vent, reprimand

examples that imply no external recipient: think, ponder

Let me know if you have problems with this as a definition. I'm not incredibly happy with it but I think it's a good starting place to make a new definition.

communicate, speak, discuss, tell, inform, think; language, announcement

mazziechai commented 1 year ago

This is a good starting point, I'll give some further thoughts when I'm available (i.e. home).

RetSamys commented 1 year ago

> communicate, speak, discuss, tell, inform, think; language, announcement

Hm, "inform" and "announcement" are of course covered by toki, but they stand out to me. It's not quite the same as adding "curse" or "proselytize" or "read out loud", but they feel a bit more specific than the others?

It's not a huge difference, I think

lipamanka commented 1 year ago

> communicate, speak, discuss, tell, inform, think; language, announcement
>
> Hm, "inform" and "announcement" are of course covered by toki, but they stand out to me. It's not quite the same as adding "curse" or "proselytize" or "read out loud", but they feel a bit more specific than the others?
>
> It's not a huge difference, I think

I do think the phrasing is off. This project is about specific word choice though, so we should choose either more general ones or rethink the design. What are things that are more general than "inform" and "announcement"?

tbodt commented 1 year ago

what does the semicolon mean

lipamanka commented 1 year ago

@tbodt All of the things to the left of the semicolon are verbs in english, and all the things to the right are nouns in english. in some cases I'll bring ambiguity tests to back semicolons but I think it makes sense to show these definitions as somewhat separated so at a glance learners can easily distinguish between what toki means as a verb vs as a noun.

communicate, speak, discuss, tell, inform, think; language, announcement

ONE CHANGE we might be able to make here is to replace the verbs with gerunds (-ing words) because in english, those often function as verbs AND adjectives, which kind of lends itself well to this project imo. an example of what that might look like:

communicate, tell, discuss, think; speaking, informing; language, announcement

but at this point I fear the semicolon becomes a little bit lackluster. I can write up some guidelines for semicolon usage in this project maybe to help define that better, or we can just take it case by case

tbodt commented 1 year ago

But what are verbs and nouns. could not this have just as easily said "announce; communication, speech, discussion, information, thought". taso mi sitelen e toki ni li lukin ante e nimi language tawa pali la mi ken ala. "languaging.." ala , ni li ijo ala. LON la nimi language li ijo ante li wile lon poki ante.

nasin pu li pona pilin tawa mi. o tu lon kule nasin ala, o tu lon kule kon, anu seme!

Really a language is a nasin toki and this is an example of the little known variant on tenpo dropping called nasin dropping la pilin mi la ona li wile lon poki ante.

lipamanka commented 1 year ago

I'm gonna be honest jan Tepo I think you're reading too much into it. Our goal is to make a tool that's good for teaching and learning, and I think choosing english words that fit those functions best is the priority, not "but what IS a verb 🤔 🤔" y'know? I think nominalizing communicate to communication is a good idea though because broadly, all types of "toki" fit into that category as a noun, and toki is at its base a noun, right? at least lon derivation systems in pu? idk I left my copy of pu halfway across the country. But even having said all that I am open to getting rid of the semicolon here. I think the function of the semicolon in these definitions is sometimes weird and not well thought out, so one possible solution is to limit their usage to different meanings, not different parts of speech. because in "mi toki Inli" and "mi sona e toki Inli," even though toki means "speak" in the first one and "language" in the second or whatever, it's still the same meaning, it's just grammatically a different part of speech. So we could get rid of it! (I am tired, this one was a bit rambly and discoherent)

tbodt commented 1 year ago

Yeah that's my point; parts of speech are not well defined; thus not good criteria for semicolons

lipamanka commented 1 year ago

alright, i see what you mean and i agree. the problem is that all english words have a part of speech. so we still have to choose based on that. it's something to keep in mind, because English speakers will be looking at these from a naïve perspective. they're not used to toki pona's way of dealing with parts of speech. so perhaps: get rid of the semicolon, but still group words that in english are the same parts of speech together, and make sure the english words fill out the semantic space as roundly as possible while still showing examples using every english part of speech we can, even if it's a gerund (-ing). how does that sound?

Daenyth commented 1 year ago

I think it's best to avoid having subtle rules about which punctuation to use. nasin o pona, o kepeken nasin lili taso - jan li sona ala e nasin suli tan lukin

gregdan3 commented 1 year ago

mi pana e sona nimi la mi pana e nimi la mi pana e nimi pi toki Inli lon poki, taso lili la ni

mi pana e nimi toki la mi kepeken nimi pali lon open, mi kepeken nimi ijo lon pini

taso, ni li suli lili; ken suli la jan sin li sona ala lukin e ante nimi li toki insa ala lon ni. o pana e ante nimi, taso nasin pana li suli lili tawa mi

(poka la mi toki pona li toki Inli lon poka. ona tu li sama lon sona la sina ken lukin e ona wile. tenpo la mi sitelen e toki pona lon open, tenpo la mi ni e toki Inli. mi toki e nasin mi, e nasin lipu ala.)


When I define words for people off the cuff, I tend to sort ish the parts of speech based on how the word feels

I define toki with verbs first, and nouns second

But ultimately I agree that they don't really matter; newbies probably don't know how to look for parts of speech, and don't think about them. Provide parts of speech, but the sorting is not super important to me.

(Also, I will be speaking both toki pona and English side by side. They will have the same info, so read the one you like. Sometimes I will write the toki pona first, and sometimes the English. This is my writing process tho, not how I'm putting it on the page.)

gregdan3 commented 1 year ago

mi wile alasa ni:

nimi li suli la ona o tawa pini. nimi li lili la ona o tawa open.

ni li ken pona tan ni:

nimi pi toki Inli li kama sama nimi pi toki pona lon ni: ona li lili. ni li lon ala tenpo ale li lon tenpo mute. sina ken kama pilin sama tan ni: sina kepeken ilo ni: https://splasho.com/upgoer5/

o lukin e sona ken pi nimi toki:

tell, speak, think, inform, discuss, language, communicate, announcement

taso ni li nasa e nasin nimi ni: nimi pali en nimi ijo o lon poka ante. ni la nimi language li ken nasa lon ni...

ni li ken ala ken:

tell, speak, think, inform, discuss, communicate, language, announcement

kin la, lili la mi wile pana e nimi ni tawa sona. ona ale li nimi ijo:

story, speech, conversation

ona li pona ala pona?


I'd like to try having the small words sorted to the front and the large words sorted to the back.

This could be good, because words in English tend to be more comparable to Toki Pona words when they are short. This is not an all the time thing, but it's often true. You could get the same vibe from this: https://splasho.com/upgoer5/

Have a look at this version of toki:

tell, speak, think, inform, discuss, language, communicate, announcement

But this mixes the verbs and nouns, where before they were separate, so the word language looks a little weird now.

Maybe this?

tell, speak, think, inform, discuss, communicate, language, announcement

Also, I kinda want to add these words to the definition. They're all nouns:

story, speech, conversation

How does that feel?
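The two orderings above can be reproduced mechanically. Here is a minimal Python sketch, assuming the verb/noun split from the semicolon version of the definition earlier in this thread. The function names `by_length` and `by_group_then_length` are made up for illustration and are not part of any Linku tooling.

```python
# Hypothetical helpers for illustration; not part of Linku.

def by_length(words):
    """Shortest words first; ties keep their original order (sorted() is stable)."""
    return sorted(words, key=len)


def by_group_then_length(groups):
    """Sort each part-of-speech group by length, keeping the groups in order."""
    ordered = []
    for group in groups:
        ordered.extend(by_length(group))
    return ordered


# The verb/noun split from the semicolon version of the toki definition.
verbs = ["communicate", "speak", "discuss", "tell", "inform", "think"]
nouns = ["language", "announcement"]

# Everything mixed together, shortest first.
print(", ".join(by_length(verbs + nouns)))

# Verbs first, nouns second, each group shortest first.
print(", ".join(by_group_then_length([verbs, nouns])))
```

The first call gives the fully length-sorted list and the second keeps verbs and nouns apart, so the trade-off between the two proposals is just whether the grouping step runs before the sort.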

Daenyth commented 1 year ago

tell, speak, think, inform, discuss, language, communicate, announcement

nasin li pona mute tawa mi

lipamanka commented 1 year ago

I think that this method is interesting but i don't see the utility. we want the words next to each other to demonstrate some sort of through-line (writing about this in my len description), which i think we should prioritize over length.

gregdan3 commented 1 year ago

jan li kama sona la ona li kama lukin mute e sona lon open li lukin lili e sona lon pini. ni li nasin jan li lon tenpo ale a. ken suli la sina sona e pilin ni tan pali sina.

ni la nimi li lon nasin la nimi poka li suli ala; tenpo lukin li suli, anu seme?

ni la nimi lili open li pona

tell, speak, think, inform, discuss, language, communicate, announcement

tell, speak, story, think, inform, speech, discuss, language, communicate, announcement, conversation


When somebody is learning, they're going to see the front-loaded information much more often than the back-loaded information. This is essentially a fact of how people work, and you've probably experienced it yourself.

So, if the words are ordered a particular way, the words beside one another (the "through line") don't seem important; the time spent reading is important, yeah?

So, small words at the start help with this.

tell, speak, think, inform, discuss, language, communicate, announcement

tell, speak, story, think, inform, speech, discuss, language, communicate, announcement, conversation

gregdan3 commented 1 year ago

poka la sona nimi o suli seme, a a a


relatedly how large should definitions be, lmao

lipamanka commented 1 year ago

oh I meant the length of words, not the length of the definitions. but. hm interesting question. I think for the length our number one limitation is linku as a discord bot. otherwise I would argue the definitions should be paragraphs but that serves a different function. I think we can let them get a little long to show more examples. Like I like your second example.

But re: front-loaded information; I see what you're saying, but I don't think the length of the first few words is the important thing to be considering here. Perhaps the first word should be the english word that is the most vague.

communication, tell, speak, think, inform, speech, discuss, language, conversation

I think this one is better because it lists "communication" first, which as the front-loaded information is the most broad word I can think of for "toki," but as for the order of the following words I'm not sure.

Daenyth commented 1 year ago

I wouldn't worry too much about the order of words honestly. We can front-load the most important ones, probably, but overall it feels marginal to me as opposed to just being specific about which words we use

AcipenserSturio commented 1 year ago

Wait a second, could someone check the pu_verbatim entry for toki?

VERB to communicate, say, speak, say, talk, use language, think

There's no way it actually repeats say twice in the book?
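A repetition like this is easy to flag automatically. The following is a rough Python sketch, assuming definitions are stored as comma-separated strings; the function name `repeated_glosses` is hypothetical. A check like this only flags the repetition, so whether it is a transcription error still has to be confirmed against the book.

```python
from collections import Counter


def repeated_glosses(definition):
    """Return the glosses that occur more than once in a comma-separated definition."""
    glosses = [g.strip().lower() for g in definition.split(",") if g.strip()]
    counts = Counter(glosses)
    # Counter keeps first-seen order, so repeats come back in reading order.
    return [gloss for gloss, n in counts.items() if n > 1]


# The verb line quoted above, without the part-of-speech label.
entry = "to communicate, say, speak, say, talk, use language, think"
print(repeated_glosses(entry))
```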

gregdan3 commented 1 year ago

@AcipenserSturio [image]

Pulled from my ebook copy. I originally OCR'd it; it's correct.

lipamanka commented 1 year ago

okay re: order, I do hope I'm not dragging it into the ground but does anyone disagree with this general principle:

  1. Words that best explain the entire semantic space go first (so more general words)
  2. By the end, we list more specific examples
  3. It's a gradient in between

so the english words we use will slowly get less general and more specific.

OR we could rethink how we're doing definitions entirely, how's this:

toki: a method or instance of communication, ex: talking, speaking, thinking, language

not happy with this one at all yet but I think we can play around with phrasing these differently

AcipenserSturio commented 1 year ago

Here's a take that's fairly out of the box:

(+ e) say, (+e, lon) talk about, (+ tawa) converse with, (+ kepeken, lon, modifier) speak a language; speech, language; *a greeting*

lipamanka commented 1 year ago

OH that's an interesting idea! wiktionary does something similar. Sometimes at least. it lists examples under each of these and spreads them out. We could play around with this idea! but one problem I have is anglicization of the preps we use. Like in portuguese it's "falar com" not "falar a" for "speak to," as in like "toki lon" instead of "toki tawa." And imo "mi toki lon jan Kekan San" for "I speak with jan Kekan San" works! though a poka might make it more clear. So by not listing every single possibility we're contributing to grammaticalization.

Daenyth commented 1 year ago

I don't like all the details with + e and so on. I'd rather just have a list of words which are comparable parts of speech in english, to avoid information overload.

Or maybe we split the entries? Should we consider having two?

One shorter one that's a list of words which overlap the semantic zone and a longer one that dives into part of speech placement?

lipamanka commented 1 year ago

having multiple different types of entries seems like a lot more bang for our buck when it comes to the work we're doing here. we could have specialized options for learners looking for different things. But to answer your question @Daenyth having two definitions sounds like a SPLENDID idea

I think this thread serves ultimately to demo the structure of an entry so it's okay that we're taking our time discussing that in case anyone's like "who cares, let's just finish the definition"

gregdan3 commented 1 year ago

> Here's a take that's fairly out of the box:
>
> (+ e) say, (+e, lon) talk about, (+ tawa) converse with, (+ kepeken, lon, modifier) speak a language; speech, language; *a greeting*

ni li pana ala e sona tawa mi, ni li seme?

mi sona pona la ni li ken ike tawa mi tan ni: jan sin la nasin sin li kama tan musi nimi tan nasa nimi. musi en nasa li ni: nimi li ken lon ale. sina pana mute e sona pi ma nimi la sina ken ike; ni li ken lili e sona ken lon toki pona


I have no idea how to interpret these annotations.

If I do understand, this can be harmful imo. Newbies rely on the openness of word placement to experiment and learn. Taking this away by codifying it into the definition makes the room for possibility much smaller.

Daenyth commented 1 year ago

I think the two-entry system makes a ton of sense, in the second one we can go into more detail.

ilo Linku and lipu Linku in non-detailed mode can present only the semantic zone words, and the detailed mode or single-word-page mode in lipu Linku can have the expanded stuff with examples from all kinds of parts of speech.

And I know this might be out of scope, but do we want to consider how we might present these in the toki pona taso translation of the dictionary? I really don't like that the tp dictionary isn't complete
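One way to picture the two-entry idea is a single record with a short list and an optional detailed part. This is only a sketch under assumptions: the `Entry` class, its field names, and the `render` method are invented here, and the example words are taken from the lists proposed at the top of this thread. It does not describe how ilo Linku or lipu Linku store data.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical two-part dictionary entry; not Linku's real schema."""
    word: str
    short: list[str]  # semantic-zone words shown in non-detailed mode
    detailed: dict[str, list[str]] = field(default_factory=dict)  # examples keyed by English part of speech

    def render(self, detailed=False):
        """Return the short line, plus one indented line per part of speech if asked."""
        line = f"{self.word}: {', '.join(self.short)}"
        if not detailed or not self.detailed:
            return line
        extra = [f"  {pos}: {', '.join(examples)}" for pos, examples in self.detailed.items()]
        return "\n".join([line, *extra])


toki = Entry(
    word="toki",
    short=["communicate", "speak", "discuss", "tell", "inform", "think", "language", "announcement"],
    detailed={
        "verb": ["discuss", "proselytize", "say", "talk"],
        "noun": ["language", "announcement"],
    },
)

print(toki.render())               # brief mode
print(toki.render(detailed=True))  # detailed or single-word-page mode
```

Keeping both parts in one record means the brief view never has to show part-of-speech information, while the detailed view can.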

gregdan3 commented 1 year ago

musi la mi pana e sona pi toki pona lon lipu nimi ale tan ni: mi wile e ni: kulupu ni li pona e sona nimi lon toki pona.


I actually added the tpt definition to my prepared word files because I was hoping this project would expand to include those definitions, aha.

AcipenserSturio commented 1 year ago

@Daenyth We already kind of have a basic / detailed thing, it's called definitions vs commentary

This would sort of be a three part thing then, i suppose?

Daenyth commented 1 year ago

> We already kind of have a basic / detailed thing, it's called definitions vs commentary

I don't agree with that. For me the commentary section makes sense to talk about pragmatics or nuanced information like with pali or olin.

I guess you could call it a three part thing, perhaps. I'm not sure how useful that is - I'd also be happy enough to stick with the current "list of words"+commentary two-part system.

Anyway no matter what I do think that we should not heavily present part of speech stuff in the front

gregdan3 commented 1 year ago

if we genuinely do want to split the definitions into "basic" and "advanced" essentially, we either:

and both vastly increase the investment a translator must put forth.

That's not necessarily a bad thing either way, but we do have to decide where to draw the line.

lipamanka commented 1 year ago

wait so how does this idea sound: one definition lists words in the same way they're listed now, just with redesigned standardized policy about how we do that. The other definition serves as a translation of the toki pona definition, explaining the word in english with 1-3 sentences.

does this sound manageable? does this sound utilizable?

AcipenserSturio commented 1 year ago

@lipamanka how would the latter definition compare to your essays on https://lipamanka.gay/essays/dictionary ? Is it the same thing? Smaller? Different?

This is important context, considering you never got through all pu words

Daenyth commented 1 year ago

I like the sound of it but I want to see an example of what that looks like

lipamanka commented 1 year ago

@AcipenserSturio yeah why not! probably smaller, like ideally it'd be able to fit in linku. not sure how exactly, like we could have a new command like /nimi+ or something idk or /sonanimi idk that only displays that information. This would be AWESOME because

  1. linku becomes way more useful for teachers in teaching contexts
  2. these semantic spaces will be designed by a committee of ten people and not by lipamanka in their personal website
  3. sounds like more fun to me!

lipamanka commented 1 year ago

@Daenyth example:

ilo

tool, implement, machine, device

The semantic space of ilo contains things that are used towards a goal. It's easy to say that everything can be used. Likewise if something is being used or can commonly be used, it is easy to call it an ilo. If I am using a hammer to hammer a nail into the wall, that is an ilo. If i am using a psychological method to calm myself down when i'm stressed, that can be an ilo as well. Without much context, ilo can refer to things that are commonly used as tools. With the context of it being used for something, though, anything can be an ilo.

this is just copied and pasted from my semantic spaces. ideally ours would be shorter, wouldn't start with "the semantic space of ilo contains," and I'd like more detail in ilo's shorter description (not too much more though)

tbodt commented 1 year ago

mi kama pilin e ijo. sona nimi lili la, We should start with the pu definitions and change them to solve problems instead of synthesizing something new. tan ni: like it or not, these are the reference / anchor point for community usage and other dictionaries. nasin nimi kulupu en lipu ante pi sona nimi li kama pu ala, taso ona li tan lipu pu. So starting there gets us immediately closer to the goal with less debate.

lipu suli poka la - mi kulupu o pali e ni lon tenpo sama ala. After we have definitions we can write exegeses.

AcipenserSturio commented 1 year ago

I agree. While I did bring up an alternate take for discussion (and I appreciated the thoughts that came out of it), most of the ideas given here seem worse (/opinion) than just taking pu and iterating on it

lipamanka commented 1 year ago

I get what you're saying but that would be copying another dictionary instead of doing the descriptive efforts ourselves. plus then we run into the problem of some of our definitions being just pu and others not being that. which is one of the things I think we're trying to fix. I think there should be a command available in linku that gives the pu definition verbatim but the linku definitions should be our descriptive work.

tbodt commented 1 year ago

Hm ok i think i'm starting to get why this project is feeling weird to me, because there are too many choices for what words and what order and i have no idea how to make them. Like there are like a thousand words in context that mean toki in context. Which ones are important and why? What do we actually want. we should write that down... brief and accurate ?

AcipenserSturio commented 1 year ago

@tbodt ku data (which we have not put in .md files!) might offer some insights into which words are the most relevant:

If you just look at the top definitions, it offers words that (I think) fit better than pu definitions:

pu

ADJECTIVE fire; cooking element, chemical reaction, heat source

ku data, top responses:

fire⁵, heat⁵, hot⁵, burning⁵, flame⁵
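To compare ku responses across words, a line in this format could be turned into (word, score) pairs. This is a small Python sketch, assuming the trailing superscript digit is a per-word score from 1 to 5. That reading of the notation, and the name `parse_ku`, are assumptions made for this example.

```python
# Assumed mapping from superscript digits to integer scores.
SUPERSCRIPTS = {"¹": 1, "²": 2, "³": 3, "⁴": 4, "⁵": 5}


def parse_ku(line):
    """Split a comma-separated ku line into (word, score) pairs.

    Words without a trailing superscript get a score of None.
    """
    pairs = []
    for item in line.split(","):
        item = item.strip()
        if not item:
            continue
        if item[-1] in SUPERSCRIPTS:
            pairs.append((item[:-1], SUPERSCRIPTS[item[-1]]))
        else:
            pairs.append((item, None))
    return pairs


print(parse_ku("fire⁵, heat⁵, hot⁵, burning⁵, flame⁵"))
```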

lipamanka commented 1 year ago

brief and accurate is a good design goal, but something I would like to see here is descriptive and current. because this is a lexicography project, we need to make new definitions based off of the current common use as best we can. the pu definitions are useful and relevant but largely not descriptive of how toki pona is used in 2023 because it was published nine years ago.

We can use pu as a base if that would make the process easier but I'd like to aim for every definition to be something new. No copying from any other source.

mazziechai commented 1 year ago

> ku data (which we have not put in .md files!) might offer some insights into which words are the most relevant:
>
> If you just look at the top definitions, it offers words that (I think) fit better than pu definitions:
>
> pu
>
> ADJECTIVE fire; cooking element, chemical reaction, heat source
>
> ku data, top responses:
>
> fire⁵, heat⁵, hot⁵, burning⁵, flame⁵

Can someone get around to adding this to the .md files? I'd like to but I'm basically on vacation and it's hard to get to use my PC

lipamanka commented 1 year ago

hm. I think using ku data to inform this project is flawed because it relies so heavily on ma pona usage. (this isn't a "no! we must not use it!" but just a thing to keep in mind. as lexicographers it's our responsibility to take into account which biases we're subject to)

AcipenserSturio commented 1 year ago

@lipamanka we also, iirc, don't have anyone on the committee whose tp experience doesn't mostly come from spending time on ma pona?

KelseyHigham commented 1 year ago

i don't really see the value in making definitions completely new

since other sources like nimi.li pull from Linku definitions, and all users are used to Linku being mostly pu, imo it's good for the definitions to be mostly old

i think a from-scratch descriptive dictionary would be cool, and i'd be happy to contribute to that. but i think it's easier for that kind of project to be the work of one person with peer review

KelseyHigham commented 1 year ago

my tp experience is mostly toki uta, lon sijelo and in VR

gregdan3 commented 1 year ago

> Can someone get around to adding this to the .md files? I'd like to but I'm basically on vacation and it's hard to get to use my PC

mi ni

> my tp experience is mostly toki uta, lon sijelo and in VR

kin! i have old tp experience from ma pona but most everything in the past ~8mo is from not ma pona

gregdan3 commented 1 year ago

> but i think it's easier for that kind of project to be the work of one person with peer review

this statement is interesting

it makes me wonder if this project is equally open to such a process

if the goal is "slightly update all of the nimi pu definitions so that they better reflect usage but are still familiar", why does that require a large, many-person process as opposed to an individual with peer review

lipamanka commented 1 year ago

I spend a lot of time with IRL toki pona which is pretty cool imo, and I'm glad people in this project are some of the people who tp IRL the most. if we want to just tweak pu that's cool but out of principle I'd like to change at least one thing about each one.