lipamanka closed 11 months ago
This is a good starting point, I'll give some further thoughts when I'm available (i.e. home).
communicate, speak, discuss, tell, inform, think; language, announcement
Hm, "inform" and "announcement" are of course covered by toki, but they stand out to me. It's not quite the same as adding "curse" or "proselytize" or "read out loud", but they feel a bit more specific than the others?
It's not a huge difference, I think
I do think the phrasing is off. This project is about specific word choice though, so we should either choose more general words or rethink the design. What are things that are more general than "inform" and "announcement"?
what does the semicolon mean
@tbodt All of the things to the left of the semicolon are verbs in english, and all the things to the right are nouns in english. in some cases I'll bring ambiguity tests to back semicolons but I think it makes sense to show these definitions as somewhat separated so at a glance learners can easily distinguish between what toki means as a verb vs as a noun.
communicate, speak, discuss, tell, inform, think; language, announcement
ONE CHANGE we might be able to make here is to replace the verbs with gerunds (-ing words) because in english, those often function as verbs AND adjectives, which kind of lends itself well to this project imo. an example of what that might look like:
communicate, tell, discuss, think; speaking, informing; language, announcement
but at this point I fear the semicolon becomes a little bit lackluster. I can write up some guidelines for semicolon usage in this project maybe to help define that better, or we can just take it case by case
But what are verbs and nouns. could not this have just as easily said "announce; communication, speech, discussion, information, thought". taso mi sitelen e toki ni li lukin ante e nimi language tawa pali la mi ken ala. "languaging.." ala , ni li ijo ala. LON la nimi language li ijo ante li wile lon poki ante.
nasin pu li pona pilin tawa mi. o tu lon kule nasin ala, o tu lon kule kon, anu seme!
Really a language is a nasin toki and this is an example of the little known variant on tenpo dropping called nasin dropping la pilin mi la ona li wile lon poki ante.
I'm gonna be honest jan Tepo I think you're reading too much into it. Our goal is to make a tool that's good for teaching and learning, and I think choosing english words that fit those functions best is the priority, not "but what IS a verb 🤔 🤔" y'know? I think nominalizing communicate to communication is a good idea though because broadly, all types of "toki" fit into that category as a noun, and toki is at its base a noun, right? at least lon derivation systems in pu? idk I left my copy of pu halfway across the country. But even having said all that I am open to getting rid of the semicolon here. I think the function of the semicolon in these definitions is sometimes weird and not well thought out, so one possible solution is to limit their usage to different meanings, not different parts of speech. because in "mi toki Inli" and "mi sona e toki Inli," even though toki means "speak" in the first one and "language" in the second or whatever, it's still the same meaning, it's just grammatically a different part of speech. So we could get rid of it! (I am tired, this one was a bit rambly and discoherent)
Yeah that's my point; parts of speech are not well defined; thus not good criteria for semicolons
alright, i see what you mean and i agree. the problem is that all english words have a part of speech. so we still have to choose based on that. it's something to keep in mind, because English speakers will be looking at these from a naïve perspective. they're not used to toki pona's way of dealing with parts of speech. so perhaps: get rid of the semicolon, but still group words that in english are the same parts of speech together, and make sure the english words fill out the semantic space as roundly as possible while still showing examples using every english part of speech we can, even if it's a gerund (-ing). how does that sound?
I think it's best to avoid having subtle rules about which punctuation to use. nasin o pona, o kepeken nasin lili taso - jan li sona ala e nasin suli tan lukin
mi pana e sona nimi la mi pana e nimi la mi pana e nimi pi toki Inli lon poki, taso lili la ni
mi pana e nimi toki la mi kepeken nimi pali lon open, mi kepeken nimi ijo lon pini
taso, ni li suli lili; ken suli la jan sin li sona ala lukin e ante nimi li toki insa ala lon ni. o pana e ante nimi, taso nasin pana li suli lili tawa mi
(poka la mi toki pona li toki Inli lon poka. ona tu li sama lon sona la sina ken lukin e ona wile. tenpo la mi sitelen e toki pona lon open, tenpo la mi ni e toki Inli. mi toki e nasin mi, e nasin lipu ala.)
When I define words for people off the cuff, I tend to sort ish the parts of speech based on how the word feels
I define toki with verbs first, and nouns second
But ultimately I agree that they don't really matter; newbies probably don't know how to look for parts of speech, and don't think about them. Provide parts of speech, but the sorting is not super important to me.
(Also, I will be speaking both toki pona and English side by side. They will have the same info, so read the one you like. Sometimes I will write the toki pona first, and sometimes the English. This is my writing process tho, not how I'm putting it on the page.)
mi wile alasa ni:
nimi li suli la ona o tawa pini. nimi li lili la ona o tawa open.
ni li ken pona tan ni:
nimi pi toki Inli li kama sama nimi pi toki pona lon ni: ona li lili. ni li lon ala tenpo ale li lon tenpo mute. sina ken kama pilin sama tan ni: sina kepeken ilo ni: https://splasho.com/upgoer5/
o lukin e sona ken pi nimi toki:
tell, speak, think, inform, discuss, language, communicate, announcement
taso ni li nasa e nasin nimi ni: nimi pali en nimi ijo o lon poka ante. ni la nimi language li ken nasa lon ni...
ni li ken ala ken:
tell, speak, think, inform, discuss, communicate, language, announcement
kin la, lili la mi wile pana e nimi ni tawa sona. ona ale li nimi ijo:
story, speech, conversation
ona li pona ala pona?
I'd like to try having the small words sorted to the front and the large words sorted to the back.
This could be good, because words in English tend to be more comparable to Toki Pona words when they are short. This is not an all the time thing, but it's often true. You could get the same vibe from this: https://splasho.com/upgoer5/
Have a look at this version of toki:
tell, speak, think, inform, discuss, language, communicate, announcement
But this mixes the verbs and nouns, where before they were separate, so the word language looks a little weird now.
Maybe this?
tell, speak, think, inform, discuss, communicate, language, announcement
Also, I kinda want to add these words to the definition. They're all nouns:
story, speech, conversation
How does that feel?
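The two orderings proposed above can be reproduced mechanically. This is a minimal sketch, not project tooling: the function names are made up for illustration, and breaking ties alphabetically is an assumption (the thread does not say how same-length words should be ordered, though it happens to match the lists given here).

```python
def order_by_length(words):
    """Sort shortest words first; ties are broken alphabetically (an assumption)."""
    return sorted(words, key=lambda w: (len(w), w))


def order_grouped(groups):
    """Sort each part-of-speech group by length, keeping the groups in the given order."""
    ordered = []
    for group in groups:
        ordered.extend(order_by_length(group))
    return ordered


# Words from the original proposal, split the way the semicolon split them.
verbs = ["communicate", "speak", "discuss", "tell", "inform", "think"]
nouns = ["language", "announcement"]

# Pure length ordering: verbs and nouns end up mixed.
mixed = order_by_length(verbs + nouns)
# Verbs first, then nouns, each group sorted by length.
grouped = order_grouped([verbs, nouns])

print(", ".join(mixed))
print(", ".join(grouped))
```

The first line printed is the mixed list and the second is the verbs-then-nouns list. Passing the three extra nouns to `order_by_length` as well slots them in by length among the existing words.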
tell, speak, think, inform, discuss, language, communicate, announcement
nasin li pona mute tawa mi
I think that this method is interesting but i don't see the utility. we want the words next to each other to demonstrate some sort of through-line (writing about this in my len description), which i think we should prioritize over length.
jan li kama sona la ona li kama lukin mute e sona lon open li lukin lili e sona lon pini. ni li nasin jan li lon tenpo ale a. ken suli la sina sona e pilin ni tan pali sina.
ni la nimi li lon nasin la nimi poka li suli ala; tenpo lukin li suli, anu seme?
ni la nimi lili open li pona
tell, speak, think, inform, discuss, language, communicate, announcement
tell, speak, story, think, inform, speech, discuss, language, communicate, announcement, conversation
When somebody is learning, they're going to see the front-loaded information much more often than the back-loaded information. This is essentially a fact of how people work, and you've probably experienced it yourself.
So, if the words are ordered a particular way, which words sit beside one another (the "through line") doesn't seem important; the time spent reading each one is important, yeah?
So, putting small words at the start helps with this.
tell, speak, think, inform, discuss, language, communicate, announcement
tell, speak, story, think, inform, speech, discuss, language, communicate, announcement, conversation
poka la sona nimi o suli seme, a a a
relatedly how large should definitions be, lmao
oh I meant the length of words, not the length of the definitions. but. hm interesting question. I think for the length our number one limitation is linku as a discord bot. otherwise I would argue the definitions should be paragraphs but that serves a different function. I think we can let them get a little long to show more examples. Like I like your second example.
But re: front-loaded information; I see what you're saying, but I don't think the length of the first few words is the important thing to be considering here. Perhaps the first word should be the english word that is the most vague.
communication, tell, speak, think, inform, speech, discuss, language, conversation
I think this one is better because it lists "communication" first, which as the front-loaded information is the most broad word I can think of for "toki," but as for the order of the following words I'm not sure.
I wouldn't worry too much about the order of words honestly. We can front-load the most important ones, probably, but overall it feels marginal to me as opposed to just being specific about which words we use
Wait a second, could someone check the pu_verbatim entry for toki?
VERB to communicate, say, speak, say, talk, use language, think
There's no way it actually repeats say twice in the book?
@AcipenserSturio
Pulled from my ebook copy. I originally OCR'd it; it's correct.
okay re: order, I do hope I'm not dragging it into the ground but does anyone disagree with this general principle:
so the english words we use will slowly get less general and more specific.
OR we could rethink how we're doing definitions entirely, how's this:
toki: a method or instance of communication, ex: talking, speaking, thinking, language
not happy with this one at all yet but I think we can play around with phrasing these differently
Here's a take that's fairly out of the box:
(+ e) say, (+e, lon) talk about, (+ tawa) converse with, (+ kepeken, lon, modifier) speak a language; speech, language; *a greeting*
OH that's an interesting idea! wiktionary does something similar. Sometimes at least. it lists examples under each of these and spreads them out. We could play around with this idea! but one problem I have is anglicization of the preps we use. Like in portuguese it's "falar com" not "falar a" for "speak to," as in like "toki lon" instead of "toki tawa." And imo "mi toki lon jan Kekan San" for "I speak with jan Kekan San" works! though a poka might make it more clear. So by not listing every single possibility we're contributing to grammaticalization.
I don't like all the details with + e
and so on. I'd rather just have a list of words which are comparable parts of speech in english, to avoid information overload.
Or maybe we split the entries? Should we consider having two?
One shorter one that's a list of words which overlap the semantic zone and a longer one that dives into part of speech placement?
having multiple different types of entries seems like a lot more bang for our buck when it comes to the work we're doing here. we could have specialized options for learners looking for different things. But to answer your question @Daenyth having two definitions sounds like a SPLENDID idea
I think this thread serves ultimately to demo the structure of an entry so it's okay that we're taking our time discussing that in case anyone's like "who cares, let's just finish the definition"
Here's a take that's fairly out of the box:
(+ e) say, (+e, lon) talk about, (+ tawa) converse with, (+ kepeken, lon, modifier) speak a language; speech, language; *a greeting*
ni li pana ala e sona tawa mi, ni li seme?
mi sona pona la ni li ken ike tawa mi tan ni: jan sin la nasin sin li kama tan musi nimi tan nasa nimi. musi en nasa li ni: nimi li ken lon ale. sina pana mute e sona pi ma nimi la sina ken ike; ni li ken lili e sona ken lon toki pona
I have no idea how to interpret these annotations.
If I do understand, this can be harmful imo. Newbies rely on the openness of word placement to experiment and learn. Taking this away by codifying it into the definition makes the room for possibility much smaller.
I think the two-entry system makes a ton of sense, in the second one we can go into more detail.
ilo Linku and lipu Linku in non-detailed mode can present only the semantic zone words, and the detailed mode or single-word-page mode in lipu Linku can have the expanded stuff with examples from all kinds of parts of speech.
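One way to picture the two-entry idea is a single record with a short word list and an optional detailed part, plus a display mode that decides how much to show. This is only a sketch: the `Entry` class, its field names, and `render` are hypothetical and do not describe how Linku actually stores or displays data.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical two-part dictionary entry (not Linku's real schema)."""
    word: str
    short: list[str]                                  # semantic-zone words only
    detailed: dict[str, list[str]] = field(default_factory=dict)  # examples by part of speech


def render(entry, detailed=False):
    """Return the short line, plus per-part-of-speech lines in detailed mode."""
    line = f"{entry.word}: {', '.join(entry.short)}"
    if not detailed or not entry.detailed:
        return line
    extra = [f"  {pos}: {', '.join(words)}" for pos, words in entry.detailed.items()]
    return "\n".join([line, *extra])


# Word lists taken from proposals in this thread.
toki = Entry(
    word="toki",
    short=["communication", "tell", "speak", "think", "inform",
           "speech", "discuss", "language", "conversation"],
    detailed={
        "verb": ["communicate", "speak", "discuss", "tell", "inform", "think"],
        "noun": ["language", "announcement"],
    },
)

print(render(toki))                 # non-detailed mode, e.g. a bot reply
print(render(toki, detailed=True))  # detailed mode, e.g. a single-word page
```

Keeping both parts on one record means the short list never has to carry part-of-speech information, which is the concern raised above about information overload.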
And I know this might be out of scope, but do we want to consider how we might present these in the toki pona taso translation of the dictionary? I really don't like that the tp dictionary isn't complete
musi la mi pana e sona pi toki pona lon lipu nimi ale tan ni: mi wile e ni: kulupu ni li pona e sona nimi lon toki pona.
I actually added the tpt definition to my prepared word files because I was hoping this project would expand to include those definitions, aha.
@Daenyth We already kind of have a basic / detailed thing, it's called definitions vs commentary
This would sort of be a three part thing then, i suppose?
We already kind of have a basic / detailed thing, it's called definitions vs commentary
I don't agree with that. For me the commentary section makes sense to talk about pragmatics or nuanced information like with pali or olin.
I guess you could call it a three part thing, perhaps. I'm not sure how useful that is - I'd also be happy enough to stick with the current "list of words"+commentary two-part system.
Anyway no matter what I do think that we should not heavily present part of speech stuff in the front
if we genuinely do want to split the definitions into "basic" and "advanced" essentially, we either:
and both vastly increase the investment a translator must put forth.
That's not necessarily a bad thing either way, but we do have to decide where to draw the line.
wait so how does this idea sound: one definition lists words in the same way they're listed now, just with redesigned standardized policy about how we do that. The other definition serves as a translation of the toki pona definition, explaining the word in english with 1-3 sentences.
does this sound manageable? does this sound utilizable?
@lipamanka how would the latter definition compare to your essays on https://lipamanka.gay/essays/dictionary ? Is it the same thing? Smaller? Different?
This is important context, considering you never got through all pu words
I like the sound of it but I want to see an example of what that looks like
@AcipenserSturio yeah why not! probably smaller, like ideally it'd be able to fit in linku. not sure how exactly, like we could have a new command like /nimi+ or something idk or /sonanimi idk that only displays that information. This would be AWESOME because
@Daenyth example:
ilo
tool, implement, machine, device
The semantic space of ilo contains things that are used towards a goal. It's easy to say that everything can be used. Likewise if something is being used or can commonly be used, it is easy to call it an ilo. If I am using a hammer to hammer a nail into the wall, that is an ilo. If i am using a psychological method to calm myself down when i'm stressed, that can be an ilo as well. Without much context, ilo can refer to things that are commonly used as tools. With the context of it being used for something, though, anything can be an ilo.
this is just copied and pasted from my semantic spaces. ideally ours would be shorter, wouldn't start with "the semantic space of ilo contains," and I'd like more detail in ilo's shorter description (not too much more though)
mi kama pilin e ijo. sona nimi lili la, We should start with the pu definitions and change them to solve problems instead of synthesizing something new. tan ni: like it or not, these are the reference / anchor point for community usage and other dictionaries. nasin nimi kulupu en lipu ante pi sona nimi li kama pu ala, taso ona li tan lipu pu. So starting there gets us immediately closer to the goal with less debate.
lipu suli poka la - mi kulupu o pali e ni lon tenpo sama ala. After we have definitions we can write exegeses.
I agree. While I did bring up an alternate take for discussion (and I appreciated the thoughts that came out of it), most of the ideas given here seem worse (/opinion) than just taking pu and iterating on it
I get what you're saying but that would be copying another dictionary instead of doing the descriptive efforts ourselves. plus then we run into the problem of some of our definitions being just pu and others not being that. which is one of the things I think we're trying to fix. I think there should be a command available in linku that gives the pu definition verbatim but the linku definitions should be our descriptive work.
Hm ok i think i'm starting to get why this project is feeling weird to me: there are too many choices for which words to use and in what order, and i have no idea how to make them. Like there are like a thousand words that mean toki in context. Which ones are important and why? What do we actually want. we should write that down... brief and accurate ?
@tbodt ku data (which we have not put in .md files!) might offer some insights into which words are the most relevant:
If you just look at the top definitions, it offers words that (I think) fit better than pu definitions:
ADJECTIVE fire; cooking element, chemical reaction, heat source
fire⁵, heat⁵, hot⁵, burning⁵, flame⁵
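If the ku data does get added to the .md files, pulling out the top responses could look something like the sketch below. It assumes the trailing superscript digit on each word is a frequency score where higher means more common; the function names and the threshold are made up for illustration.

```python
SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
TO_ASCII = str.maketrans(SUPERSCRIPT_DIGITS, "0123456789")


def parse_scored(text):
    """Split 'fire⁵, heat⁵' into [('fire', 5), ('heat', 5)]; a missing score becomes 0."""
    pairs = []
    for item in text.split(","):
        item = item.strip()
        word = item.rstrip(SUPERSCRIPT_DIGITS)            # drop the trailing score
        score = item[len(word):].translate(TO_ASCII)      # convert it to plain digits
        pairs.append((word, int(score) if score else 0))
    return pairs


def top_words(text, minimum=5):
    """Keep only words whose score is at least `minimum` (threshold is an assumption)."""
    return [word for word, score in parse_scored(text) if score >= minimum]


print(top_words("fire⁵, heat⁵, hot⁵, burning⁵, flame⁵"))
```

Every word in this particular sample carries the same score, so the filter keeps all five; it only becomes useful once lower-scored responses are included in the input.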
brief and accurate is a good design goal, but something I would like to see here is descriptive and current. because this is a lexicography project, we need to make new definitions based off of the current common use as best we can. the pu definitions are useful and relevant but largely not descriptive of how toki pona is used in 2023 because it was published nine years ago.
We can use pu as a base if that would make the process easier but I'd like to aim for every definition to be something new. No copying from any other source.
ku data (which we have not put in .md files!) might offer some insights into which words are the most relevant:
If you just look at the top definitions, it offers words that (I think) fit better than pu definitions:
pu
ADJECTIVE fire; cooking element, chemical reaction, heat source
ku data, top responses:
fire⁵, heat⁵, hot⁵, burning⁵, flame⁵
Can someone get around to adding this to the .md files? I'd like to but I'm basically on vacation and it's hard to get to use my PC
hm. I think using ku data to inform this project is flawed because it relies so heavily on ma pona usage. (this isn't a "no! we must not use it!" but just a thing to keep in mind. as lexicographers it's our responsibility to take into account which biases we're subject to)
@lipamanka we also, iirc, don't have anyone on the committee whose tp experience doesn't mostly come from spending time on ma pona?
i don't really see the value in making definitions completely new
since other sources like nimi.li pull from Linku definitions, and all users are used to Linku being mostly pu, imo it's good for the definitions to be mostly old
i think a from-scratch descriptive dictionary would be cool, and i'd be happy to contribute to that. but i think it's easier for that kind of project to be the work of one person with peer review
my tp experience is mostly toki uta, lon sijelo and in VR
Can someone get around to adding this to the .md files? I'd like to but I'm basically on vacation and it's hard to get to use my PC
mi ni
my tp experience is mostly toki uta, lon sijelo and in VR
kin! i have old tp experience from ma pona but most everything in the past ~8mo is from not ma pona
but i think it's easier for that kind of project to be the work of one person with peer review
this statement is interesting
it makes me wonder if this project is equally open to such a process
if the goal is "slightly update all of the nimi pu definitions so that they better reflect usage but are still familiar", why does that require a large, many-person process as opposed to an individual with peer review
I spend a lot of time with IRL toki pona which is pretty cool imo, and I'm glad people in this project are some of the people who tp IRL the most. if we want to just tweak pu that's cool but out of principle I'd like to change at least one thing about each one.
so I think that we need to illustrate a few things with the toki definition:
so whatever examples we give need to be
noun examples: language, announcement
verb examples: discuss, proselytize, say, talk
adjective examples: we might not need these because most of the ones I can think of in english are adjectivized nouns or verbs, like "talking" or "announcing", so it would end up being redundant
examples that imply an external recipient: tell, vent, reprimand
examples that imply no external recipient: think, ponder
Let me know if you have problems with this as a definition. I'm not incredibly happy with it but I think it's a good starting place to make a new definition. communicate, speak, discuss, tell, inform, think; language, announcement