Potential confusion due to version 5 change: renamed "whitespace" option in SPLIT to "word"

ToonTalk commented 5 years ago

I bet most people would not expect split by word to split

"Red, white, and blue."

as

["Red,", "white,", "and", "blue."]

but instead as

["Red", "white", "and", "blue"]

i.e., as "words".
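For readers who want to see the difference outside Snap!, here is a rough Python analogue. It assumes, as the rest of this thread suggests, that the "word" option behaves like splitting on runs of whitespace; `split_by_whitespace` and `split_into_bare_words` are hypothetical names, not Snap! blocks.

```python
import re

def split_by_whitespace(text):
    """Split on runs of whitespace; punctuation stays attached to its word."""
    return text.split()

def split_into_bare_words(text):
    """Keep only runs of letters, digits, underscores, and apostrophes."""
    return re.findall(r"[\w']+", text)

phrase = "Red, white, and blue."
print(split_by_whitespace(phrase))    # ['Red,', 'white,', 'and', 'blue.']
print(split_into_bare_words(phrase))  # ['Red', 'white', 'and', 'blue']
```

The second function is only one possible reading of "word"; it keeps apostrophes so that "don't" survives, but it silently drops every other punctuation mark.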

jmoenig commented 5 years ago

this one's for @brianharvey ...

brianharvey commented 5 years ago

Geez, Ken, you're a Logo person! What's FIRST [RED, WHITE, AND BLUE]?

This behavior is the result of a compromise, so of course it doesn't make anyone happy. I wanted SENTENCE→LIST promoted from Tools to primitive, and Jens, who sometimes has fits of wanting to minimize the number of primitives, pointed out that SPLIT...WHITESPACE did the same thing. So I said it doesn't serve the purpose of emphasizing the language of words and sentences, and so Jens invented the compromise of SPLIT...WORD. And, yes, that's an improvement, I guess.

Honestly, the worst thing about a blocks language isn't enforcing the command/reporter distinction; it's that you can't just type in [red, white, and blue] and have it automatically be a data structure. You need the extra step of sentence→list. That really hurts the vision of text as a collection of words rather than a collection of characters. (And you need the extra step of list→sentence or of (combine (join words)) in the other direction. I want join words promoted to primitive, too.)

Really, in my heart of hearts, I want the conversion to be automatic. I want an input box that contains text that includes spaces to mean a list, and I want the print form of a list of words to be the join words of the list. But it's way too late for that; I should have been around at the very beginning of the design of Scratch to insist on it. Remind me to yell at Brian Silverman for not having done that! He's a Logo person, too.

We should really have a variadic SENTENCE whose inputs can be words, lists of words, and strings of words with whitespace, and whose output is a list of words. Only it's a specially flagged list whose printform is a text string with spaces. Can I have that, @jmoenig? And then I can flush all the word/sentence library blocks with one form for words and one for sentences, and just have FIRST, LAST, BUTFIRST, and BUTLAST, like a real programming language. ;-)
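A minimal Python sketch of that idea, purely for illustration: `Sentence` and `sentence` are hypothetical names, and nothing here reflects how Snap! would actually implement a flagged list. The class is an ordinary list whose print form is its words joined by spaces, and the constructor accepts words, lists of words, and whitespace-separated strings.

```python
class Sentence(list):
    """A list of words whose print form is the words joined by spaces."""

    def __str__(self):
        return " ".join(str(word) for word in self)


def sentence(*inputs):
    """Variadic constructor: accepts words, lists of words, and
    whitespace-separated strings; returns a flat Sentence of words."""
    words = Sentence()
    for item in inputs:
        if isinstance(item, str):
            words.extend(item.split())      # "fox jumps" becomes two words
        elif isinstance(item, (list, tuple)):
            words.extend(sentence(*item))   # flatten nested lists of words
        else:
            words.append(item)              # numbers etc. count as single words
    return words


s = sentence("the", ["quick", "brown"], "fox jumps")
print(s)        # the quick brown fox jumps
print(len(s))   # 5
print(s[0])     # the
```

Because `Sentence` is still a list, FIRST and BUTFIRST would correspond to `s[0]` and `s[1:]`; only the print form differs from a plain list.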

If I dared, I'd ask for a red FIRST block that's another name for item 1 of, and move the latter down with the imperative array commands.

brianharvey commented 5 years ago

P.S. When I started that comment I was just ranting, or maybe hoping finally to convince Jens about sentence→list, but now that I've invented the idea of specially marked lists, I really like it! And at least SENTENCE as a primitive wouldn't be duplicating a Snap! one-liner; it'd be the only constructor for sentences. And I'd even like a flag setting "Logo mode" in which text strings with spaces automatically turn into sentence-lists. Pleeeeeeze?

jmoenig commented 5 years ago

Okay, now I regret renaming "whitespace" to "word".

Personally I'm not convinced that the "LOGO mode" of regarding any text with blanks in it as a "sentence", i.e. a list of words, is a good abstraction at all. I can see some benefit when that's the primary, or only thing you can do with the language aside from arithmetic. Otherwise it sounds like a big kludge to me, and I think Ken's example here is part of why I don't like it. Clearly punctuation is not part of a word. Also, some words are meant to be together even though there's a space between them and they could also stand by themselves in other contexts, such as "sea monster".

brianharvey commented 5 years ago

The reason spaces are treated as delimiters but commas etc. aren't is that punctuation carries meaning, so you don't want it to just quietly disappear. I suppose you could make a case for {red, ",", white, ",", and, blue}, but there are word operations to deal with punctuation. And often enough the punctuation is part of the word, if, say, you're writing a program to justify a column of text. Or if the punctuation is an apostrophe. Also, if somebody puts three periods in a row, that means something, whereas three spaces in a row don't.

But arguing about punctuation misses the big point, which is that words and sentences are a rich area for kids to explore via programming. Take the example of the random sentence generator:

to noun
output pick [boy girl computer turtle [sea monster] aardvark]
end

to verb
output pick [likes hates eats [jumps over] [plays with]]
end

to sengen
output (sentence "the noun verb "the noun)
end

Note in passing the fact that looseness about the word and sentence data types allows your sea monster. This is really a pretty carefully designed microworld. Also note how the example finesses punctuation altogether. You can take this really far, with adjectives and adverbs and prepositional phrases and subordinate clauses, still ignoring punctuation, because that part's boring and finicky. Or you can write Eliza (csls v2) or Student (csls v3), if you're an older kid.
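For comparison, the Logo generator above can be sketched in Python. This is an illustrative port under stated assumptions, not part of any Snap! or Logo library: the `sentence` helper here is a hypothetical stand-in for Logo's SENTENCE and only flattens nested lists of words.

```python
import random

def sentence(*parts):
    """Flatten words and (nested) lists of words into one flat list."""
    words = []
    for part in parts:
        if isinstance(part, list):
            words.extend(sentence(*part))
        else:
            words.append(part)
    return words

def noun():
    return random.choice(["boy", "girl", "computer", "turtle",
                          ["sea", "monster"], "aardvark"])

def verb():
    return random.choice(["likes", "hates", "eats",
                          ["jumps", "over"], ["plays", "with"]])

def sengen():
    return sentence("the", noun(), verb(), "the", noun())

print(" ".join(sengen()))  # e.g. the turtle jumps over the sea monster
```

Note the extra `" ".join(...)` needed to get text back out, and the quoting and bracketing needed to write the word lists; that bulk is the notational cost discussed in this thread.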

What makes this microworld feel like a kludge, I think, is that a text language deals with it more easily than a language in which lists are called out with a special, bulky notation. In Logo words and sentences just flow naturally.

So, may I invent sentences as a data type that's a list internally but whose printform is text? They would be generated only by the (new) SENTENCE primitive, so if you refrain from using that, you don't have to see them.

Then you can put "word" back to "whitespace" and we'll all be happy.

ToonTalk commented 5 years ago

I frequently use 'list' like this

What is annoying is that when I need to edit it to insert or delete an element, I have to copy and paste values to generate the new list.

Split by whitespace is the low-level description and split by word is the high-level intent.

I'm thinking about your SENTENCE proposal.

brianharvey commented 5 years ago

Yeah, on my blue-sky list is to invent a good UI for "add a slot right here" in a variadic input. Kind of like the plus signs in the block editor, but invisible when not in use, and not popping up all the time to annoy people. But there are lots of things along those lines that would be nice, e.g., when you insert a new input in the block editor, all the invocations of the block have some input values moved over so the empty space is in the right place.

jmoenig commented 5 years ago

RE:

the big point, which is that words and sentences are a rich area for kids to explore via programming

Well, see, that's all nice and well for the English language, where grammar is expressed in prepositions rather than being reflected in cases. If you try even the simplest of your sentence generator examples in, say, German, everything fails rather miserably right away. Come to think of it, I'm pretty sure that this might be the very reason why LOGO was received with greater skepticism in Germany than in English-speaking countries. Basically, turtle geometry was the polyglot thing.

brianharvey commented 5 years ago

Hmm. I know Logo was widely used in Spanish-speaking countries, whose language is also inflected, but I'm not sure whether that was all turtles. Since we're working on this article now, I'll try to find out.

bromagosa commented 5 years ago

We did only turtles in my school when I was a kid.

ToonTalk commented 5 years ago

I learned Lisp in 1973 and Logo in 1974 and I recall not liking SENTENCE. It didn't have the clarity or precise control that LIST and APPEND gave me. I saw how it simplified many examples, but it felt like the DWIM (Do What I Mean) that we MacLisp people disliked about InterLisp.

ToonTalk commented 5 years ago

Brian's example of NOUN and VERB fails in English with plurals. And that can be a good learning experience. And the program can be changed to have SINGULAR-NOUN and PLURAL-NOUN, etc. In German or Spanish one can introduce DEFINITE-SINGULAR-NOUN and in some languages add MASCULINE- AND FEMININE-. And maybe some kids will implement morphology so these words are generated from basic components.
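A hedged sketch of that refinement in Python, limited to English singular/plural agreement. The word lists and function names are invented for illustration; a German or Spanish version would need more categories, which is exactly the point under discussion.

```python
import random

# Hypothetical word lists; each entry pairs a singular form with its plural.
NOUNS = [("boy", "boys"), ("turtle", "turtles"), ("sea monster", "sea monsters")]
VERBS = [("likes", "like"), ("eats", "eat"), ("plays with", "play with")]

def singular_noun():
    return random.choice(NOUNS)[0]

def plural_noun():
    return random.choice(NOUNS)[1]

def noun_phrase(plural):
    return "the " + (plural_noun() if plural else singular_noun())

def verb(plural):
    singular_form, plural_form = random.choice(VERBS)
    return plural_form if plural else singular_form

def sengen():
    subject_is_plural = random.choice([True, False])
    # The verb agrees with the subject; the object may be either number.
    return " ".join([noun_phrase(subject_is_plural),
                     verb(subject_is_plural),
                     noun_phrase(random.choice([True, False]))])

print(sengen())  # e.g. the turtles eat the sea monster
```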

Many kids (and adults) don't know what the purpose of categorising words into different parts of speech is (other than getting grades on tests) -- building sentence generators can be a good way to see its value.

brianharvey commented 5 years ago

Paul Goldenberg wrote a whole book about this stuff, mostly using English but referring also to other languages. As for precise control, that's a blessing and a curse. It lets you make improper lists (with something other than the empty list in the last cdr). Students' typical first attempt at reversing a list generates things like (((((e) . d) . c) . b) . a). You can't make that mistake with SENTENCE. Anyway, it's perfectly well-defined; it's just (flatten (apply append args)).

At a certain age and level of sophistication, the ability to make complex data structures is the right thing to teach. But for younger or less advanced users, SENTENCE lets people make lists of words painlessly. Sentences are sort of similar to Scratch lists, which can't have sublists.

One time I was visiting Cindy's middle school class when they were doing a simplified SENGEN. It turned out that their Spanish homework that week was conjugating verbs, and they just dived in to using the computer to do it for them!

jmoenig commented 5 years ago

Categorization is great, I agree. But aside from masculine, feminine and neuter (!), and in addition to singular and plural, nouns in German also have different cases for subject (nominative, genitive) and object (accusative, sometimes even ablative, dative) use, as do verbs (1st person singular is wildly different from 2nd person singular in German, not just 3rd person as in English, and plurals also have different cases). Long story short: sentence generators in languages like German have to be custom-tailored to such specific, narrow uses that the point of clarifying grammar abstractions is lost in the nitty-gritty details of what you must know before you can do it.

jmoenig commented 5 years ago

... which is why the really cool examples in German schools are also in (and about) English: https://twitter.com/aksi12uhr/status/1075428984052219906

ToonTalk commented 5 years ago

Regarding

it's just (flatten (apply append args))

Each of the args can be a list or a symbol, so how can append be applied? And then why flatten the list?

brianharvey commented 5 years ago

@Ken: Oops yeah, (flatten (apply append (map (lambda (x) (if (pair? x) x (list x))) args))). As for why flatten, because you're trying to make a sentence! :-P
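The corrected Scheme expression can be read step by step in a Python sketch. Python lists stand in for Scheme pairs here, which is an approximation (an empty Python list is still a list, whereas the empty list is not a pair in Scheme), and `flatten` is written out because Python has no built-in equivalent.

```python
from itertools import chain

def flatten(items):
    """Recursively flatten nested lists into one flat list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

def sentence(*args):
    # (map (lambda (x) (if (pair? x) x (list x))) args): wrap bare words in lists
    as_lists = [x if isinstance(x, list) else [x] for x in args]
    # (apply append ...) concatenates the lists; flatten removes any nesting left
    return flatten(list(chain.from_iterable(as_lists)))

print(sentence("red", ["white", "and"], [["blue"]]))  # ['red', 'white', 'and', 'blue']
```

The wrapping step answers the question about applying append to a symbol: every input is a list by the time the lists are concatenated.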

@Jens: The point is, German kids (kids everywhere, in their native language) already know all those complicated rules. They speak in grammatical sentences before school age. They just don't know they know them, but if they set out to write a sentence generator, they'll discover that their program doesn't speak grammatical German, and they'll debug it, and in the process learn to articulate what they already know implicitly. I agree that it'll take them longer to make it work in German.

That example from Twitter must be from an object oriented programmer! :)

ToonTalk commented 3 years ago

I just reread this discussion and am now wondering about Japanese and Chinese where there are no spaces separating words.

But I'm ok with closing this.

jmoenig / Snap

Potential confusion due to version 5 change: renamed "whitespace" option in SPLIT to "word" #2435