KSP-KOS / KOS

Fully programmable autopilot mod for KSP. Originally By Nivekk
Other
697 stars 230 forks source link

Add An Associative Array! AAAAAAAA #963

Closed erendrake closed 8 years ago

erendrake commented 9 years ago

I have noticed that without user structures people are having a fun time passing lists around and having to worry about index.

So i have decided to make a new associative array structure it should have the following features by the time i am done.

because this gets to live in kOS.Safe there will be unit tests!

EDIT: i have been working on this for days and i just realized there was no issue :P EDIT2: WIP can be found here https://github.com/erendrake/KOS/tree/assoc_array. WARNING i will be rewriting the history on this branch so dont get too attached to it :)

abenkovskii commented 9 years ago

String keys are case insensitive

problem:

set s1 to "abc".
set s2 to "ABC".
print s1 = s2. // false
//but my_arr[s1] and my_arr[s2] are the same index

It might cause cold fusion. edit: confusion.

erendrake commented 9 years ago

@abenkovskii when we make string a user object, i hope to change the string equality behavior. my hope being that everything in kOS is case insensitive.

Edit: but i do agree that is will seem odd.

Dunbaratu commented 9 years ago

It is incorrect design to make it impossible for the user to pick case sensitivity vs insensitivity. If you build insensitivity in at too low a level, you make it impossible for the programmer to choose. The reason most languages don't do what you're talking about is that case insensitivity is a loss of information, which you don't want to enforce at too low of a level of the system.

Or to put it more clearly what the problem is: You CAN implement case-insensitivity on top of case sensitivity. You CANNOT implement case sensitivity on top of case-insensitivity.

Whether the kerboscript language is case sensitive is an entirely separate question from whether or not the values of its variables are.

Just because identifier foo is the same as identifier FOO does not mean that you have to (nor is it even desirable to) make it so that string content "foo" is equal to string content "FOO" permanently hardcoded into the low level guts of the system that a user cannot override.

Dunbaratu commented 9 years ago

The solution is to provide users with the option to do case insensitive vs sensitive checking, not to make it impossible to escape from it.

On another note, we never formalized what the meaning of kOS's case insensitivity is. There are places were we use CurrentCulture and other places where we don't. It would be a shame if a script that works when you run it in one country stops working when someone runs the same script in a different country.

erendrake commented 9 years ago

@Dunbaratu then we can add another case sensitive option later if we want it. it is frankly not interesting to me to have case sensitivity for equality checking.

SET FOO TO LEXICON().
FOO:ADD("bAr", 2).
print FOO["BAR"]. // 2
print FOO["bar"]. // 2
print FOO:KEYS. // [0] "bAr"
Dunbaratu commented 9 years ago

we can add another case sensitive option later

The fact that we can't is exactly the problem here.

erendrake commented 9 years ago

@Dunbaratu we do need to address the currentculture at some point. I am not sure how to form the requirements for it.

erendrake commented 9 years ago
The fact that we can't is exactly the problem here.

I dont see a problem.

Dunbaratu commented 9 years ago

You are designing the low layer on top of which everything else is implemented, and enforcing case-insensitivity in that low layer. Once the case information gets lost it cannot be re-inserted at a higher level layered on top.

abenkovskii commented 9 years ago

I agree with @Dunbaratu. For me:

print "bAr" = "BAR". // true

is nonsense. P.S. I think it's because in all other languages it's false.

erendrake commented 9 years ago

I am not loosing it. if you look at my example above the printout still has the original case. if we want one later on to replace the hash rules thats fine.

Dunbaratu commented 9 years ago

To make it clear, what raised my complaint is that you didn't want to merely make an option to make a case-insensitive key index for associative arrays, but then went on to say that you wanted to make ALL string comparisons, everywhere, insensitive.

Remember that user input and string manipulation is coming later. Doing it at the low level of ALL string comparisons makes it impossible to let a script differentiate 'a' from 'shift a' on the input stream.

abenkovskii commented 9 years ago

@erendrake I still don't understand why do you think that case insensitive array is better.

abenkovskii commented 9 years ago

"Explicit is better than implicit" (Zen of Python) I'm for the explicit approach: print "My sTrInG":TOLOWER().

erendrake commented 9 years ago

@abenkovskii because i dont there there is a use case for it

SET FOO TO LEXICON().
FOO:ADD("bAr", 2).
FOO:ADD("BaR", 5).
print FOO["BAR"]. // Error
print FOO["bar"]. // Error

I think that writing case sensitive code in this way is less readable.

Dunbaratu commented 9 years ago

Also, I find it poor to say "this collection is keyed by any object you feel like. Oh, but if that object is of type string, then it activates this special case behavior just for strings." That sort of decision belongs outside the primitive, at a higher level.

erendrake commented 9 years ago

@Dunbaratu i dont agree. if anything strings are the oddball by having this special quality that nothing else does.

Dunbaratu commented 9 years ago

Why can't you just do this?

   set foo to mapper().
   set foo:sensitive to false. // or to true depending on what you want

And thereby make it a property of the associative array itself how it treats its keys, RATHER than making it a low level feature of the language that it is incapable of ever differentiating "A" from "a", anywhere in the whole script.

Dunbaratu commented 9 years ago

Strings are not an oddball. You are making them BECOME an oddball with what you're proposing. Default behavior for any object used as a key is precise exact equality for ALL types objects, strings included, UNTIL you insert case-insensitivity that mashes different values together into being the same. The proper design for doing that is to layer that mashing on TOP of the low level system, not embed it inside the low level system where it cannot be escaped from.

erendrake commented 9 years ago

@Dunbaratu because i dont think there is a good reason to ever set it to true. I dont think case is a very useful tool. It is my opinion that sharing and maintaining code that depends on case sensitivity will be harder, so why have the option.

Also your assertion that it would make it harder to detect custom key presses is incorrect. We will already need the concept of a modifier key for alt and ctrl. Shift would just be another modifier.

Dunbaratu commented 9 years ago

Also your assertion that it would make it harder to detect custom key presses is incorrect. We will already need the concept of a modifier key for alt and ctrl. Shift would just be another modifier.

Strange, I thought you said you liked the telnet interface, but apparently you want to cripple it by making it only possible to read the difference between 'a' and 'A' when in the gui terminal.

abenkovskii commented 9 years ago

@erendrake How many case insensitive languages you know? Are there many languages with case insensitive strings? Edit: I guess not. Than my question is: why?

Dunbaratu commented 9 years ago

There are case-insensitive languages where the keywords are insensitive. NONE of them embed presumptions of insensitivity into their low level string comparisons in a way that cannot be escaped from.

And for very good reason.

Make it an option for associative arrays, heck even make it the default option when you don't set it - but don't make it the only way any string in the entire language will work.

erendrake commented 9 years ago

@abenkovskii i know kerboscript already has a lot of case insensitivity and it is the language we are designing ( right now i guess :) )

@Dunbaratu why would it be impossible for the telnet server to turn A into shift + a when using it for keybindings.

@Dunbaratu And for very good reason. what is it? what is the benefit of having it to a kerbal space program scripter?

erendrake commented 9 years ago

@abenkovskii i am also not talking about smashing user strings. preserving case while ignoring it in the context of equality checking is what i am talking about.

Dunbaratu commented 9 years ago

@erendrake, Your last two questions are the same. You cannot communicate non-chars via telnet. "shift" isn't a character until applied to another key. You can't read that the shift is down, all by itself, over telnet. You can only read that the letter sent was 'A' instead of 'a'.

For a use case: "Here's my keymapper object. It's an associative array that maps chars to activities that should happen when they're pressed. It does a different thing depending on what character it reads. If it reads 'A' it does foo, if it reads 'a' it does bar, and so on... oh crud kerboscript refuses to let me have them as two different entries in my associative array. Okay fine I'll have to make up my own primitive then from scratch - oh crud I can't even differentiate 'a' and 'A' in a boolean check of any kind, well crap. I can't do it then."

As to your claim that you're not talking about smashing user strings, yes you are. Read your own comment here: https://github.com/KSP-KOS/KOS/issues/963#issuecomment-99229581

erendrake commented 9 years ago

@Dunbaratu in that comment i am talking about equality for user strings.

SET FOO TO "fiZz".
PRINT FOO. // fiZz

is fine by me.

erendrake commented 9 years ago

@Dunbaratu when we receive an A over telnet. what would stop us from mapping it to a user mapping for shift + a? I dont see where we would be blocked in c# land from making that distinction for the user.

SET KEYBINDING:AG1 TO 'shift + a'.

could surely be mapped, are you saying that we will not be able to use other modifiers in bindings? thats too bad.

Dunbaratu commented 9 years ago

You are talking about making it utterly impossible for any script to detect the case of a letter. That effectively mashes strings. They only print the original case, you can't actually USE that original case in any meaningful way in the code. The user can see a difference that the code is completely blind to.

And that is purely wrong design.

Dunbaratu commented 9 years ago

I am in favor of having a case-insensitive string comparator. I can even accept having it be the default you get when you say nothing special, and having the case-sensitive one be the one you have to go out of your way to explicitly state you're using. But I can't get behind making it the ONLY way a kerboscript is allowed to be written, thus forcing your opinion on everyone else. I will not merge any code that does it this way.

erendrake commented 9 years ago

@Dunbaratu fair enough. how do you want to do this? on a collection by collection basis with a suffix? or with a config entry? I am open to either, it might be lame to specify it each time for string equality unless we add a new operator.

erendrake commented 9 years ago

@abenkovskii if we make it a config entry. do you see any reason that kslib would ever not work with one or the other?

abenkovskii commented 9 years ago

Will it be displayed in the kOS settings window if it's a config entry?

Dunbaratu commented 9 years ago

We can go with two string compare operators, perhaps '=' and '~"? (I'm a bit reluctant to use tilde because of its meaning for regex - it would be neat to support regex when we do string stuff later, and several languges use "~=" to mean regex replace-in-place.) Or maybe '=' vs '=case=', if you want insensitivity to be the default '=' and '=case=' to be the case-sensitive one?

Then just have a suffix for the assoc array object :sensitive set to true/false, and default it to false if you like.

On another note- is it really called 'Lexicon"? that seems sort of... string-centric. But it's not a big deal - just a strange word for it.

Dunbaratu commented 9 years ago

I'd rather it not be a config operator. Because the SAME script can encounter two different situations, one of which it's appropriate to be sensitive and the other it's not.

erendrake commented 9 years ago

@abenkovskii i assume so. that window is getting pretty cramped and we might end up pruning some of them here soon.

erendrake commented 9 years ago

@Dunbaratu yeah, i was thinking that a config would be lousy too. i am all on board for a suffix for the associative array and a new operator =case= is actually not too shabby

Dunbaratu commented 9 years ago

I changed my mind on the name :sensitive - it's too ambiguous what that is. how about :case. case = false means 'case does not matter'. case = true means "case matters".

erendrake commented 9 years ago

@Dunbaratu :casesensitive with a synonym of :case?

erendrake commented 9 years ago

alright, im going to change the OP.

Dunbaratu commented 9 years ago

casesensitive - sure. I just am a bit mindful of the verbosity of our suffixes - with case-insensitive identifiers it can be a bit annoying to run together several words like that.

erendrake commented 9 years ago

@Dunbaratu I agree, i wouldnt want casesensitive to be the only choice because it is verbose

Dunbaratu commented 9 years ago

I'm in a position of not seeing your code yet. Have you already made it so that "aaa" = "AAA" returns true, or does that 'rounding' only happen inside the keys?

abenkovskii commented 9 years ago

@Dunbaratu problem:

set case to 42.
print "ABC"=case="abc".

I propose: == or ===.

erendrake commented 9 years ago

@Dunbaratu right now i havent touched other strings. only the keys of this collection that i have so far been calling lexicon. But the name is up for grabs too if people dont like it.

Dunbaratu commented 9 years ago

Looking at tinypg, I think =case= will probably work as an operator. (I was worried it might parse as =, case(identifier name), = - as 3 tokens, but it looks like if it's faced with the chioce vs '=' and '=case=' at the same point, it will pick '=case=', provided you ran it together as all one word with no spaces.

Dunbaratu commented 9 years ago

The reason I asked about the = operator being in there yet or not is I was wondering about the feasibility of making it a separate PR to be done after this one - that looks feasible.

abenkovskii commented 9 years ago

I know that comparing booleans is a bad practice. But someone might do: "abc"=case=false where case is a string variable and expect it to be the same as ("abc" = case) = false.

erendrake commented 9 years ago

that is good to know, i also agree with you that using the ~ would be a bummer.

@abenkovskii === was my first thought too but i actually am pretty keen to see how some alternatives would look :)

erendrake commented 9 years ago

@abenkovskii and wouldnt print "ABC"=case="abc". be just as complicated as print "ABC"="abc". once you have tokenized it you should be good.