KSP-KOS / KOS

Fully programmable autopilot mod for KSP. Originally By Nivekk
Other
697 stars 230 forks source link

Discussion: what String functions we want to introduce into kOS. #636

Closed ZiwKerman closed 9 years ago

ZiwKerman commented 9 years ago

I've been tinkering with adding some String manipulation functions in my local build of kOS and following examples in the existing code was able to successfully implement a list very simple String functions such as

bool contains(String, SearchString)

bool StartsWith(String, SearchString)

bool EndsWith(String, SearchString)

List Split(String, SplitString).

They work like usual functions (min, round and etc), and they were rather easy to implement as they just mimick very basic string manipulation. Only Split function required some more code hunting, but I ended up using parts of code from buildlist function.

My questions are

If we settle on a certain limited set of functions, I can implement them, write the relevant docs and submit a PR.

Dunbaratu commented 9 years ago

naming convention: I think, as just a general rule, that every time our kerboscript function is just a one-liner wrapper around a built-in C# string method (which I suspect will be almost all the time), that we just name our function the same as the C# one it wraps around, and use the same parameter prototype signature as the C# one, to avoid confusing ourselves.

That means that I'd prefer your question to be answered as "make them as suffixes".

And THAT means that what we'd really want to do is make our own wrapper object around the C# string. (C# string, sadly, cannot be derived from, so we'd have to wrap an inner reference to string inside our class rather than just adding some kerboscript suffix initializers to C#'s string.)

That may seem a bit of extra work, but it also buys us a few other things - there's other places in the code where I'd have liked to be able to detect the difference between a string object on the stack being a kerboscript USER'S string in quotes versus it being a bareword identifier string. Making them separate classes makes that possible. At the moment, the kOS code does the sloppier (and dangerous) thing of assuming for example that any string starting with '$' must necessarily not be a user string and must be a kOS identifier. If a user happens to SET STR TO "$HELLO", then it will break kOS.

erendrake commented 9 years ago

I think @Dunbaratu has the right of it. making it a wrapper object very similar to what we did with the ListValue.

The biggest reason in my mind to do this is that we have a plan in motion to allow you to see what commands/properties are available within one of our structures. Once that is finished bare functions would be less discoverable.

ZiwKerman commented 9 years ago

I did not quite understand your concern on wraparound object. Consider the following example

    [Function("Split")]
    public class FunctionSplit : FunctionBase
    {
        public override void Execute(SharedObjects shared)
        {
            string split_string = GetString(shared.Cpu.PopValue());
            string search_string = GetString(shared.Cpu.PopValue());

            var list = new ListValue();

            foreach (string s in search_string.Split(split_string.ToCharArray()))
            {
                list.Add(s);
            }

            shared.Cpu.PushStack(list);
        }
    }

Do you really think it is worth the effort of introducing something like kOSString object for arguments like split_string?

Because you can easily adjust the GetString method from FunctionBase to handle necessary checks.

Dunbaratu commented 9 years ago

In general operating via methods of a class (which is what the suffixes are trying to emulate) is a better design because it's more portable to other language bindings. if someone is trying to make a Javascript or a C# compiler FOR kOS then all the stuff currently implemented in kerboscript's syntax has to be re-implemented for them. Whereas anything that is an object suffix they get "for free" by just generically mapping all object suffix references with the same generic code. I.e. as long as someone writing a Javascript mapping of kOS has made some generic way to map aaa.bbb(ccc) into aaa.getsuffix("bbb")(ccc), they've got all that functionality in their language binding too,

Anyway that's my reason for preferring doing things with suffixes over doing them outside the suffix system.

ZiwKerman commented 9 years ago

After giving it more thought I agree. Implementation would not be as simple as I thought after my 1 hour Frankenstein-style assembly of code pieces together to get simple string functions, but now I agree that your approach is much safer, futureproof and portable.

TDW89 commented 9 years ago

@ZiwKerman How does your split behave? Would Split(foo.ks,.). Give the list [foo.|ks] or [foo|.ks] or [foo|.|ks] or something else?

erendrake commented 9 years ago

unless @ZiwKerman is building his own. The C# string split would give [foo|ks]

ZiwKerman commented 9 years ago

It would work the same as the built in C# Split function, e.g for

set s to "foo.bar".
for (i in Split(s,"."))
{
  print i.
}

output should be

foo
bar

Though as you can see from the discussion, syntax would rather be

for (i in s:split("."))

Moreover for

set s to "foo.bar.js".
for (i in Split(s,"bar"))
{
  print i.
}

output should be

foo.
.js
Dunbaratu commented 9 years ago

As to the other question - what operations should we support?

triage: do the most versatile, open generic functions first

Obviously it would be nice to support ALL string functions but that might be a lot of typing. As a first target, maybe we pick the root functions that could be used to derive the other ones in simple one-liner kOS expressions without a lot of work, and get them done first.

For example, StartsWith can be done by a script on the back of the indexOf method, like so:

if str:startswith(substr) {
  // stuff
}

is the same thing as:

if str:indexof(substr) = 0 {
  // stuff
}

Therefore, implementing :indexof would be of a higher priority than implementing startswith. You "buy more" capability with your effort that way. Going back in filling in the methods that are just sightly more convenient aliases for the other ones can be done in a second pass.

literal escaping for magic characters, or a CHR() function?

One thing kOS has been missing for quite a while, which may be orthogonal to this, is the ability to protect-escape magic characters so you can stick them inside strings. RIght now you really have to jump through hoops to end up with a quote character inside a kosscript string, for example. (One user found a way- use the GUI to give a name tag to a part, in which the name tag is ", and then assign a string variable to that PART:TAG.)

One way to provide that ability might be to give users a number-to-character mapping, akin to doing: (char)34 in C#. (34 is ascii code for the dquote char.)

That might be done with a CHR() function.

which brings to the table - the static string methods

There are some methods of string in C# which are static methods. It might make sense to have a static "string from a char code" method (since we don't have a char class), but then we'd need an ability to have a binding identifier that returns a 'base instance' of string, a bit like how CONFIG returns the config instance.

ZiwKerman commented 9 years ago

I've written a ton of PHP and Javascript scripts in my time so literal escaping with \ is something I'm used to and for me it feels more natural than typing (char)34 or something like it.

On the matter of startswith versus indexof I tend to disagree - KOS programs are positioned as human-readable and slightly more intuitive to non-programmers and startswith seems more intuitive and noob-friedly to me.

Dunbaratu commented 9 years ago

True, but that would make it an issue not related to this discussion.

Escaping with "\" is a parser issue, not a string function.

I figure there's no good reason not to have both.

ZiwKerman commented 9 years ago

I think in order to outline the absolute minimum set of String functions we need to understand most frequent uses for them in kOS. For me now it is finding parts,modules, fields and events by partial names or other parameters

erendrake commented 9 years ago

I am actually more worried about going through kOS and making sure that users are always getting a string more than i am worried about implementing a few extra string functions.

If someone is willing to do that :) ill implement as many string functions as can think of.

Dunbaratu commented 9 years ago

The point is that when deciding what subset of the string methods to implement, a subset that accomplishes everything but is less pretty is more useful than picking a subset that is pretty but doesn't cover all the bases. StartsWith is quite literally covering a subset of the cases that indexof covers. Similarly, any sort of boolean "does it contain this substring" is a subset of the functionality of indexof. indexof tells you BOTH whether it exists or not, AND where it exists if it does.

Remember that the context of the original question was which methods to implement - a context that contains inside it the hardcoded assumption that you can't do them all. If that assumption is false, then the question itself is moot in the first place. If we can do them all, then the question isn't even worth spending any time on.

I'm just saying that IF the original assumption, implicit in the question in the first place, that we can't do them all, is true, then it would make sense to start with the ones that have the greatest range of uses.

ZiwKerman commented 9 years ago

So bottom line is that first we try to implement a wrapper for Strings, similar to one we have for Lists and then introduce standard String methods through that wrapper?

Dunbaratu commented 9 years ago

I think @erendrake's point is that there are a multitude of places in kOS where a user variable is given a string value. Therefore we'd have to ensure that if we make our own string wrapper (let's say UserString, for now, although maybe it will get a different name when actually implemented) that the kOS code NEVER hands a bare C# string back to the user anywhere. It would have to always be putting a UserString into the user's variables instead. And that means scanning through a lot of code by eye and making sure we catch all the cases where strings are given to the user script.

We may be able to make that job easier by forcing it at the stack level, similarly to how NaN and Infinity are caught. If you can't push it onto the stack, then it cannot ever get into a user variable.

erendrake commented 9 years ago

yes, i assume that this discussion was posted because @ZiwKerman thought that the most risk would be in the creation of the structure. I am just guessing that string is getting returned from a bunch of functions and structures and other places that will make it a pain in the butt.

To actually answer the original question. I would guess that the most important would be indexof, substring and split? i dont think there are any in the standard canon i would disagree with.

http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28string_functions%29

Dunbaratu commented 9 years ago

There's a back-of-the-mind major huge refactor idea I've been having for a while where I wanted to make it so that ALL user data on the stack or in a variable MUST be some sort of wrapper class that kOS is in control of, and never be an underlying C# class or primitive. For example, never an integer, but a KOSInteger instead, never a double, but a KOSDouble instead, and so on. The point of this is that it would open up the ability for us to have suffixes on numbers and the like, and so people could always have some base suffixes that all objects can use, like :TYPE, or :HELP.

This is making me think that making all strings be KOSStrings can be viewed as a first step toward that goal.

erendrake commented 9 years ago

@ZiwKerman and i were talking yesterday over on the forums and it all started with "can i get a handle on the executing part" and it only took a few steps to get from that to "man in the middle every primitive type". It underlines how interconnected and complicated this mod can be. Thanks everyone for the discussion :)

ZiwKerman commented 9 years ago

@erendrake Well, I still need that handle 8) Now it is not that critical since I've fixed boot.ks selection, but still, as I said - a bit of self-awareness for a script about a CPU it is running on would not hurt.

erendrake commented 9 years ago

@ZiwKerman i understand and completely agree that we should have it, i was just commenting on the chain of reqirements that lead us here.

A lot of the slopes are slippery in kOS :)

ZiwKerman commented 9 years ago

@erendrake Well, I agree that CORE: and messaging/CPU-interconnection has very little use without ability to parse messages, so getting basic string manipulation functions is almost a prerequisite. BTW is there somewhere a list of planned features or a roadmap for mod's development?

erendrake commented 9 years ago

Not really, it has been @Dunbaratu and I with a few one PR contributors for quite some time. Because of that we just work on the item that is most interesting to us at the moment.

@ZiwKerman we have a slack chat for some planning. send me your email address and ill invite you.

Dunbaratu commented 9 years ago

I was composing a reply but I'll stop. The discussion has deviated from strings to generic talk about EVERYTHING the mod does. Lets take that elsewhere so this issue can be left clean to talk just about strings.

erendrake commented 9 years ago

yeah, sorry that was my bad :)

ZiwKerman commented 9 years ago

Well, it is a discussion, which in my opinion has come to a conclusion, so I believe this issue can be closed.

@erendrake - see profile, I've changed settings for email to be displayed.

TDW89 commented 9 years ago

I know this has been closed a while but it seamed like the best place for this. If you are planning on adding string comparison can we have some thing that would work with if "a" < "b" {print True.}.?

erendrake commented 9 years ago

TDW89. please make a new issue for this. Unless you want this request to be a secret :-)

On Tue, Apr 14, 2015, 01:07 TDW89 notifications@github.com wrote:

I know this has been closed a while but it seamed like the best place for this. If you are planning on adding string comparison can we have some thing that would work with if "a" < "b" {print True.}.?

— Reply to this email directly or view it on GitHub https://github.com/KSP-KOS/KOS/issues/636#issuecomment-92662796.