Getting rid of all the SharedObject constructor parameters - one idea how.

Dunbaratu commented 9 years ago

The problem:

The need to pass SharedObject all over the place everywhere feels particularly annoying at times because it basically has to pass via a chain of function calls when object A constructs object B constructs object C etc.

We've known for a while it's annoying, and poor design to have to do this because really all SharedObject is actually doing is giving us a handle to the state of the pretend CPU which is actually running the user's code. These fields: Vessel, TermWindow, BindingManager, ProcessorManager, Part... they're all just ways to store "this is the current program's virtual execution environment".

So... can't we obtain that exact same information via reflection?

The presumption this is based on:

Any time ANY of this code is being run, be it a method suffix, or an Opcode's Execute, or whatever, if you trace the C# call stack backward enough, it should be possible to eventually track it back to some sort of method of the kOS.Execution.CPU instance that is running the user's code (most likely the Update() call that is going through a chunk of instructions of size CONFIG:IPU, but it might be other parts of CPU if it's doing a housekeeping thing like flybywire.)

Any time you're in a position to be using the SharedObject, that has to mean you're in a scenario where the code currently running must necessarily have come directly or indirectly from a method of the virtual CPU object on which the user's code is conceptually running. Otherwise SharedObjects.Part can't really refer to anything.

The refactor

Given all the above, one way to remove the need to pass SharedObjects around everywhere becomes apparent. We make a utility routine, say GetCPUFromExecutionStack() that does the walk of the C# stack trace to find which CPU object is running the code that called GetCPUFromExecutionStack(). The stack trace is a list of method call information, and the method call information allows you to query to find which object instance that method's this is referring to. So we can walk backward through the stack list until finding the first method who's 'this' is of type CPU, to find the CPU that is running this code.

Once you know the CPU that is running this code right now, that is the handle to tell you the shared object data you need - CPU has the SharedObjects instance in it.

Now, this walk would involve a sequential search of the call stack, but the call stack is a fairly short list of only tens of items at most (unless you're doing a very recursive algorithm). Even so, to keep things efficient we'd really only need to perform this walk during the constructor of an object. Once it had gotten the shared object, it could cache a reference to it locally like all the objects already do now.

This would just get rid of the need to PASS it as a parameter through all the constructors, and therefore alleviate the need to store a handle to SharedObjects when the ONLY purpose of that handle is JUST to allow SharedObjects to be passed in turn to other constructors. We have plenty of examples of cases where one of our classes requires SharedObjects in its constructor NOT because it ever has to use it itself for its own needs, but merely because it has to be able to pass it on to other class's constructors.

After we create constructors for all these objects which have removed the SharedObject argument, then whether or not there is still a need to sometimes also preserve the older variants of constructors that accepted SharedObject parameters explicitly will depend on whether we can find any cases where one of these objects is getting constructed from outside of CPU.Update(). It may be that the prudent thing is to instead of eliminating the SharedObject arguments in the constructors entirely, to just make the constructors' SharedObject arguments into optional arguments that default to null - then if the constructor sees that it's null, it can call the GetCPUFromExecutionStack() utility method to generate it.

abenkovskii commented 9 years ago

From Google Python Style Guide

Power Features

Definition: Python is an extremely flexible language and gives you many fancy features such as metaclasses, access to bytecode, on-the-fly compilation, dynamic inheritance, object reparenting, import hacks, reflection, modification of system internals, etc.

Pros: These are powerful language features. They can make your code more compact.

Cons: It's very tempting to use these "cool" features when they're not absolutely necessary. It's harder to read, understand, and debug code that's using unusual features underneath. It doesn't seem that way at first (to the original author), but when revisiting the code, it tends to be more difficult than code that is longer but is straightforward.

Decision: Avoid these features in your code.

Dunbaratu commented 9 years ago

Anyone who knows what reflection is, and has any experience at all, is already aware why it's a good idea to only use it when no other option presents itself. It's a bit condescending to post that cut-and-paste argument. It presumes I'm a newb who's never heard it, which is false. In this case the reason walking the execution stack makes sense is that the SharedObjects instance we're passing around is nothing more than the execution environment the current code execution is happening inside of. The stack trace tells us that in a 100% reliable way that no other method would be as reliable about.

The choices in front of us are "don't fix this", or "fix it with reflection". To make the reflection fix the wrong option, you have to first propose an alternate fix, or accept that it doesn't need fixing.

The context of this very old discussion is that @erendrake wanted to find a way to avoid passing around this generic "shared" reference everywhere to all the class constructors in kOS. After thinking it through for some time, this was the only solution that presented itself to me, so I wrote this explanation as a possible way to do it. I'm aware that using reflection should be minimal. but it shouldn't be "never". If "never" was the right answer, then it wouldn't ever exist in languages in the first place. Once upon a time languages didn't have reflection. It was added for a reason. It's a mess and it's hard to follow so it shouldn't be used when there's a better solution - but in this case that better solution was not easy to find. It seems like the only two answers to "it's bad to pass around the shared reference through multiple levels of classes to get it down to the deeper levels where it's used" are "don't try to fix it" or "fix it with reflection".

I'm perfectly willing to accept that "don't try to fix it" might be the right answer here. This entire issue discussion was in the context of if we do want to fix it, how can it be done.

erendrake commented 9 years ago

Reflection is the last refuge of the desperate and that is the way it should be :)

I think that today we have a lot of tools out there for dealing with this kind of problem and I have been noodling with some kind of IOC container approach. The biggest challenge is designing the container and the number of containers.

If you have a single container it makes passing the 'kernel' around easier because you can make it static.
if you go with a sub-container per core, you get to do away with the shared objects class altogether and have each class only request what it needs. This would be my choice but it requires that you know what core you belong to which might bring us right back to reflection :P

I am sure there is a clever way around this problem but i havent figured it out yet.

Dunbaratu commented 9 years ago

what core you belong to which might bring us right back to reflection

That's the beginning and the end of the entire reason for needing reflection in my original proposal above.

If there was a reliable way to get from "this constructor is being called right now" to "this is the CPU that's doing it" other than walking the callstack, it would be the way to do it.

The other place where the code is currently using a callstack walk is when dumping the error in the logger. It uses the exception's callstack to get from the C# method that threw the exception down to where within the user's own script code the opcode was that caused it to be run, so it can give the script programmer the location of the problem in terms that makes sense in the script without knowing kOS's C# code. I consider callstack walks to be the least offensive use of reflection (using it to walk the lists of members of a class, or the list of class derivations, leads to far worse code), because they are used when what you are trying to to really IS obtaining runtime information that really IS about the callstack. It's not being used as a roundabout way to find some other kind of data. It's using the callstack to find out about.... how the current code got called. It's looking at a thing where the problem space inherently IS about the callstack.

In this case it's "which instance of CPU class called me?"

hvacengi commented 9 years ago

Since none of the cores operate in parallel (you run the code on one core, then go to the next), wouldn't it actually be feasible to embed the information into a static field on kOSProcessor? Just change "kOSProcessor.CurrentProc" when the execution switches to the next processor in line. If the updates ever went multi-threaded this would pose a major issue, but for single threaded in line execution it would work cleanly enough. Anything that needs to pull "shared" data at the constructor (or any other point in execution) would be able to.

This would be predicated entirely on the idea that processors don't touch the references on other processors. I think that's true currently, but I could be wrong. And to be truly thread safe, there should be a lock placed at the beginning of each of the KSP/Unity events on kOSProcessor. Lock's come with a noticeable cost though (I don't know how expensive it is, maybe it's negligible to do between 0 and 10 per update cycle).

Alternatively, a static reference to the current processor index and then a static "GetProcessorAtIndex" method would be much more inherently thread safe, and support all of the same benefits with only a little more complicated code.

Dunbaratu commented 9 years ago

That might work - but it has to be ICPU not CPU, now that I think of it, else we can't run any kOS.Safe.Test programs.

As far as processors touching other processors, there is the abortive attempt at a "run on" command but it never got properly implemented, and it could be written in a message-passing way once it does get implemented.

hvacengi commented 9 years ago

Yeah, I expect that something like that will end up in some kind of a queue (too bad this doesn't run on .net 4.0, they implemented some really great concurrency safe collections in there). Even in that instance though, I would expect the execution of the run on command to be within the 2nd processor's fixed update call rather than the 1st.

Dunbaratu commented 9 years ago

To work properly, run on would, on processor A, describe the program and arguments and then send that in a message to processor B and continue on, without waiting for processor B. Then later Processor B would see that in its message queue and pick it up to run it. At least that's how I think it should work.

I'm just thinking that first would be to implement a message passing between CPu's in the first place (one that scripts can read from and do whatever they like with, and it would require connectivity if RT is running), and then run on would be implemented on top of that.

But this is getting off topic for this.

KSP-KOS / KOS