LunarEngineer / MentalGymnastics

This is a school project with potential.

Inserting composed actions #9

Open LunarEngineer opened 3 years ago

LunarEngineer commented 3 years ago

Our current thought process for dropping nodes into the experiment space is as follows.

  1. There are two types of nodes which can be inserted: Composed (i.e. a prior net that ran to completion and was saved to the Function Bank) and Intermediate (i.e. an Atomic Function that is added in an episode).
  2. When an episode is terminated, it is automatically hooked to the Sink node in order to run the net (a minimal sketch of this setup follows the list).
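
For concreteness, here is a minimal Python sketch of that setup. The class names, the `terminate_episode` helper, and the wiring rule are hypothetical placeholders, not the project's actual API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A node dropped into the experiment space."""
    name: str
    inputs: List[str] = field(default_factory=list)  # names of upstream nodes

@dataclass
class IntermediateNode(Node):
    """An Atomic Function added during an episode."""
    atomic_fn: str = "relu"  # placeholder identifier for the atomic function

@dataclass
class ComposedNode(Node):
    """A prior net that ran to completion and was saved to the Function Bank."""
    bank_key: str = ""  # lookup key into the Function Bank

def terminate_episode(nodes: List[Node], sink: Node) -> List[Node]:
    """On termination, hook the working graph to the Sink so the net can run."""
    # Hypothetical rule: every node with no downstream consumer feeds the Sink.
    consumed = {name for n in nodes for name in n.inputs}
    sink.inputs = [n.name for n in nodes if n.name not in consumed]
    return nodes + [sink]

# Example: one atomic node and one composed node, then terminate the episode.
nodes = [
    IntermediateNode(name="relu_1", atomic_fn="relu"),
    ComposedNode(name="prior_net", inputs=["relu_1"], bank_key="net_007"),
]
graph = terminate_episode(nodes, Node(name="sink"))
# 'prior_net' has no downstream consumer, so it is hooked to the Sink.
```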

Does point 2 above imply that the Composed Function carries the Sink with it?

If the Composed Function is connected to the output and is dropped into an episode, that would imply it terminates the episode immediately, which is not the intended behavior. When @cizumigawa3 and I were discussing this last night, we thought of two potential routes it could take.

  1. An exception is made: the Composed Function is allowed to connect to the output in a passthrough manner, and termination is only allowed once nodes are dropped / snapped into the space and connected via radius. This implies the output connects to an arbitrary number of nodes in any episode.
  2. The Sink is not carried with the Composed Function, and the Function is represented by the node that had been connected to the Sink, minus that connection (a rough sketch of this route follows the list).
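
To make route 2 concrete, here is a rough sketch on a toy graph. The node names and the dict-based graph representation are hypothetical, not how MentalGymnastics actually stores episodes.

```python
# Hypothetical episode graph after termination: adjacency from node -> downstream nodes.
# "sink" is the output described in point 2 above.
episode_graph = {
    "conv_a": ["pool_b"],
    "pool_b": ["sink"],
    "sink": [],
}

def save_composed_route_2(graph: dict) -> dict:
    """Route 2: drop the Sink before saving, so the composed function ends at the
    node(s) that fed the Sink and carries no connection to any output."""
    return {
        node: [dst for dst in dsts if dst != "sink"]
        for node, dsts in graph.items()
        if node != "sink"
    }

composed = save_composed_route_2(episode_graph)
# {'conv_a': ['pool_b'], 'pool_b': []} -- 'pool_b' becomes the free output of the
# composed function; it only terminates a future episode if it is later snapped
# to that episode's Sink. Route 1 would instead keep a passthrough edge to the Sink.
```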

What are everyone's thoughts on the advantages, disadvantages, and anything else that comes to mind here?

hagopi1611 commented 3 years ago

I think #2 is the winner. The sink for a given dataset is always the same structure. For example, for classification tasks it's a Linear -> Softmax -> cross-entropy loss combination.
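
For illustration, a sink head along those lines might look like the following PyTorch sketch; the module name and sizes are made up. Note that `nn.CrossEntropyLoss` applies log-softmax internally, so the Softmax step is folded into the loss rather than written as a separate layer.

```python
import torch
import torch.nn as nn

class ClassificationSink(nn.Module):
    """Hypothetical sink head: Linear -> Softmax -> cross-entropy loss."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(in_features, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()  # softmax folded into the loss

    def forward(self, features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        logits = self.linear(features)
        return self.loss_fn(logits, targets)

# Because num_classes (and hence the Linear size) depends on the dataset, this head
# would be rebuilt per task rather than saved with the composed function.
sink = ClassificationSink(in_features=64, num_classes=10)
loss = sink(torch.randn(8, 64), torch.randint(0, 10, (8,)))
```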

So when saving a composed task, we can just remove that output, which is always the same, and save everything up to that as the composed function. This also makes sense if we want to extend our process to handle multiple datasets (e.g., a mix of binary and multi-class classification tasks), because in that case, the size of that final linear layer will depend on the particular task, so we wouldn't want to save a particular Linear -> Softmax size.