LunarEngineer opened this issue 3 years ago. Status: Open.
I think #2 is the winner. The sink for a given dataset is always the same structure. For example, for classification tasks it's a Linear -> Softmax -> cross-entropy loss combination.
So when saving a composed task, we can drop that output, which is always the same, and save everything up to it as the composed function. This also makes sense if we want to extend our process to handle multiple datasets (e.g., a mix of binary and multi-class classification tasks): in that case the size of the final linear layer depends on the particular task, so we wouldn't want to save a Linear -> Softmax of one particular size.
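To make the idea concrete, here is a rough sketch of stripping the sink before saving and re-attaching a task-sized one later. It is only an illustration, not how the project does it: `Node`, `SINK_KINDS`, `strip_sink`, and `attach_sink` are hypothetical names, and a composed function is assumed to be a flat list of nodes rather than a real graph.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the classification sink is always these three node kinds, in order.
SINK_KINDS = ("Linear", "Softmax", "CrossEntropyLoss")


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a node in a composed function."""
    kind: str
    out_features: int = 0  # only meaningful for Linear nodes in this sketch


def strip_sink(nodes: List[Node]) -> List[Node]:
    """Return a copy of the nodes without a trailing sink, if one is present."""
    n = len(SINK_KINDS)
    tail = tuple(node.kind for node in nodes[-n:])
    if tail == SINK_KINDS:
        return list(nodes[:-n])
    return list(nodes)


def attach_sink(body: List[Node], num_classes: int) -> List[Node]:
    """Append a sink whose Linear layer is sized for the given task."""
    return list(body) + [
        Node("Linear", num_classes),
        Node("Softmax"),
        Node("CrossEntropyLoss"),
    ]


# The saved body is identical no matter which task it was trained on.
body = [Node("Conv"), Node("ReLU")]
binary_task = attach_sink(body, num_classes=2)
multi_task = attach_sink(body, num_classes=10)
saved = strip_sink(multi_task)
```

Because only the body is saved, the same composed function can be reused on a binary and a multi-class dataset, and the Linear size is decided when the sink is attached.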
Our current thinking on dropping nodes into the experiment space is as follows.
Does point 2 above imply that the Composed Function carries the Sink with it?
If it is connected to the output and is dropped into an episode, that would imply it terminates the episode immediately, which is not the intended behavior. When @cizumigawa3 and I were discussing it last night, we thought of two potential routes this could take.
What thoughts does everyone have as to advantages, disadvantages, and whatever else may be on your mind with respect to this?