Closed jalving closed 2 years ago
In ONNX, each node represents one layer, so the input to every activation function is the (vector) output of the previous layer.
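To make the scalar-versus-vector distinction concrete, here is a plain-Python illustration (not ONNX or project code): an element-wise activation like ReLU consumes a single scalar, while softmax must consume the whole layer output at once.

```python
import math

def relu(z):
    # Element-wise: each output depends on a single scalar input.
    return max(0.0, z)

def softmax(zs):
    # Vector-valued: every output depends on the entire layer's output.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]
```

Note that `softmax` cannot be evaluated one component at a time without already knowing the normalizing sum over the layer.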
An alternative approach to what you propose is to add the softmax input node ids when building the network definition.
This issue is deprecated by PR #24. We now use a layer formulation that is consistent with ONNX. It should be straightforward to support softmax.
Currently, our `NetworkDefinition` assumes the activation function input is always a scalar. This abstraction does not capture activation functions, such as softmax, that require the entire output of a layer. One possibility is to add a mapping in `NetworkDefinition` that optionally maps node indices to layers, something akin to `{node_id -> [nodes in layer]}`.
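For a small network, the proposed mapping might look like the following (the node ids are hypothetical, purely for illustration):

```python
# Hypothetical node ids: nodes 3 and 4 form a softmax output layer,
# node 2 uses a scalar activation and needs no layer context.
layer_membership = {
    2: [],      # element-wise activation (e.g. ReLU)
    3: [3, 4],  # softmax: needs every node in its layer
    4: [3, 4],
}
```

An empty list would signal that the activation can be applied component-wise, preserving the current scalar behavior as the default.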
Then `build_full_space_formulation` could pass the (possibly empty) layer nodes to the activation functions. The `utils.py` functions would have to be updated to take extra arguments if we go this route.

@fracek: Do you know how ONNX handles softmax? My understanding is that the dict-of-dicts is general enough to handle CNNs, but it struggles with softmax/normalization.
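A rough sketch of how the formulation loop could thread the layer nodes through to the activations (all names here, including the function signatures, are my own illustration, not the actual codebase API):

```python
import math

def softmax_component(z_i, layer_values):
    # Needs the whole layer's pre-activations, not just z_i.
    denom = sum(math.exp(z) for z in layer_values)
    return math.exp(z_i) / denom

def relu(z_i, layer_values):
    # Element-wise activations simply ignore the extra argument.
    return max(0.0, z_i)

def build_full_space_formulation(pre_activations, activations, layer_map):
    """Hypothetical sketch.
    pre_activations: {node_id: z_value}
    activations:     {node_id: activation function}
    layer_map:       {node_id: [nodes in layer]} (possibly empty)
    """
    outputs = {}
    for node_id, z in pre_activations.items():
        # Fall back to the node itself when no layer context is given.
        peer_ids = layer_map.get(node_id) or [node_id]
        peers = [pre_activations[j] for j in peer_ids]
        outputs[node_id] = activations[node_id](z, peers)
    return outputs
```

For example, with two softmax nodes sharing a layer, `layer_map = {0: [0, 1], 1: [0, 1]}` makes their outputs sum to one, while ReLU nodes work unchanged with an empty entry.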