verivital / vnn-comp


Category Proposal: General Nonlinear Activations (tanh, sigmoid, etc) #1

Open ttj opened 4 years ago

ttj commented 4 years ago

Networks composed of tanh, sigmoid, etc. and general nonlinear activations

Representative benchmark(s): control theory controllers

Questions: allow or disallow other nonlinearities as a sort of "combination theory" category (e.g., ReLU, piecewise linear, purely linear, etc.), or keep it as a category consisting only of tanh, sigmoid, etc.?

alessiolomuscio commented 4 years ago

Activations: I think we need to pin down what activations we would like. tanh and sigmoid seem sufficient to me.

Benchmarks: we could have general classifiers too, including some with high dimensions.

I'd stick with tanh/sigmoid here.

Architectures: Do we need to fix the architecture?

ttj commented 4 years ago

Activations: I think we need to pin down what activations we would like. tanh and sigmoid seem sufficient to me.

Benchmarks: we could have general classifiers too, including some with high dimensions.

I'd stick with tanh/sigmoid here.

Architectures: Do we need to fix the architecture?

Thanks for the feedback; agreed. We will work on fixing the activations, presumably just these two, unless we get further feedback from others on any other activations to allow.

If you have some ideas for the larger benchmarks, please let us know what you're thinking (e.g., if you know of classifiers that work with only these activations).

Regarding architecture: yes, for the benchmarks, I'm imagining we will provide the explicit networks, so the architecture would in essence be fixed. Before that, though, I believe it is fairly open based on the application.

souradeep-111 commented 4 years ago

Hi everyone, sorry for joining the party a bit late. :)
I am assuming general nonlinear activations would include ReLUs as well. Is that correct? Which tools do we have in this category?

ttj commented 4 years ago

Hi everyone, sorry for joining the party a bit late. :) I am assuming general nonlinear activations would include ReLUs as well. Is that correct? Which tools do we have in this category?

We're thinking networks with only ReLUs would be in the piecewise linear category, whereas this would cover nonlinear activations that are not piecewise linear, although if it makes sense with any benchmarks (please provide if you have any), we could consider a combination category.

We're awaiting feedback from tool authors on which categories they'll participate in, by commenting on these issues. We'll send an email to everyone who expressed interest soon to remind them.

souradeep-111 commented 4 years ago

This is to follow up on Taylor's request. I would like to sign up for this category with Sherlock. https://github.com/souradeep-111/sherlock

vtjeng commented 4 years ago

My MIPVerify tool won't be participating in this category, but I'd like to suggest that the benchmark we select for this category should be one where piecewise-linear networks have been shown not to do well, better motivating why we would want to work with these networks (with general nonlinear activations).

GgnDpSngh commented 4 years ago

We would like to sign up for this category with ERAN https://github.com/eth-sri/eran

Following up on what Vincent just said, Tanh and Sigmoid are essential components of LSTM architectures, are we considering those?

Cheers, Gagandeep Singh

ttj commented 4 years ago

Following up on what Vincent just said, Tanh and Sigmoid are essential components of LSTM architectures, are we considering those?

For this category, the benchmarks I at least had in mind were from control theory controllers (e.g., along the lines of the feedforward controllers in case studies from Verisig: https://github.com/Verisig/verisig ), although these aren't going to be that easily parameterizable. We're certainly open to other benchmarks, so if you or anyone else have any in mind for this category, please let us know. One issue I would imagine with LSTMs is that not many (if any?) methods support these layers directly, but if there's sufficient interest and support for them, we can certainly consider it. We earlier discussed an RNN category but haven't seen much interest in it so far, although again, we can reconsider.

ttj commented 4 years ago

General Nonlinear Category Participants:

ERAN, NNV, Sherlock

If anyone else plans to join this category, please add a comment soon, as the participants in this category need to decide on the benchmarks soon (by about May 15).

pat676 commented 4 years ago

Hi,

We would like to enter VeriNet https://vas.doc.ic.ac.uk/software/neural/.

The toolkit supports:

Regards, Patrick Henriksen

ttj commented 4 years ago

Finalized General Nonlinear Category Participants:

ERAN, NNV, Sherlock, VeriNet

pat676 commented 4 years ago

Following up on previous comment, I think we should have some general classifiers here. I can train and release some fully-connected Sigmoid and Tanh networks with the MNIST dataset.

ttj commented 4 years ago

We have created some feedforward networks with tanh/sigmoid activations for classification on MNIST, and will share those soon. If there are any other proposals, please let us know.

ttj commented 4 years ago

Some ONNX MNIST classifiers with tanh/sigmoid are here, created by @Neelanjana314 , please let us know of any problems loading, etc., then we'll centralize things in this repository after we get the go-ahead/agreement on which of these to use:

https://github.com/Neelanjana314/VNN_COMP_2020/tree/master/Networks/General%20Non-Linear%20activation%20functions

pat676 commented 4 years ago

Some ONNX MNIST classifiers with tanh/sigmoid are here, created by @Neelanjana314 , please let us know of any problems loading, etc., then we'll centralize things in this repository after we get the go-ahead/agreement on which of these to use:

https://github.com/Neelanjana314/VNN_COMP_2020/tree/master/Networks/General%20Non-Linear%20activation%20functions

@Neelanjana314 have we decided on verification parameters (input images/ epsilons/ timeout) for these networks? Also, there are a total of 12 networks; depending on the number of input images and the timeout setting this may take a long time. Should we use all of them or choose a subset?

ttj commented 4 years ago

@Neelanjana314 have we decided on verification parameters (input images/ epsilons/ timeout) for these networks? Also, there are a total of 12 networks; depending on the number of input images and the timeout setting this may take a long time. Should we use all of them or choose a subset?

@Neelanjana314 can provide feedback on a subset of the networks to use (I'd suggest various sizes and the different activation types, maybe ~4-6 total). For the inputs/specifications, I would suggest using what's done for the MNIST examples in the other categories (probably the same as in the ReLU/piecewise-linear one) unless there are a priori known reasons to use something different.

Neelanjana314 commented 4 years ago

Some ONNX MNIST classifiers with tanh/sigmoid are here, created by @Neelanjana314 , please let us know of any problems loading, etc., then we'll centralize things in this repository after we get the go-ahead/agreement on which of these to use: https://github.com/Neelanjana314/VNN_COMP_2020/tree/master/Networks/General%20Non-Linear%20activation%20functions

@Neelanjana314 have we decided on verification parameters (input images/ epsilons/ timeout) for these networks? Also, there are a total of 12 networks; depending on the number of input images and the timeout setting this may take a long time. Should we use all of them or choose a subset?

I would suggest using two networks from each activation type, X_200_100_50.onnx and X_200_50.onnx, and we can use the same images as MNIST_ReLU (the 50 images provided by @pat676, or the first 25). X_200_100_50.onnx should predict 49/50 or 25/25 correctly, whereas the other should predict all of the images correctly. No normalization is needed for the input images.

For the epsilon and timeout, I can provide an update by tonight.

GgnDpSngh commented 4 years ago

Hi @Neelanjana314, are these fully connected networks or convolutional? I get "Conv" operations when translating.

Cheers, Gagandeep Singh

Neelanjana314 commented 4 years ago

Hi @Neelanjana314, are these fully connected networks or convolutional? I get "Conv" operations when translating.

Cheers, Gagandeep Singh

Hi @GgnDpSngh, these are fully connected networks. I will check and re-upload them, though.

pat676 commented 4 years ago

Hi @Neelanjana314,

I'm getting this graph for the logsig_200_100_50_onnx.onnx network:

[ONNX graph image]

All layers seem to be convolutional. Also, we could skip the softmax at the end; limiting the number of nodes makes it somewhat easier to convert to PyTorch.

Neelanjana314 commented 4 years ago

Hi @Neelanjana314,

I'm getting this graph for the logsig_200_100_50_onnx.onnx network:

[ONNX graph image]

All layers seem to be convolutional. Also, we could skip the softmax at the end; limiting the number of nodes makes it somewhat easier to convert to PyTorch.

@pat676 These should be fully connected layers instead. Can you and @GgnDpSngh check with the new file and let me know if it works in PyTorch?

https://www.dropbox.com/s/kq4841shswb89fc/tansig_200_100_50_onnx.onnx?dl=0

Neelanjana314 commented 4 years ago

@Neelanjana314 can provide feedback on a subset of the networks to use (I'd suggest some various different sizes and the different activation types, maybe ~4-6 total). For the inputs/specifications, I would suggest using what's done for the MNIST examples in the other categories (probably same as in the ReLU/piecewise one) unless there are a priori known reasons to use something different.

We can go ahead with these specifications for the testing.

pat676 commented 4 years ago

Hi @Neelanjana314, I'm getting this graph for the logsig_200_100_50_onnx.onnx network: [graph image] All layers seem to be convolutional. Also, we could skip the softmax at the end; limiting the number of nodes makes it somewhat easier to convert to PyTorch.

@pat676 These should be fully connected layers instead. Can you and @GgnDpSngh check with the new file and let me know if it works in PyTorch?

https://www.dropbox.com/s/kq4841shswb89fc/tansig_200_100_50_onnx.onnx?dl=0

Hi @Neelanjana314, I'm still getting Conv layers with the Dropbox file. How are you creating the ONNX files? If you use PyTorch, you can enable verbose=True to get a print of all layers during conversion.

Neelanjana314 commented 4 years ago

Hi @Neelanjana314, I'm getting this graph for the logsig_200_100_50_onnx.onnx network: [graph image] All layers seem to be convolutional. Also, we could skip the softmax at the end; limiting the number of nodes makes it somewhat easier to convert to PyTorch.

@pat676 These should be fully connected layers instead. Can you and @GgnDpSngh check with the new file and let me know if it works in PyTorch? https://www.dropbox.com/s/kq4841shswb89fc/tansig_200_100_50_onnx.onnx?dl=0

Hi @Neelanjana314, I'm still getting Conv layers with the Dropbox file. How are you creating the ONNX files? If you use PyTorch, you can enable verbose=True to get a print of all layers during conversion.

So, we are converting a MATLAB file to an ONNX file, and the issue is in the conversion (MATLAB -> ONNX -> PyTorch). I think ONNX stores a minimal set of operations to describe layers, and I guess the converter prefers Conv layers over FC layers. There are ways to make a conv layer behave like an equivalent fully connected layer, but this might cause performance differences while checking robustness.

We are looking into whether we can change the conversion process (I don't think we can change the ONNX translator) and will update ASAP.

Or, you can try to change the conv layers to fc layers after the network is created.
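For the special case where the exported Conv nodes carry 1x1 kernels acting on a 1x1 spatial map (so each Conv is really just a matrix multiply), the rewrite is only a weight reshape. A minimal numpy sketch of the idea, with the helper name hypothetical:

```python
import numpy as np

def conv1x1_to_fc(conv_w, conv_b):
    """Collapse a 1x1 convolution on a 1x1 spatial map into an
    equivalent fully connected layer: (out, in, 1, 1) -> (out, in)."""
    out_c, in_c, kh, kw = conv_w.shape
    assert kh == 1 and kw == 1, "only 1x1 kernels collapse this way"
    return conv_w.reshape(out_c, in_c), conv_b

# Sanity check: after the reshape, W @ x + b matches what the Conv computes.
rng = np.random.default_rng(0)
conv_w = rng.standard_normal((5, 3, 1, 1))
conv_b = rng.standard_normal(5)
x = rng.standard_normal(3)

W, b = conv1x1_to_fc(conv_w, conv_b)
conv_out = np.einsum("oihw,i->o", conv_w, x) + conv_b  # the 1x1 conv's output
assert np.allclose(W @ x + b, conv_out)
```

If the exported kernels instead span the whole input image (another common FC-as-Conv encoding), the same idea applies but the kernel must be flattened across its channel and spatial dimensions.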

pat676 commented 4 years ago

Hi @Neelanjana314, I'm getting this graph for the logsig_200_100_50_onnx.onnx network: [graph image] All layers seem to be convolutional. Also, we could skip the softmax at the end; limiting the number of nodes makes it somewhat easier to convert to PyTorch.

@pat676 These should be fully connected layers instead. Can you and @GgnDpSngh check with the new file and let me know if it works in PyTorch? https://www.dropbox.com/s/kq4841shswb89fc/tansig_200_100_50_onnx.onnx?dl=0

Hi @Neelanjana314, I'm still getting Conv layers with the Dropbox file. How are you creating the ONNX files? If you use PyTorch, you can enable verbose=True to get a print of all layers during conversion.

So, we are converting a MATLAB file to an ONNX file, and the issue is in the conversion (MATLAB -> ONNX -> PyTorch). I think ONNX stores a minimal set of operations to describe layers, and I guess the converter prefers Conv layers over FC layers. There are ways to make a conv layer behave like an equivalent fully connected layer, but this might cause performance differences while checking robustness.

We are looking into whether we can change the conversion process (I don't think we can change the ONNX translator) and will update ASAP.

Or, you can try to change the conv layers to fc layers after the network is created.

@Neelanjana314 I believe I have successfully converted the convolutional layers to fc now; however, the results are somewhat strange. Should the input values be in the range [0, 1] or [0, 255]? Also, have you found reasonable epsilons and timeouts?

Neelanjana314 commented 4 years ago

@Neelanjana314 I believe I have successfully converted the convolutional layers to fc now; however, the results are somewhat strange. Should the input values be in the range [0, 1] or [0, 255]? Also, have you found reasonable epsilons and timeouts?

Hi @pat676, the range is [0, 255], and the epsilon values (i.e., 0.02 and 0.05) will be translated to 5 and 12, respectively. Also, a timeout of ~15 minutes should be fine.
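For reference, that translation is just the epsilon rescaled from normalized [0, 1] pixel intensities to the [0, 255] scale and truncated to an integer (0.02 * 255 = 5.1 and 0.05 * 255 = 12.75). A quick sketch, helper name hypothetical:

```python
def eps_to_pixel_scale(eps, pixel_max=255):
    """Map an epsilon defined on normalized [0, 1] inputs to the
    unnormalized pixel scale, truncating to match the 5 and 12 above."""
    return int(eps * pixel_max)

print(eps_to_pixel_scale(0.02))  # -> 5
print(eps_to_pixel_scale(0.05))  # -> 12
```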

pat676 commented 4 years ago

@Neelanjana314, sorry, but I'm still getting some strange results and am unsure if my conversion from conv -> fc succeeded. Could you post your verification results for one of the network-epsilon combinations for me to use as a sanity check?

Neelanjana314 commented 4 years ago

@Neelanjana314, sorry, but I'm still getting some strange results and am unsure if my conversion from conv -> fc succeeded. Could you post your verification results for one of the network-epsilon combinations for me to use as a sanity check?

Hi @pat676, what about the classification results with zero epsilon? Were you able to get 25/25 predictions correct? I will update the network-epsilon combinations ASAP.

Neelanjana314 commented 4 years ago

@Neelanjana314, sorry, but I'm still getting some strange results and am unsure if my conversion from conv -> fc succeeded. Could you post your verification results for one of the network-epsilon combinations for me to use as a sanity check?

Hi @pat676, what about the classification results with zero epsilon? Were you able to get 25/25 predictions correct? I will update the network-epsilon combinations ASAP.

Hi @pat676, please find the final-layer (after softmax) output for image 1 and tansig_200_50.onnx below: 0.000115960517742989, 9.12453041114710e-06, 3.89569815818540e-05, 9.95785548319003e-05, 1.07316381560933e-06, 2.03678675334422e-06, 7.14853401880313e-08, 0.999422106024639, 1.52424218336877e-05, 0.000295849533050314

Also, the labels are in the 1-10 range.
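As a sanity check on that output, the predicted label is the 1-indexed argmax of the posted vector; for instance:

```python
import numpy as np

# Final-layer (post-softmax) output posted above for image 1 on tansig_200_50.onnx.
out = np.array([
    0.000115960517742989, 9.12453041114710e-06, 3.89569815818540e-05,
    9.95785548319003e-05, 1.07316381560933e-06, 2.03678675334422e-06,
    7.14853401880313e-08, 0.999422106024639, 1.52424218336877e-05,
    0.000295849533050314,
])

label = int(np.argmax(out)) + 1  # labels are 1-10, so shift the 0-based index
print(label)  # -> 8 (i.e., the digit 7 under the usual 0-9 MNIST labels)
```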

pat676 commented 4 years ago

@Neelanjana314, sorry, but I'm still getting some strange results and am unsure if my conversion from conv -> fc succeeded. Could you post your verification results for one of the network-epsilon combinations for me to use as a sanity check?

Hi @pat676, what about the classification results with zero epsilon? Were you able to get 25/25 predictions correct? I will update the network-epsilon combinations ASAP.

Hi @pat676, please find the final-layer (after softmax) output for image 1 and tansig_200_50.onnx below: 0.000115960517742989, 9.12453041114710e-06, 3.89569815818540e-05, 9.95785548319003e-05, 1.07316381560933e-06, 2.03678675334422e-06, 7.14853401880313e-08, 0.999422106024639, 1.52424218336877e-05, 0.000295849533050314

Also, the labels are in the 1-10 range.

Thanks, that helped. Everything works now.

Neelanjana314 commented 4 years ago

As time is a constraint now, we can reduce the number of images to the first 16 instead of 25; that might save a couple of hours.

pat676 commented 4 years ago

As time is a constraint now, we can reduce the number of images to the first 16 instead of 25; that might save a couple of hours.

I verified all 25 this round, but I can report only the first 16 if that's preferred. The problem seemed surprisingly difficult for such small networks, especially for eps=5. Maybe we should consider adding an easier (smaller) epsilon value for next year's competition?

Neelanjana314 commented 4 years ago

I verified all 25 this round, but I can report only the first 16 if that's preferred. The problem seemed surprisingly difficult for such small networks, especially for eps=5. Maybe we should consider adding an easier (smaller) epsilon value for next year's competition?

Hi @pat676, you are right. I was about to mention that too (i.e., to use epsilons 1 and 3, maybe), but I guess everyone has already spent time on it, so we left it for this year. The original idea was to match the specifications for the piecewise-linear and nonlinear cases.

pat676 commented 4 years ago

I verified all 25 this round, but I can report only the first 16 if that's preferred. The problem seemed surprisingly difficult for such small networks, especially for eps=5. Maybe we should consider adding an easier (smaller) epsilon value for next year's competition?

Hi @pat676, you are right. I was about to mention that too (i.e., to use epsilons 1 and 3, maybe), but I guess everyone has already spent time on it, so we left it for this year. The original idea was to match the specifications for the piecewise-linear and nonlinear cases.

Hi @Neelanjana314, I'm trying a run with epsilon=3 for logsig_200_50, and the results are more interesting. How about we say that if people have the time, they can also run and report eps=3; if not, that's no problem for this round?

Neelanjana314 commented 4 years ago

I verified all 25 this round, but I can report only the first 16 if that's preferred. The problem seemed surprisingly difficult for such small networks, especially for eps=5. Maybe we should consider adding an easier (smaller) epsilon value for next year's competition?

It will be fine if everyone agrees.

GgnDpSngh commented 4 years ago

Hi all,

So what eps are we using for the networks in the end?

Cheers,

Neelanjana314 commented 4 years ago

Hi all,

So what eps are we using for the networks in the end?

Cheers,

@GgnDpSngh I think we can go ahead with 5 and 12, as it was decided previously. Adding new epsilons may create confusion among others.

Thanks Neelanjana