stanleybak / vnncomp2023

Fourth edition of VNN COMP (2023)

Tool Submission Discussion #3

ChristopherBrix opened this issue 1 year ago

ChristopherBrix commented 1 year ago

At this point, you should be updating your tool to support quickly verifying as many benchmarks as you can. Note that the benchmark instances will change based on a new random seed for the final evaluation. We will follow a workflow similar to last year's: tool authors provide shell scripts to install their tool, prepare instances (e.g., convert models to a different format), and finally verify an instance. Detailed instructions are available in 2021's git repo.
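To make the workflow concrete, here is an illustrative sketch of the script interface; the argument order is an assumption based on prior iterations, and the authoritative definition is in 2021's git repo:

```shell
# Illustrative script interface (argument order is an assumption based on
# previous iterations; see 2021's git repo for the authoritative spec):
#   install_tool.sh v1                                     # one-time setup
#   prepare_instance.sh v1 <category> <onnx> <vnnlib>      # per-instance prep
#   run_instance.sh v1 <category> <onnx> <vnnlib> <results_file> <timeout>

# Minimal run_instance stub that always reports "unknown": useful for
# exercising the submission pipeline before the real solver is wired in.
run_instance() {
  local results_file="$5"
  local timeout_s="$6"   # unused in this stub
  echo "unknown" > "$results_file"
}
```

A stub like this lets you debug the AWS setup end to end without waiting on solver runs.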

You will be able to run and debug your toolkit on the submitted benchmarks online at this link. There, you first need to register. Your registration will be manually activated by the organizers; you'll receive a notification once that's done. Afterwards, you can log in and start with your first submission.

The process is similar to the benchmark submission, with a small change compared to last year: you need to specify a public git URL and commit hash, as well as the location of a .yaml config file. There, you can specify parameters for your toolkit evaluation. Because those settings are part of the repository, they will be preserved for future reference. You can define a post-installation script to set up any licenses.

Once submitted, you're placed in a queue until the chosen AWS instance can be created, at which point your installation and evaluation scripts will be run. You'll see the output of each step and can abort the evaluation early in case there are any issues. Once a submission has terminated, you can use it to populate the submission form for the next iteration, so you don't have to retype everything.

Important: We currently place no limit on how often you can submit your tool for testing purposes, but we will monitor usage closely and may impose limits if necessary. Please be mindful of the cost (approx. $3 per hour) each submission incurs. To save costs, debug your code locally first, and then use the website to confirm the results match your expectations.

We strongly encourage tool participants to at least register and make some test submissions on the toolkit website well ahead of the deadline.

ttj commented 1 year ago

@ChristopherBrix We are currently trying to get our NNV/Matlab setup to execute on this. The last time we did this, I set everything up, but that was before the infrastructure used last year, and I had to do quite a few things manually. @mldiego is leading this now, and we may need to take some extra steps due to how Matlab can currently be set up for execution on AWS: either a custom AMI based on the reference architecture (e.g., https://github.com/mathworks-ref-arch/matlab-on-aws#deployment-steps), running inside Docker (maybe easiest), manually installing/configuring some things, or something else. So I wanted to make you aware, as we will likely need your help with the execution system to make this work.

First, which AWS region specifically? The website says Amazon (Oregon region); is that us-west-1, or another region?

Further, we may have some questions about the startup/batch execution scripts, as we are currently sorting out whether we want to keep things running in between instances: if running inside Docker, the container may take some time to start up, and Matlab may take additional time on top of that. We will likely also have questions about model conversion, as Matlab's ONNX support is unfortunately rather poor right now, and we need to see whether preprocessing of the models is necessary to get them into Matlab.

ChristopherBrix commented 1 year ago

I'm using the us-west-2 region.

Let me know if you have any questions I can help you with!

ChristopherBrix commented 1 year ago

@ALL: Please give the online submission tool a try as soon as possible! Make sure your tool supports the automatic setup on AWS. If you need any assistance, I'm happy to help.

You can submit as often as you'd like, so you can debug your setup even while you're still working on some benchmarks.

mldiego commented 1 year ago

@ChristopherBrix @ttj Is there a way to get AWS instances with a predefined MAC address? Something like using an ENI to fix the MAC address and then using that one for NNV? (See: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html.) If we can get that, we may not need a matlab-on-aws target architecture, as we could tie the MATLAB licensing to that specific MAC address and then install everything via install_tool.sh.

Otherwise, we may need help setting up one of the matlab-on-aws instances (R2022b), setting up the licensing there manually, and installing support packages. Once that is done, we can install NNV and its MATLAB requirements with the install_tool.sh script (this should be very similar to what @ttj prepared for the 2021 submission: https://github.com/mldiego/nnv/blob/master/code/nnv/examples/Submission/VNN_COMP2021/README-AWS.md).

Docker is not ideal (competition infrastructure, overhead...), but for licensing it may be the best/easiest option for us. If we take this route, it would help if we could run Docker only once (I understand the execution scripts may need to change slightly for this) to avoid starting a container and Matlab for each instance. Essentially, install_tool.sh could automatically download Docker, build the image, and do some hacky things for support packages (ONNX importer). Then, after the properties have been generated with the random seed, we would copy the benchmark files from the instance (local) into the container and call prepare_instance.sh and run_instance.sh just as if we were running them locally (outside Docker).
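A rough sketch of that single-container flow; the image and container names are placeholders, and the docker invocations are only shown or composed here, not executed:

```shell
# Hypothetical sketch of the single-container flow described above.
# IMAGE and CONTAINER are placeholder names, not part of any real setup.
IMAGE=nnv-vnncomp
CONTAINER=nnv-runner

# install_tool.sh would build the image once and start a long-lived
# container, e.g.:
#   docker build -t "$IMAGE" .
#   docker run -d --name "$CONTAINER" "$IMAGE" sleep infinity

copy_benchmark_cmd() {
  # Compose (but do not run) the command that would copy a generated
  # benchmark file into the running container before prepare_instance.sh
  # is invoked inside it.
  echo "docker cp $1 $CONTAINER:/benchmarks/$(basename "$1")"
}
```

Keeping one container alive would amortize both the Docker and the Matlab startup cost across all instances.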

Please let me know if either of these options can be considered and added to the automated framework. I'll be happy to help with any of this as well, @ChristopherBrix.

aliabigdeli commented 1 year ago

I don't know if any other team has submitted their code yet, but I couldn't run mine on the VNNCOMP website. Although I install Conda, create an environment, and use the environment's Python path to run the toolkit in run_instance.sh, it can't find the modules installed in the environment. This is strange, because I don't face this issue on my local machine or on other cloud platforms like CloudLab. Has anybody successfully submitted and run their code?

ChristopherBrix commented 1 year ago

@mldiego To clarify, if I could support ENI, then you could have the same MAC address all the time, and thus your licensing issues would be solved?

For Gurobi, the current process is like this:

  1. Tools install everything they need, including Gurobi using the install_tool.sh script
  2. As the last step, they print some information about the AWS instance that's needed to create a license
  3. In a manual step, a license file is generated. The content can be copied into the "post_install_script.sh" via the website so it'll be created on the AWS instance
  4. After this manual step, the setup is done and the evaluation can begin

For you, is licensing more involved? Or could a similar process work? Note that this requires Gurobi users to get a new license file each time they submit their tool. However, Gurobi licenses are free for academic use, so that's not an issue. Is it different for Matlab?
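In practice, step 3 amounts to a post_install_script.sh that simply recreates the manually generated license file on the instance. A minimal sketch, where the path and the license contents are placeholders to be replaced with the real file generated for that host:

```shell
# Hypothetical post_install_script.sh: recreate the manually generated
# Gurobi license file on the fresh AWS instance. LICENSE_PATH and the
# license lines below are placeholders; paste in the real license text
# generated for this host.
LICENSE_PATH="${LICENSE_PATH:-$HOME/gurobi.lic}"
cat > "$LICENSE_PATH" <<'EOF'
TYPE=ACADEMIC
HOSTID=0123abcd
EOF
echo "wrote license to $LICENSE_PATH"
```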

@aliabigdeli I will look into this.

ChristopherBrix commented 1 year ago

@mldiego I should be able to support ENI by tomorrow. I'll ping you then.

@aliabigdeli Please try again with a larger (e.g., m5) instance type. Your machine ran out of RAM during the setup (there's an error code in the logs if you scroll up a bit). I tested it with m5, and there it seemed to work.

mldiego commented 1 year ago

@ChristopherBrix That should be enough. Once we have that MAC address, I will create the license, the installer_input.txt, and the activation key (all necessary to install MATLAB "offline", i.e., non-interactively), include all of that within install_tool.sh, and we should be good to go, although it will take a while to install everything.

https://www.mathworks.com/help/install/ug/install-noninteractively-silent-installation.html
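A sketch of what generating that non-interactive setup could look like; the key and paths are placeholders, and the option names follow the installer_input.txt template shipped with the MATLAB installer:

```shell
# Generate a minimal installer_input.txt for a silent MATLAB install.
# All values are placeholders; the real file installation key and license
# file come from the MathWorks account.
cat > installer_input.txt <<'EOF'
destinationFolder=/usr/local/MATLAB/R2022b
fileInstallationKey=XXXXX-XXXXX-XXXXX-XXXXX
agreeToLicense=yes
outputFile=/tmp/matlab_install.log
licensePath=/tmp/license.lic
EOF

# The installer itself would then be invoked non-interactively, e.g.:
#   sudo ./install -inputFile "$(pwd)/installer_input.txt"
```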

About the current process for Gurobi licensing: something similar may work for us too. As far as I understand, Matlab licenses can only be downloaded from the MathWorks website (to specify target OS, username, and MAC address), so we'll have to do that manually every time to get the license; then we could follow a similar process to the one explained earlier (non-interactive installation).

It would also work if we got an AWS instance with MATLAB already installed (matlab-on-aws). We would then follow the steps listed there for licensing and setup, and we should be able to simplify the installation script (it would only need to handle the NNV installation, which is much faster).

ChristopherBrix commented 1 year ago

I'm on mobile so I cannot check right now, but isn't that a description for a desktop client?

If there's an AMI that includes MATLAB support, that would be trivial to support.

mldiego commented 1 year ago

I just assumed we could do the same non-interactive installation within the AWS instance as long as no GUI is needed.

There is some information here about creating an AMI with MATLAB support, but I am not very familiar with it, so I'm not sure how helpful this will be: https://www.mathworks.com/help/cloudcenter/ug/create-and-discover-clusters.html https://www.mathworks.com/help/cloudcenter/ug/create-a-custom-amazon-machine-image-ami.html

aliabigdeli commented 1 year ago

> @mldiego I should be able to support ENI by tomorrow. I'll ping you then.
>
> @aliabigdeli Please try again with a larger (e.g., m5) instance type. Your machine ran out of RAM during the setup (there's an error code in the logs if you scroll up a bit). I tested it with m5, and there it seemed to work.

It works now. Thanks.

ChristopherBrix commented 1 year ago

@mldiego I can now assign a running instance an ENI. However, I'm not sure how that helps. Does the ENI need to be associated with a public IP?

Do you have access to an AWS account? If so, could you try setting up the instance the way you need it and let me know the steps to reproduce? What exactly needs to be done with the ENI to support your use case? Then I can add that more easily.

mldiego commented 1 year ago

@ChristopherBrix

What about the AMI with MATLAB support? Would that be easier to add since all the other ones are set this way too?

ChristopherBrix commented 1 year ago

That would be much simpler. If you find a suitable ami, let me know!

mldiego commented 1 year ago

Thanks! Working on it. I was following the instructions here: https://www.mathworks.com/help/cloudcenter/ug/create-a-custom-amazon-machine-image-ami.html, but my license is not authorized to create clusters (MATLAB Parallel Server), so I am trying to find another way.

I'll take a closer look at the ENI method. My understanding is that we would need a fixed MAC address (set up licensing once and save it for the next time we run NNV, if possible), and then we should be able to use it. However, I have not used AWS before, so I may be completely wrong about this...

mldiego commented 1 year ago

@ChristopherBrix

I found some AMIs in the AWS AMI catalog that may be useful for us, if we can use them in the automated framework. Would something like this work?

aws_ec2_r2022b_ami

ChristopherBrix commented 1 year ago

I've added that one, please give it a try!

mldiego commented 1 year ago

Thank you!

mldiego commented 1 year ago

@ChristopherBrix

Could you also add this one: ami-02fbb965167e007cb ?

Would like to test the installation/execution process with both.

AMI info: R2022b matlab_linux ami-02fbb965167e007cb

ChristopherBrix commented 1 year ago

@mldiego Done!

ChristopherBrix commented 1 year ago

To give everyone enough time to adapt their tools to the submission system and support as many benchmarks as possible, we will extend the submission deadline to July 7 EOD AOE.

There will be no further extensions.

mldiego commented 1 year ago

@ChristopherBrix

I got some errors last night with my submissions (some server message, but I don't remember the details now). I submitted again this morning (that one is working well), but it looks like the previous two submissions are still pending (they are listed at positions 2 and 3 in the queue, and these numbers have not moved since last night). I am not sure if they are really still in the queue or if it is just a website error, but please kill them if they are still queued (submissions 837 and 838).

jferlez commented 1 year ago

@ChristopherBrix

I likewise have two runs that are listed as queued but are not advancing (numbers 839 and 840). You can terminate both, though, as I think I have fully debugged the deployment of my tool submission.

ChristopherBrix commented 1 year ago

@mldiego @jferlez Thank you for letting me know - I've stopped those submissions. I'm not entirely sure why that happened, but tools submitted later were processed fine. I will monitor this, but please let me know if it happens again!

wu-haoze commented 1 year ago

@ChristopherBrix It seems that the dist_shift benchmark is missing on the test web page. I'm wondering whether this can be fixed so we can test on that benchmark set? Thanks!

aliabigdeli commented 1 year ago

@ChristopherBrix I have two questions. First, is there any runtime cap on the VNNComp website? When I ran all instances, the run aborted after one hour in the middle of executing the benchmarks. For example, in the run with id=877, it aborted after the first benchmark once the total runtime reached "1 hours, 1 minutes", although I expected it to keep running on the other benchmarks. Second, I think there is a problem in the vnnlib format of the "collins_rul_cnn" benchmark: in its output constraints there is (and(>= Y_0 438.0107727050781)), but I think there should be a space after 'and', i.e., (and (>= Y_0 438.0107727050781)). Am I right, or should we support parsing that vnnlib format as well?
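If the file itself is not fixed, one possible workaround on the tool side is to normalize the spacing before parsing. A sed sketch, covering only this particular `(and(` pattern:

```shell
# Insert a space between "(and" and an immediately following "(" in a
# vnnlib file, turning "(and(>= ..." into "(and (>= ...".
# Only this specific pattern is handled; a proper s-expression tokenizer
# would be more robust.
normalize_vnnlib() {
  sed 's/(and(/(and (/g' "$1"
}
```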

ChristopherBrix commented 1 year ago

@anwu1219 Thanks, I've added it.

@aliabigdeli Yes, the timeout was 1 hour for now; I've increased it to 12 hours. Please don't run those tests too often, as they get quite expensive if everyone runs them repeatedly ($3 per hour).

I think the vnnlib file should be fixed, @regkirov

mldiego commented 1 year ago

@ChristopherBrix

I am getting the following error during the Initialization phase now:

```
1.vnnlib.gz': No such file or directory
mv: cannot move './benchmarks/cifar2020/vnnlib/cifar10_spec_idx_39_eps_0.00784_n1.vnnlib.gz' to '../../benchmarks/./benchmarks/cifar2020/vnnlib/cifar10_spec_idx_39_eps_0.00784_n1.vnnlib.gz': No such file or directory
 .
 .
 .
mv: cannot move './benchmarks/rl_benchmarks/vnnlib/dubinsrejoin_case_unsafe_59.vnnlib.gz' to '../../benchmarks/./benchmarks/rl_benchmarks/vnnlib/dubinsrejoin_case_unsafe_59.vnnlib.gz': No such file or directory
mv: cannot move './benchmarks/rl_benchmarks/vnnlib/cartpole_case_unsafe_13.vnnlib.gz' to '../../benchmarks/./benchmarks/rl_benchmarks/vnnlib/cartpole_case_unsafe_13.vnnlib.gz': No such file or directory
rm: cannot remove 'large_models': No such file or directory
rm: cannot remove 'large_models.zip': No such file or directory
gzip: benchmarks/*/onnx/*.gz: No such file or directory
gzip: benchmarks/*/vnnlib/*.gz: No such file or directory
benchmarks/vggnet16_2022/onnx/vgg16-7.onnx: No such file or directory
+ curl --retry 100 --retry-connrefused https://vnncomp.christopher-brix.de/update/977/failure
```

ChristopherBrix commented 1 year ago

@mldiego I'm looking into it.

ChristopherBrix commented 1 year ago

@mldiego I can see that you got this error, and this isn't something your code could have influenced - but if I start exactly the same submission once again, it initializes without issues.

Please try again and let me know if it continues to fail. Maybe either GitHub or the AWS instance experienced some temporary issue.

mldiego commented 1 year ago

@ChristopherBrix Yes, it is working now. Thanks for looking into it!

ChristopherBrix commented 1 year ago

All teams: You should have received an email from me (sent to the email address you used for your account on the submission page). Please double-check that it lists the correct instance type, commit hash, and benchmark list. If you spot any issues, please let me know.

If you did not receive an email, please contact me as soon as possible!