csce585-mlsystems / project-athena

This is the course project for CSCE585: ML Systems. Students will build their machine learning systems based on the provided infrastructure --- Athena.
MIT License

Completely insufficient background for Task 2 #24

Closed andrewwunderlich closed 3 years ago

andrewwunderlich commented 3 years ago

@MENG2010 @pooyanjamshidi

Once again, I feel completely unprepared to complete Task 2. I've looked at it for hours and I don't know how to get started. This appears to be the same situation we faced in Task 1, wherein we get thrown in the deep end of a swimming pool without being taught how to swim first. It is extremely stressful and frustrating, and honestly it makes me question if anyone is listening to our feedback in this class.

My team has chosen Option 1 because it appears to be the easiest option and the most similar to Task 1. The description of the assignment in the readme offers no helpful detail on how to actually complete the task. Instead, it points us to the Athena research paper section III.F, which also leaves me completely lost and in the dark about how to do this. That paper also cites two other research papers, claiming that it "combines" their methods. It gives a few equations which I do not understand and which have not been explained.

This is a 500-level class, not a 700/800-level one. Most of the students here are undergraduates or are graduate students coming from entirely separate fields. Unless you provide some significant help, most of us simply do not have the background to properly understand research papers whose intended audience was the adversarial ML research community. I'm asking you to please address these topics in class, or provide more detailed instructions about how to accomplish the task, or provide hints/tutorials, or do something to make this understandable.

raulcferraz commented 3 years ago

Unfortunately, I have to agree. And I would say that this is not what 700/800-level courses are like either. I'm very excited about this course and still want to explore the topic further once it is over, but I absolutely don't feel like I'm the target audience.

I also disagree with the comment in class today that we don't really need to understand the attacks to understand the defense. At least for some methods, understanding both is absolutely essential. Take, for example, PGD/PGD-ADT, which was required in the first task. It is like understanding the statistical assumptions underlying hypothesis testing. And because this is cutting-edge stuff, the related papers are mostly technical and dense, as Andrew mentioned, including the Athena one.

This is our first contact with these methods. I'm taking this class to learn them. If the class was designed for people who just need to get up to speed with the latest developments, then that was not made clear. However, even if that is the case, consider the following guide by the TensorFlow developers: https://www.tensorflow.org/tutorials/generative/adversarial_fgsm By their own description, it is targeted at experts. And yet, it is much easier to understand than what we were given to complete the project. That is because 1) the parameters of the formulas are described in terms of the problem, and 2) the code is presented piece by piece, so we can write it as we build our own understanding and then actually ask good questions. Had we slowly built Athena ourselves, I think we would not be so confused.
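To illustrate the piece-by-piece style described above, here is a minimal FGSM sketch in plain Python. It is not taken from the TensorFlow guide or from Athena: the toy logistic-regression model, its weights `w` and `b`, the input `x`, and the budget `eps` are all hypothetical values chosen only so the example runs on its own. FGSM perturbs the input by `eps * sign(gradient of the loss with respect to the input)`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the direction that raises the loss."""
    grad = loss_grad_wrt_input(w, b, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input, for illustration only.
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.5, 0.2, 0.1]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.3)
print("clean confidence:      ", predict(w, b, x))
print("adversarial confidence:", predict(w, b, x_adv))
```

In this toy setting the model's confidence in the true label drops below 0.5 after the perturbation, even though no feature moved by more than `eps`.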

andrewwunderlich commented 3 years ago

I agree with Raul completely.

In an effort to make this a productive thread and not just a pure rant, I'm going to compile a short list of questions here which, if answered, might begin to clear up my own confusion. These questions are related to Option 1 for Task 2, since that is the one my team is working on.


  1. I am not sure how much of what we need to do for Option 1 is already implemented in Athena and how much we need to do ourselves. Ying posted on Piazza a few days ago:

    Each individual options in task 2 is an advantage extension to the ATHENA project, in which you are expected to do some research and implement the approach on top of ATHENA. This is much more difficult than task 1 and the workload is heavier.

This suggests that the solution of Task 2 requires adding onto Athena rather than just using the existing provided methods to complete the task. However, the readme says for Option 1:

Possible solutions (already implemented in ATHENA):

  1. Xuanqing Liu, Minhao Cheng, Huan Zhang, Cho-Jui Hsieh. Towards Robust Neural Networks via Random Self-ensemble. ECCV 2018.
  2. Anish Athalye, Logan Engstrom, Andrew Ilyas, Kevin Kwok. Synthesizing Robust Adversarial Examples. ICML 2018

If these methods are already implemented in Athena, then we would not need to add onto Athena; we would just use the existing infrastructure. That would make sense, since the Athena research paper certainly suggests that white-box attacks are already supported. So which is it? Are we just supposed to use existing methods? Or do we need to create our own new methods?


  2. If white-box attacks are indeed already supported in Athena, where exactly is that code located? It looks to me like the Athena framework only uses attacks provided by ART, and really the framework only provides methods to pass data to/from ART and to define default parameter values for those not specified in the attack configs file. In particular, I ask because the Athena research paper discusses two ways to create the white-box AEs: the greedy approach and the optimization approach (which we are supposed to use). There is an algorithm given in the paper for the greedy approach. Where is this algorithm located in Athena? Likewise, there are equations (6) and (7) associated with the optimization approach. Where are these equations implemented in Athena? Are these things actually handled behind the scenes in ART? Or are we supposed to implement them ourselves? None of this is remotely clear, and I can't make any practical connections between the Athena paper and the provided code which help me understand what we're supposed to do for this assignment.

pooyanjamshidi commented 3 years ago

Thanks, @andrewwunderlich and @raulcferraz, for your suggestions; I agree with you! @andrewwunderlich, your questions make sense. Let me try to answer both questions, and @MENG2010 will provide more details.

You do not need to add anything to ATHENA (you could if you want, but for T2-Option 1 you would not); rather, you implement or reuse an existing white-box attack. So far we have tried two types of white-box attacks, as explained in the paper. The optimization-based one is simply a dynamic attack targeted at ATHENA. Such targeted attacks are common: they are specific to the proposed defense and show how it performs against a strong adversary that knows pretty much everything about the defense. In those circumstances the defense should break, but a good defense increases the cost to the attacker! Ying came up with such a targeted attack by combining EOT and RSE; basically, Eq. (7) is the key idea behind this combined white-box attack. @MENG2010 will provide a tutorial showing how you could implement Eq. (7), which you can then use to generate different types of attacks. I agree that you need to know how this equation is implemented and how you can configure it to generate different variations of it.
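As a rough illustration of the EOT idea mentioned above, and not an implementation of Eq. (7) (which is not reproduced in this thread), the sketch below averages input gradients over randomly sampled transformations and takes small signed steps inside an `eps` budget. Everything here is an assumption made for the example: the toy logistic model `W`, `B`, the scale-and-shift transformation family, and the names `sample_transform`, `eot_attack`, and `expected_confidence` are hypothetical and do not come from Athena or ART.

```python
import math
import random

# Hypothetical toy model: logistic regression applied to a transformed input.
W = [2.0, -1.0, 0.5]
B = 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_transform(rng):
    """Draw a random (scale, shift) pair; stands in for a randomized defense."""
    return (rng.uniform(0.8, 1.2), rng.uniform(-0.05, 0.05))


def predict(x, t):
    """Probability of class 1 after applying transformation t to x."""
    scale, shift = t
    z = sum(w * (scale * xi + shift) for w, xi in zip(W, x)) + B
    return sigmoid(z)


def grad_wrt_input(x, y, t):
    """Cross-entropy gradient w.r.t. x, by the chain rule through t."""
    scale, _ = t
    p = predict(x, t)
    return [(p - y) * scale * w for w in W]


def eot_attack(x0, y, eps, step, iters, n_samples, seed=0):
    """Iterative signed-gradient attack using the gradient averaged over transforms."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        avg = [0.0] * len(x)
        for _ in range(n_samples):
            g = grad_wrt_input(x, y, sample_transform(rng))
            avg = [a + gi / n_samples for a, gi in zip(avg, g)]
        # Signed step, then project back into the eps-box around x0.
        x = [xi + step * ((a > 0) - (a < 0)) for xi, a in zip(x, avg)]
        x = [min(max(xi, c - eps), c + eps) for xi, c in zip(x, x0)]
    return x


def expected_confidence(x, n_samples=200, seed=1):
    """Monte Carlo estimate of the average class-1 probability over transforms."""
    rng = random.Random(seed)
    return sum(predict(x, sample_transform(rng)) for _ in range(n_samples)) / n_samples


x0 = [0.5, 0.2, 0.1]
x_adv = eot_attack(x0, y=1, eps=0.3, step=0.1, iters=5, n_samples=20)
print("clean expected confidence:      ", expected_confidence(x0))
print("adversarial expected confidence:", expected_confidence(x_adv))
```

The point of averaging over sampled transformations is that the perturbation is optimized against the randomized pipeline as a whole rather than against a single fixed instance of it.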

MENG2010 commented 3 years ago

Tutorials have been recorded and uploaded.