ShoumikSaha / DRSM

DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness
https://iclr.cc/virtual/2024/poster/17905
GNU General Public License v3.0

Some Questions about the Dataset #2

Open yinan17 opened 8 months ago

yinan17 commented 8 months ago

Hi Shoumik, I'm interested in the dataset used in your study and have three questions: 1) Did you apply any selection criteria when downloading malware from virusshare.com? 2) Was the ratio of malware to benign data 1:1 in your study? 3) How should the data in the PACE Dataset be used?

ShoumikSaha commented 7 months ago

Hi,

  1. For the malware dataset, we used the VirusShare website, specifically the VirusShare_00434 folder.
  2. Yes. Since our work was based on ML, we maintained a balanced dataset (malware and benign 1:1).
  3. I have added a Python script in the dataset folder. I hope that will help you.

Let me know if you have any other questions.
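
For readers who want to reproduce the 1:1 balance mentioned in point 2, here is a minimal sketch. It is not the script from the dataset folder, and it assumes the balance is obtained by randomly downsampling the larger class, which the thread does not confirm. The function name `balance_one_to_one` and the label convention (1 = malware, 0 = benign) are hypothetical.

```python
import random


def balance_one_to_one(malware_paths, benign_paths, seed=0):
    """Return a shuffled list of (path, label) pairs with equal class counts.

    Assumption: balance is achieved by downsampling the larger class.
    Labels are illustrative: 1 = malware, 0 = benign.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    n = min(len(malware_paths), len(benign_paths))

    # Sample without replacement so no file appears twice.
    malware = rng.sample(list(malware_paths), n)
    benign = rng.sample(list(benign_paths), n)

    samples = [(p, 1) for p in malware] + [(p, 0) for p in benign]
    rng.shuffle(samples)
    return samples
```

In practice the two inputs could be file listings, for example the contents of the extracted VirusShare_00434 folder and of a directory of benign executables.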